Problem with select(2) in single threaded runtime.

While adding a test case for https://phabricator.haskell.org/D407 I noticed that while my initial patch fixed the crash for the threaded runtime, the single threaded runtime was still affected. I dove into the RTS and have hit a snag.

In awaitEvent (rts/posix/Select.c) select(2) is called with the timeout computed from the Haskell call. On my current (OSX) machine my test case overflows "struct timeval", causing select to return EINVAL and crash the runtime. Unfortunately, there appears to be no portable way to find the maximum size of time_t/suseconds_t (the types of the struct timeval fields), and therefore no portable way to avoid this overflow.

The most practical thing I can think of is to add a configure case that checks sizeof(time_t) and sizeof(suseconds_t) and guesses they're just unsigned values of the relevant type, but I'm open to better suggestions. Alternatively, the solution is to hardcode the max value for every platform, in which case I invite you all to tell me the maximum value on your specific platform :p

Cheers,
Merijn
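The sizeof-based guess could be sketched roughly as below. This is only an illustration, not what the RTS or its configure script does: it assumes time_t is a signed two's-complement integer type with no padding bits, which is a guess about the platform rather than a guarantee, and the names SIGNED_TYPE_MAX and guessed_time_t_max are made up for this example.

```c
#include <limits.h>
#include <stdint.h>
#include <time.h>

/* Largest value of the signed integer type T, derived from its size alone.
 * Assumes two's complement and no padding bits (an assumption, not a
 * guarantee). SIGNED_TYPE_MAX is a made-up name for this sketch. */
#define SIGNED_TYPE_MAX(T) \
    ((T)((((uintmax_t)1) << (sizeof(T) * CHAR_BIT - 1)) - 1))

/* Hypothetical helper: the guessed upper bound for tv_sec. */
static time_t guessed_time_t_max(void)
{
    return SIGNED_TYPE_MAX(time_t);
}
```

A value obtained this way only bounds what the field can hold; it says nothing about what select(2) is willing to accept.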

Worse, it appears the maximum size of time_t is unrelated to the maximum value accepted by select(2). After diving into my system's header files, time_t appears to be 'long', but a timeout of INT_MAX seconds already triggers "EINVAL".

At this point, the only option I see is to verify the max timeout for every platform, by reading the source and/or trying all possible timeouts until the max is found, and hard coding these max values using CPP. If anyone has a better solution, you're more than welcome to enlighten me!

Cheers,
Merijn
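Trying every timeout is not actually necessary if acceptance is monotonic (every value up to the maximum is accepted, every value above it is rejected); under that assumption a binary search finds the limit in a few dozen probes. The sketch below shows only the search. The predicate is passed in as a function pointer, and fake_accepts is a stand-in that simulates a made-up platform limit so the example can run anywhere; a real predicate would call select(2) and check for EINVAL. All names here are hypothetical.

```c
#include <limits.h>

/* Stand-in predicate simulating a platform that rejects timeouts above a
 * fixed number of seconds. The limit is invented for illustration; a real
 * predicate would call select(2) with tv_sec = secs. */
static int fake_accepts(long secs)
{
    return secs <= 100000000L;
}

/* Returns the largest value in [lo, hi] for which accepts() is true.
 * Preconditions: accepts(lo) is true and acceptance is monotonic. */
static long find_max_accepted(int (*accepts)(long), long lo, long hi)
{
    while (lo < hi) {
        /* Upper midpoint, written so that it cannot overflow. */
        long mid = hi - (hi - lo) / 2;
        if (accepts(mid)) {
            lo = mid;       /* mid works, the limit is at least mid */
        } else {
            hi = mid - 1;   /* mid fails, the limit is below mid */
        }
    }
    return lo;
}
```

Calling find_max_accepted(fake_accepts, 0, LONG_MAX) needs at most about 64 probes on a platform with a 64-bit long.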
On 3 Nov 2014, at 21:36, Merijn Verstraaten wrote: [...]

On 2014-11-04 at 07:42:11 +0100, Merijn Verstraaten wrote:
Worse, it appears the maximum size of time_t is unrelated to the maximum value accepted by select(2).
Fwiw, I found the following on http://pubs.opengroup.org/onlinepubs/009695399/functions/pselect.html

,----
| Implementations may place limitations on the maximum timeout interval
| supported. All implementations shall support a maximum timeout interval
| of at least 31 days. If the timeout argument specifies a timeout
| interval greater than the implementation-defined maximum value, the
| maximum value shall be used as the actual timeout value. Implementations
| may also place limitations on the granularity of timeout intervals. If
| the requested timeout interval requires a finer granularity than the
| implementation supports, the actual timeout interval shall be rounded up
| to the next supported value.
`----

So I'm a bit surprised you get an EINVAL instead of the timeout being capped to the implementation's max supported value.

Was the passed `timeval` structure valid? I.e. is the tv_usec value inside the [0,1e6-1] range? and was tv_sec>=0 ?
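Taking the quoted 31-day guarantee at face value, one conservative approach is to never pass select(2) more than 31 days and let the caller wait again if the real deadline has not yet passed. The sketch below is an assumption-laden illustration, not the RTS code: the function name, the macro, and the choice of a microsecond count as input are all invented here.

```c
#include <sys/time.h>

/* 31 days in seconds, the minimum interval the quoted text says every
 * implementation shall support. The macro name is made up. */
#define MAX_PORTABLE_TIMEOUT_SECS (31L * 24 * 60 * 60)

/* Hypothetical helper: turn a microsecond count into a struct timeval
 * that is always valid: non-negative, tv_usec in [0, 999999], and never
 * longer than 31 days. */
static struct timeval clamp_timeout_usecs(long long usecs)
{
    const long long max_usecs = MAX_PORTABLE_TIMEOUT_SECS * 1000000LL;
    struct timeval tv;

    if (usecs < 0) {
        usecs = 0;              /* negative timeouts are invalid */
    }
    if (usecs > max_usecs) {
        usecs = max_usecs;      /* cap instead of overflowing tv_sec */
    }
    tv.tv_sec = (time_t)(usecs / 1000000LL);
    tv.tv_usec = (suseconds_t)(usecs % 1000000LL);
    return tv;
}
```

With a clamp like this the caller has to treat a select(2) timeout as "possibly too early": compare against the intended deadline and call select(2) again with the remaining time if needed.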

Hi Herbert,
On 4 Nov 2014, at 0:34, Herbert Valerio Riedel wrote:
Was the passed `timeval` structure valid? I.e. is the tv_usec value inside the [0,1e6-1] range? and was tv_sec>=0 ?
The simple test I wrote is:
#include
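A probe along these lines could look like the sketch below. It is not the original program, only a guess at its shape: the function name is invented, and the pipe is an addition so that select(2) returns immediately when the timeout is accepted instead of sleeping for the whole interval. tv_usec is fixed at 0 and tv_sec is non-negative, so the timeval is valid in the sense asked about above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical probe: returns 0 if select(2) accepts a timeout of `secs`
 * seconds, otherwise the errno value it failed with (EINVAL is the case
 * of interest). */
static int probe_select_timeout(long secs)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return errno;
    }

    /* Make the read end readable so a valid timeout never has to expire. */
    char byte = 'x';
    ssize_t n = write(fds[1], &byte, 1);
    if (n != 1) {
        int e = (n < 0) ? errno : EIO;
        close(fds[0]);
        close(fds[1]);
        return e;
    }

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fds[0], &readfds);

    struct timeval tv;
    tv.tv_sec = (time_t)secs;   /* value under test */
    tv.tv_usec = 0;             /* always inside [0, 1e6-1] */

    int rc = select(fds[0] + 1, &readfds, NULL, NULL, &tv);
    int err = (rc < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Calling probe_select_timeout(INT_MAX) (with limits.h included) and probe_select_timeout(1000) gives the two cases compared in this thread; what each returns depends on the platform.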

On 2014-11-04 at 18:18:44 +0100, Merijn Verstraaten wrote: [...]
This exits with EINVAL for me on OSX; if I replace INT_MAX with 1000, it runs just fine. The man page on OSX mentions EINVAL for values that exceed the maximum timeout, so it looks like OSX is not following the spec, then...
Btw, I also stumbled over these ancient bug-reports:

  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=18909
  http://gnats.netbsd.org/11287

So it seems that BSD-ish systems have (or had?) this arbitrary 1e8 second limit in combination with the questionable EINVAL response which seems in conflict with the POSIX specification.

I'm wondering if there's already an Autoconf test somewhere we could steal for detecting this peculiarity of select() on BSD systems...

Cheers,
hvr

On 4 Nov 2014, at 9:32, Herbert Valerio Riedel wrote:
I'm wondering if there's already an Autoconf test somewhere we could steal for detecting this peculiarity of select() on BSD systems...
I googled around for ways to detect this, but I haven't found anything so far. I've been scarred by autotools, so if someone less horrified knows how to do this "properly", please chime in :)

Cheers,
Merijn
participants (2)
- Herbert Valerio Riedel
- Merijn Verstraaten