
Hi,
the last few days, I tried to get an IO-Event system running with GHC, i.e. trigger an IO action when there is data to read from a fd. I looked at a few different implementations, but all of them have some downside.
* using select package
  - This uses the select syscall. select is rather limited (fds must be below FD_SETSIZE, usually 1024)
* using GHC.Event
  - GHC.Event is broken in 7.10.1 (unless unsafeCoerce and a hacky trick are used)
  - GHC.Event is GHC internal according to hackage
  - Both Network libraries I looked at (network (Network.Socket) and socket (System.Socket)) crash the application with GHC.Event
  - with 7.8+ I didn't see a way to create your own EventManager, so it only works with -threaded
* using forkIO and threadWaitRead for each fd in a loop
  - needs some kind of custom control structure around it
  - uses a separate thread for each fd
  - might become pretty awkward to handle multiple events
* using poll package
  - blocks in a safe foreign call
  - needs some kind of wrapper
From the above list, GHC.Event isn't usable (for me) right now. It would require some work for my use case. The select option is usable, but suffers from the same problems as poll plus the limitation mentioned, so it is strictly worse.
This leaves me with two options: poll and forkIO + blocking.
Those are based on two completely different approaches to event handling.
poll can be used in a rather classic event handling system with a main loop that blocks until an event occurs (or a timeout triggers) and handles the event in the loop. forkIO + blocking is closer to registering an action that should be triggered later by an event.
My main questions right now are:
1. How bad is it for the (non-threaded) runtime to be blocking in a foreign call most of the time?
2. How significant will the overhead be for the forkIO version?
3. Is there a *good* way to use something like threadWaitRead that allows waking up on other events as well?
4. Is there a better way to handle multiple fds that may get readable data at any time, in Haskell/with GHC right now?
Thanks in advance,
Ongy

Hi,
Why the non-threaded runtime, out of interest?
Threads forked with forkIO are pretty lightweight, and although things look
like blocking calls from the Haskell point of view, as I understand it
under the hood it's all done with events of one form or another. Thus even
with the non-threaded runtime you will see forkIO-threads behaving as if
they're running concurrently. In particular, you have two threads blocked
trying to read from two different Handles and each will be awoken just when
there's data to read, and the rest of the runtime will carry on even while
they're blocked. Try it!
If you're dealing with FDs that you've acquired from elsewhere, the
function unix:System.Posix.IO.ByteString.fdToHandle can be used to import
them and then they work like normal Handles in terms of blocking operations
etc.
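A minimal sketch of the approach described above, assuming the unix package is available. The pipes stand in for FDs "acquired from elsewhere", and the name demo is hypothetical. Each reader blocks in hGetLine in its own forkIO thread and is woken only when its own Handle has data.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.IO (BufferMode (LineBuffering), hClose, hGetLine, hPutStrLn, hSetBuffering)
import System.Posix.IO (createPipe, fdToHandle)

-- Hypothetical demo: two pipes play the role of externally acquired FDs.
demo :: IO (String, String)
demo = do
  (r1, w1) <- createPipe
  (r2, w2) <- createPipe
  -- Import the raw FDs as ordinary Handles.
  hr1 <- fdToHandle r1
  hw1 <- fdToHandle w1
  hr2 <- fdToHandle r2
  hw2 <- fdToHandle w2
  -- Flush each written line immediately so the readers see it.
  mapM_ (`hSetBuffering` LineBuffering) [hw1, hw2]
  out1 <- newEmptyMVar
  out2 <- newEmptyMVar
  -- One lightweight thread per Handle; each blocks independently.
  _ <- forkIO (hGetLine hr1 >>= putMVar out1)
  _ <- forkIO (hGetLine hr2 >>= putMVar out2)
  -- Write in the "wrong" order: neither reader holds up the other.
  hPutStrLn hw2 "second"
  hPutStrLn hw1 "first"
  a <- takeMVar out1
  b <- takeMVar out2
  mapM_ hClose [hr1, hw1, hr2, hw2]
  pure (a, b)
```

Loading this in GHCi and running demo should give ("first","second"), with or without -threaded.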
Whenever I've had to deal with waking up for one of a number of reasons
(not all of which are FDs) I've found the simplicity of STM is hard to
beat. Something like:
atomically ((Left <$> waitForFirstThing) <|> (Right <$> waitForSecondThing))
where waitForFirstThing and waitForSecondThing are blocked waiting for
something interesting to occur in a TVar that they're watching. It's so
simple that I reckon it's worth doing it like that and only trying
something more complicated if it turns out from experimentation that this
has too much overhead for you - "make it right" precedes "make it fast".
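A small sketch of that pattern, assuming the stm package. The helper waitFor and the two TVars are hypothetical stand-ins for waitForFirstThing and waitForSecondThing; for STM, <|> tries the left action and falls back to the right one if the left one retries.

```haskell
import Control.Applicative ((<|>))
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar, retry, writeTVar)

-- Hypothetical helper: block until the TVar holds a value, then take it out.
waitFor :: TVar (Maybe a) -> STM a
waitFor var = do
  m <- readTVar var
  case m of
    Nothing -> retry                          -- nothing yet: sleep until the TVar changes
    Just x  -> writeTVar var Nothing >> pure x

demo :: IO (Either Int String)
demo = do
  numbers <- newTVarIO Nothing
  texts   <- newTVarIO Nothing
  -- Some other thread produces the second kind of event a little later.
  _ <- forkIO $ do
    threadDelay 10000
    atomically (writeTVar texts (Just "hello"))
  -- Wake up for whichever source fires first.
  atomically ((Left <$> waitFor numbers) <|> (Right <$> waitFor texts))
```

Here only the second source ever fires, so demo returns Right "hello".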
Hope that helps,
David
On 7 October 2015 at 08:49, Markus Ongyerth
Hi,
the last few days, I tried to get an IO-Event system running with GHC i.e. trigger an IO action when there is data to read from a fd. I looked at a few different implementations, but all of them have some downside.
* using select package
  - This uses the select syscall. select is rather limited (fds must be below FD_SETSIZE, usually 1024)
* using GHC.Event
  - GHC.Event is broken in 7.10.1 (unless unsafeCoerce and a hacky trick are used)
  - GHC.Event is GHC internal according to hackage
  - Both Network libraries I looked at (network (Network.Socket) and socket (System.Socket)) crash the application with GHC.Event
  - with 7.8+ I didn't see a way to create your own EventManager, so it only works with -threaded
* using forkIO and threadWaitRead for each fd in a loop
  - needs some kind of custom control structure around it
  - uses a separate thread for each fd
  - might become pretty awkward to handle multiple events
* using poll package
  - blocks in a safe foreign call
  - needs some kind of wrapper
From the above list, GHC.Event isn't usable (for me) right now. It would require some work for my usecase. The select option is usable, but suffers from the same problems as poll + the limitation mentioned, so it is strictly worse.
This leaves me with two options: poll and forkIO + blocking.
Those are based on two completely different approaches to event handling.
poll can be used in a rather classic event handling system with a main loop that blocks until an event occurs (or a timeout triggers) and handles the event in the loop. forkIO + blocking is closer to registering an action later that should be triggered by an event.
My main questions right now are:
1. How bad is it for the (non-threaded) runtime to be blocking in a foreign call most of the time?
2. How significant will the overhead be for the forkIO version?
3. Is there a *good* way to use something like threadWaitRead that allows waking up on other events as well?
4. Is there a better way to handle multiple fds that may get readable data at any time, in Haskell/with GHC right now?
Thanks in advance, Ongy
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users

2015-10-07 18:30 GMT+02:00 David Turner
Hi,
Why the non-threaded runtime, out of interest?
Mostly because I am used to the poll/select method I mentioned, and that one works without any threading. I don't really mind using the threaded runtime though; it's more habit.
Threads forked with forkIO are pretty lightweight, and although things look like blocking calls from the Haskell point of view, as I understand it under the hood it's all done with events of one form or another. Thus even with the non-threaded runtime you will see forkIO-threads behaving as if they're running concurrently. In particular, you have two threads blocked trying to read from two different Handles and each will be awoken just when there's data to read, and the rest of the runtime will carry on even while they're blocked. Try it!
Yeah, I know and I tried that. As far as I can see, that's actually why things break with GHC.Event. The Event system tries to register the Fd while it was registered by me and encounters an EEXIST from epoll.
Whenever I've had to deal with waking up for one of a number of reasons (not all of which are FDs) I've found the simplicity of STM is hard to beat. Something like:
atomically ((Left <$> waitForFirstThing) <|> (Right <$> waitForSecondThing))
Looks like I should look up STM. Does this scale easily? I don't really need huge amounts, but I don't have any knowledge about the number of Fds I will have.
Thanks for the help, Ongy

On Wed, Oct 7, 2015 at 10:16 AM, Markus Ongyerth
Mostly because I am used to the poll/select method I mentioned, and that one works without any threading. I don't really mind using the threaded runtime though; it's more habit.
On Linux, the stock I/O manager in the threaded runtime uses epoll() out of
the box. When you call a read function such as hGetLine on a Handle and the
call would block, you ultimately get a call to threadWaitRead
(https://hackage.haskell.org/package/base-4.8.1.0/docs/GHC-Conc.html#v:thread...)
or threadWaitWrite. These functions register interest in the given file
descriptor, and the IO manager / GHC runtime scheduler will wake up your
thread (GHC uses "green" threads) when the file descriptor becomes readable
or writable, respectively.
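As a rough illustration of the mechanism described above, the following sketch calls threadWaitRead directly on a raw Fd. It assumes the unix package; the pipe and the name demo are hypothetical. Only the calling green thread is parked; the rest of the program keeps running.

```haskell
import Control.Concurrent (forkIO, threadWaitRead)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.Posix.IO (closeFd, createPipe, fdWrite)

-- Hypothetical demo: a pipe provides a file descriptor to wait on.
demo :: IO Bool
demo = do
  (readEnd, writeEnd) <- createPipe
  woken <- newEmptyMVar
  -- This thread registers interest in readEnd and sleeps until it is readable.
  _ <- forkIO $ do
    threadWaitRead readEnd
    putMVar woken True
  -- The main thread is not blocked by the waiter and can make the fd readable.
  _ <- fdWrite writeEnd "x"
  result <- takeMVar woken
  closeFd readEnd
  closeFd writeEnd
  pure result
```

Note that threadWaitRead only reports readiness; the data still has to be read afterwards by whatever means the application uses.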
G
--
Gregory Collins

On 7 October 2015 at 18:16, Markus Ongyerth
Yeah, I know and I tried that. As far as I can see, that's actually why things break with GHC.Event. The Event system tries to register the Fd while it was registered by me and encounters an EEXIST from epoll.
Ah, ok, so you can either do your epolling through the Haskell runtime or with your bare hands but you can't do both on a single FD.
Looks like I should look up STM. Does this scale easily? I don't really need huge amounts, but I don't have any knowledge about the number of Fds I will have.
Waiting on arbitrarily many things is pretty much as simple (as long as they all have the same type so you can put them in a list):
atomically (asum listOfWaitingThings)
In terms of code complexity that scales just fine! I'm afraid I've no real idea what the performance characteristics of such a device would be without trying it out in your use case. Whenever I've been doing this kind of thing I've always found myself IO-bound rather than CPU-bound so I've never found myself worrying too much about the efficiency of the code itself.
If you're used to doing select/poll things yourself then it may help to think of Haskell threads blocking on Handles as basically a way to do an epoll-based event loop on the underlying FDs but with a much nicer syntax and less mucking around with explicit continuations. Similarly, if you're used to dealing with task scheduling at a low level then it may help to think of STM transactions blocking as a way to muck around with the run queues in the scheduler but with a much nicer syntax and less mucking around with explicit continuations.
Best wishes,
David
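A possible shape for the asum version, assuming the stm package. The TQueues, the index tagging, and the names waitAny and demo are illustrative choices, not something from the thread; any list of STM actions of the same type would do.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM (STM, TQueue, atomically, newTQueueIO, readTQueue, writeTQueue)
import Control.Monad (replicateM)
import Data.Foldable (asum)

-- Hypothetical helper: wait on every queue at once and report which one
-- fired (by position in the list) together with the value it delivered.
waitAny :: [TQueue a] -> STM (Int, a)
waitAny queues = asum [ (,) i <$> readTQueue q | (i, q) <- zip [0 ..] queues ]

demo :: IO (Int, String)
demo = do
  queues <- replicateM 3 newTQueueIO
  -- Some producer thread feeds the last queue only.
  _ <- forkIO (atomically (writeTQueue (queues !! 2) "ready"))
  -- Blocks until any queue has an element.
  atomically (waitAny queues)
```

In this run only the queue at index 2 receives anything, so demo returns (2,"ready"). When several sources are ready at once, asum prefers the one earliest in the list.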

2015-10-07 21:17 GMT+02:00 David Turner
Ah, ok, so you can either do your epolling through the Haskell runtime or with your bare hands but you can't do both on a single FD.
Ah, I didn't do it with bare hands, I did it with GHC.Event registerFd. Running my own epoll might work (according to the epoll man page), but I really don't want to do that.
Waiting on arbitrarily many things is pretty much as simple (as long as they all have the same type so you can put them in a list):
atomically (asum listOfWaitingThings)
Oh, I didn't see asum, but "came up" with the same implementation.
In terms of code complexity that scales just fine! I'm afraid I've no real idea what the performance characteristics of such a device would be without trying it out in your use case. Whenever I've been doing this kind of thing I've always found myself IO-bound rather than CPU-bound so I've never found myself worrying too much about the efficiency of the code itself.
If you're used to doing select/poll things yourself then it may help to think of Haskell threads blocking on Handles as basically a way to do an epoll-based event loop on the underlying FDs but with a much nicer syntax and less mucking around with explicit continuations. Similarly, if you're used to dealing with task scheduling at a low level then it may help to think of STM transactions blocking as a way to muck around with the run queues in the scheduler but with a much nicer syntax and less mucking around with explicit continuations.
For my current project the speed does not really matter, but I tend to do some research anyway, since I might get to a point where I need it.
The one thing I am not sure about right now is how to use threadWaitReadSTM. Can I reuse the returned STM action? I have two Fds I can test it with right now; one of them works, the other one doesn't seem to work for me. I looked into the source, and to me it looks like the STM action should not be reused, since the content of the TVar used internally will be set to True.
Thanks for the help,
ongy
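One way to sidestep the reuse question, sketched under the assumption (consistent with the reading of the source above) that the STM action returned by threadWaitReadSTM is one-shot: request a fresh action for every wait and run the accompanying cleanup action afterwards. The helper waitReadableOr and the pipe in demo are hypothetical; the unix and stm packages are assumed.

```haskell
import Control.Applicative ((<|>))
import Control.Concurrent (threadWaitReadSTM)
import Control.Concurrent.STM (STM, atomically, retry)
import Control.Exception (finally)
import System.Posix.IO (closeFd, createPipe, fdWrite)
import System.Posix.Types (Fd)

-- Hypothetical helper: wait until either the fd is readable (Left) or
-- some other STM event fires (Right). A new wait action is requested on
-- every call instead of reusing an old one.
waitReadableOr :: Fd -> STM a -> IO (Either () a)
waitReadableOr fd other = do
  (waitRead, unregister) <- threadWaitReadSTM fd
  atomically ((Left <$> waitRead) <|> (Right <$> other)) `finally` unregister

demo :: IO (Either () Int)
demo = do
  (readEnd, writeEnd) <- createPipe
  _ <- fdWrite writeEnd "x"
  -- The "other" event never fires here, so only readability can wake us.
  r <- waitReadableOr readEnd retry
  closeFd readEnd
  closeFd writeEnd
  pure r
```

An event loop would call waitReadableOr again on each iteration, after consuming the data that made the fd readable.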
participants (3)
- David Turner
- Gregory Collins
- Markus Ongyerth