Bringing Erlang to Haskell

newer
Fwd: [Haskell] ANNOUNCE: Process...

older
trick to easily generate Eq/Ord...

Joel Reymont

12 Dec 2005 12 Dec '05

4 p.m.

Folks, I love the Erlang multi-processing experience and think that a lot of the mistakes that I made could be avoided. What I want to have is 1) Processes, aka threads with single-slot in/out mailboxes 2) A facility to keep a list of such processes and send events to them using their process id 3) A socket reader/writer abstraction that communicates with the outside world using using its mailboxes Probably some other things but I would start with the above. I also want to use STM for this. One particular thing that bugs me is that I cannot really use TChan for thread mailboxes. I don't think I experienced this problem with Erlang but using a TChan with a logger thread quickly overwhelms the logger and fills the TChan and a lot (hundreds? thousands) of other threads are logging to it. Someone said it's because the scheduler would give ther other threads proportionally more attention. I found single-slot mailboxes (TMVar) to work much better as they pace the overall message flow. Using them means that asynchronous messages cannot be implemented, though. Please correct me if I'm wrong. I'll update my blog as I move forward. Thanks, Joel -- http://wagerlabs.com/

Show replies by date

Joel Reymont

12 Dec 12 Dec

4:18 p.m.

Just to elaborate a bit... On Dec 12, 2005, at 4:00 PM, Joel Reymont wrote:

...

1) Processes, aka threads with single-slot in/out mailboxes

It's obvious that the thread id and the mailboxes (two TMVar's) would need to be kept together, i.e. a) data Process = ThreadId (TMVar Dynamic) (TMVar Dynamic) or b) data Process a = ThreadId (TMVar a) (TMVar a) but I'm not sure if I should create a process class, for example. I think a class is not really needed since a message "receive" function can just use myThreadId to obtain a key into the map of processes and retrieve the message from its own mailbox. receive could take an action that's parameterized on the event and could pattern-match on it to process the event. This is why I would much rather prefer (b) above. I foresee myself having a single Event type throughout my app with various constructors. Processes can be supervised by rethrowing exceptions back to the parent thread I think.

...

2) A facility to keep a list of such processes and send events to them using their process id

I'm looking to use a global variable for this but then events would either need to be Dynamic or great care would need to be excercised with unsafeIO (Tomasz? ;-)) I still think this is the way to tackle the issue (note Process a). {-# NOINLINE children #-} children :: MVar [Process a] children = unsafePerformIO $ newMVar [] So long as everything is parameterized throughout, my experience is that the type checker complains if you try to use two types of 'a' throughout the app.

...

3) A socket reader/writer abstraction that communicates with the outside world using using its mailboxes

What I would like to do here is to use STM to read either from the socket loop mailbox or from the socket itself and then dispatch accordingly. If the message came from the socket then it's posted to the outbox and if a message was retrieved from the inbox then it gets posted to the socket. Some sort of a timeout would be needed for blocking I/O or a combination of hWaitForInput and -threaded. Thanks, Joel P.S. Einar apparently has distributed transactions almost ready. Putting them together with a polished version of the above should make distributed concurrent programming in Haskell far less tricky. -- http://wagerlabs.com/

Robert Dockins

13 Dec 13 Dec

3:18 p.m.

BTW, there has already been some work in this area. http://www-i2.informatik.rwth-aachen.de/~stolz/dhs/ http://www.informatik.uni-kiel.de/~fhu/PUBLICATIONS/1999/ifl.html

Sebastian Sylvan

12:28 a.m.

On 12/12/05, Joel Reymont wrote:

...

Folks,

I love the Erlang multi-processing experience and think that a lot of the mistakes that I made could be avoided. What I want to have is

1) Processes, aka threads with single-slot in/out mailboxes 2) A facility to keep a list of such processes and send events to them using their process id 3) A socket reader/writer abstraction that communicates with the outside world using using its mailboxes

Probably some other things but I would start with the above. I also want to use STM for this.

One particular thing that bugs me is that I cannot really use TChan for thread mailboxes. I don't think I experienced this problem with Erlang but using a TChan with a logger thread quickly overwhelms the logger and fills the TChan and a lot (hundreds? thousands) of other threads are logging to it. Someone said it's because the scheduler would give ther other threads proportionally more attention.

You could use a bounded TChan. Chans are good because the smooth out "noise".. Ie a sudden surge of messages can be handled without stalling the senders, but sustained heavy traffic will cause a stall (preventing the TChan from growing too big). Here's an untested, off-the-top-of-my-head, implementation.. I may have gotten some names wrong but it should be quite straightforward to write a "real" implementation.. -- may want a newtype here? type BoundedTChan a = (TVar Int, Int, TChan a) -- (current size, max, chan) newBoundedTChan n = do sz <- newTVar 0 ch <- newTChan return (sz,n,ch) writeBoundedTChan (sz,mx,ch) x = do s <- readTVar sz when (s >= mx) retry writeTVar sz (s+1) writeChan ch x readBoundedTChan (sz,mx,ch) = do modifyTVar sz (-1) readTChan ch (s+1) /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

Bulat Ziganshin

1:13 a.m.

Hello Joel, Monday, December 12, 2005, 7:00:46 PM, you wrote: JR> 1) Processes, aka threads with single-slot in/out mailboxes are you read dewscription of my own Process library in haskell maillist? JR> One particular thing that bugs me is that I cannot really use TChan JR> for thread mailboxes. i use. but i limit number of messages in this channel by additional tools. you can easily do the same. but first ask yourself - what you will gain by this? imho, it will only help to smooth temporary speed changes. if you just want to test whether this can speed up your program - implement such limited Channel and test whether it works btw, i suggested you to try not using logging thread entirely, making all logging actions synchronously JR> I found single-slot mailboxes (TMVar) to work much better as they JR> pace the overall message flow. Using them means that asynchronous JR> messages cannot be implemented, though. not exactly. they can hold at most one message i think that your aspiration to make things asynchronous is just sort of fashion. what you really want to get? -- Best regards, Bulat mailto:bulatz@HotPOP.com

Joel Reymont

10:05 a.m.

On Dec 13, 2005, at 1:13 AM, Bulat Ziganshin wrote:

...

are you read dewscription of my own Process library in haskell maillist?

No. Can you give me a pointer?

...

btw, i suggested you to try not using logging thread entirely, making all logging actions synchronously

I cannot. Only one thread can use stdout, otherwise the output is garbled. Plus, combing through a few thousand individual files produced by the threads would be a pain.

...

i think that your aspiration to make things asynchronous is just sort of fashion. what you really want to get?

I have no such aspirations. I was just making a point. Joel -- http://wagerlabs.com/

Bulat Ziganshin

11:17 a.m.

New subject: Re[2]: Bringing Erlang to Haskell

Hello Joel, Tuesday, December 13, 2005, 1:05:10 PM, you wrote:

...

...
are you read dewscription of my own Process library in haskell maillist?

JR> No. Can you give me a pointer? i will forward it to you. it have meaning to be subcribed there, just to see interesting announcements

...

...
btw, i suggested you to try not using logging thread entirely, making all logging actions synchronously

JR> I cannot. Only one thread can use stdout, otherwise the output is JR> garbled. Plus, combing through a few thousand individual files JR> produced by the threads would be a pain. :))) import Control.Concurrent import Control.Monad import System.IO import System.IO.Unsafe main = do h <- openBinaryFile "test" WriteMode for [1..100] $ \n -> forkIO $ for [1..] $ \i -> logger h ("thread "++show n++" msg "++show i) getLine hClose h lock = unsafePerformIO$ newMVar () logger h msg = withMVar lock $ const$ do hPutStrLn h msg putStrLn msg for = flip mapM_ -- Best regards, Bulat mailto:bulatz@HotPOP.com

Joel Reymont

11:31 a.m.

New subject: Re[2]: Bringing Erlang to Haskell

Thank you Bulat, makes total sense. This list is a treasure trove of a resource. I guess this is what happens when you go from Erlang to Haskell :-). I'm conditioned to think of everything as a process and uses processes for everything. On Dec 13, 2005, at 11:17 AM, Bulat Ziganshin wrote:

...

import Control.Concurrent import Control.Monad import System.IO import System.IO.Unsafe

main = do h <- openBinaryFile "test" WriteMode for [1..100] $ \n -> forkIO $ for [1..] $ \i -> logger h ("thread "++show n++" msg "++show i) getLine hClose h

lock = unsafePerformIO$ newMVar ()

logger h msg = withMVar lock $ const$ do hPutStrLn h msg putStrLn msg

for = flip mapM_

-- http://wagerlabs.com/

Tomasz Zielonka

9:49 a.m.

On Mon, Dec 12, 2005 at 04:00:46PM +0000, Joel Reymont wrote:

...

One particular thing that bugs me is that I cannot really use TChan for thread mailboxes. I don't think I experienced this problem with Erlang but using a TChan with a logger thread quickly overwhelms the logger and fills the TChan and a lot (hundreds? thousands) of other threads are logging to it.

I wonder what Erlang does to solve this problem? Perhaps we should track the number of unprocessed messages in TChans and the bigger it is the more favor consumers over producers. Best regards Tomasz -- I am searching for a programmer who is good at least in some of [Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland

Bulat Ziganshin

10:05 a.m.

New subject: Re[2]: Bringing Erlang to Haskell

Hello Tomasz, Tuesday, December 13, 2005, 12:49:04 PM, you wrote: TZ> On Mon, Dec 12, 2005 at 04:00:46PM +0000, Joel Reymont wrote:

...

...
One particular thing that bugs me is that I cannot really use TChan for thread mailboxes. I don't think I experienced this problem with Erlang but using a TChan with a logger thread quickly overwhelms the logger and fills the TChan and a lot (hundreds? thousands) of other threads are logging to it.

TZ> I wonder what Erlang does to solve this problem? Perhaps we should track TZ> the number of unprocessed messages in TChans and the bigger it is TZ> the more favor consumers over producers. even best - always prefer consumer to producer :) may be have two lists - one of threads waiting to consume, and one of threads waiting to produce? -- Best regards, Bulat mailto:bulatz@HotPOP.com

Joel Reymont

10:09 a.m.

On Dec 13, 2005, at 9:49 AM, Tomasz Zielonka wrote:

...

On Mon, Dec 12, 2005 at 04:00:46PM +0000, Joel Reymont wrote:

...
One particular thing that bugs me is that I cannot really use TChan for thread mailboxes. I don't think I experienced this problem with Erlang but using a TChan with a logger thread quickly overwhelms the logger and fills the TChan and a lot (hundreds? thousands) of other threads are logging to it.

I wonder what Erlang does to solve this problem? Perhaps we should track the number of unprocessed messages in TChans and the bigger it is the more favor consumers over producers.

It sounds to me that introducing thread priorities would be key. Here's a reply from Ulf Wiger (and Erlang expert): Erlang has four process priorities: - 'max' and 'high' are strict priorities (you stay at that level while there are processes ready to run) - normal and low are scheduled "fairly", I believe with 8000 "reductions" (roughly function calls) at normal priority and one low-priority job (if such a job exists) The scheduler works on reduction count. A context switch happens if the current process would block (e.g. if it's in a receive statement and there is no matching message in the queue), or when it's executed 1000 reductions. Since not all operations are equal in cost, there is an internal function called erlang:bump_reductions(N). File operations are usually followed by a call to erlang:bump_reductions(100) (see prim_file:get_drv_response/1) which means that processes writing to disk run out of their time slice in fewer function calls. This is of course to keep them from getting an unfair amount of CPU time. A logging process would probably therefore do well to fetch all messages in the message queue and write them using disk_log:alog_terms/2 (logging multiple messages each time). One could also possibly run the logger process at high priority. This means that normal priority process will have a hard time starving it. If the disk is slow, the logger process will yield while waiting for the disk (which won't block the runtime system as long as you have the thread pool enabled). In general, I think that processes that mainly dispatch messages, and don't generate any work on their own, should usually run on high priority. Otherwise, they tend to just contribute to delays in the system. /Uffe -- http://wagerlabs.com/

7145

Age (days ago)

7146

Last active (days ago)

List overview

Download

10 comments

5 participants

participants (5)

Bulat Ziganshin
Joel Reymont
Robert Dockins
Sebastian Sylvan
Tomasz Zielonka