
Hi,

After I have spawned a thread with 'forkIO', how can I check whether that thread's work has finished already? Or wait for it?

Thanks,
Maurício

briqueabraque:
Hi,
After I have spawned a thread with 'forkIO', how can I check whether that thread's work has finished already? Or wait for it?
The usual trick I use is to have an MVar that the child thread can set when done, causing the main thread to wait:

main = do
    done <- newEmptyMVar
    forkIO (fibonacci done)
    takeMVar done       -- blocks till MVar is full
    print "All done"

fibonacci done = do
    ... do some work ...
    putMVar done ()     -- ok, main thread can finish now

Maurício wrote:
Hi,
After I have spawned a thread with 'forkIO', how can I check whether that thread's work has finished already? Or wait for it?
Thanks, Maurício
The best way to do this is using Control.Exception.finally:
import Control.Concurrent
import Control.Exception

myFork :: IO () -> IO (ThreadId, MVar ())
myFork todo = do
    m <- newEmptyMVar
    tid <- forkIO (finally todo (tryPutMVar m ()))
    return (tid, m)
No other part of the program should write to the MVar except the finally clause. The rest of the program can check (isEmptyMVar m) as a non-blocking way to see if the thread is still running, or use (swapMVar m ()) to block until the MVar has been filled, i.e. until the thread is finished. These techniques are needed because forkIO is a very lightweight threading mechanism. Adding precisely the features you need makes for good performance control, as seen in the great computer language shootout benchmarks.

Cheers,
Chris
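For instance, a caller might look like this (a minimal sketch; the worker body is made up, and myFork is assumed to be in scope from above):

import Control.Concurrent

main :: IO ()
main = do
    (_tid, m) <- myFork (print (last [1 .. 10000000 :: Integer]))
    stillRunning <- isEmptyMVar m   -- non-blocking: is the thread still going?
    print stillRunning
    _ <- swapMVar m ()              -- blocks until the finally clause fills the MVar
    putStrLn "worker finished"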

Hi,
After I have spawned a thread with 'forkIO', how can I check whether that thread's work has finished already? Or wait for it?
The best way to do this is using Control.Exception.finally: (...)
These techniques are needed because forkIO is a very lightweight threading mechanism. Adding precisely the features you need makes for good performance control, as seen in the great computer language shootout benchmarks.
Changing ugly code for bad performance is not that usual in Haskell code :( Best, Maurício

briqueabraque:
Hi,
After I have spawned a thread with 'forkIO', how can I check whether that thread's work has finished already? Or wait for it?
The best way to do this is using Control.Exception.finally: (...)
These techniques are needed because forkIO is a very lightweight threading mechanism. Adding precisely the features you need makes for good performance control, as seen in the great computer language shootout benchmarks.
Changing ugly code for bad performance is not that usual in Haskell code :(
I think you misunderstand Chris' remark. He's saying that MVars and forkIO give you both clean control and high performance. This code seems quite elegant, for the job you were asking:

import Control.Concurrent
import Control.Exception

main = do
    done <- run (print (last [1..100000000]))
    print "Waiting...."
    takeMVar done
    print "OK."
  where
    run f = do
        x <- newEmptyMVar
        forkIO (f `finally` putMVar x ())
        return x

And the lovely thread-ring benchmark is also very nice:

http://shootout.alioth.debian.org/gp4/benchmark.php?test=threadring&lang=all

where the Haskell code is both the shortest and the fastest. Beautiful code can be very efficient.

-- Don

If you would like to wait on multiple threads, you can use STM like so:

import Control.Concurrent
import Control.Concurrent.STM
import Control.Exception

main = do
    tc <- atomically $ newTVar 2
    run tc (print (last [1..100000000]))
    run tc (print (last [1..110000000]))
    print "Waiting...."
    atomically $ readTVar tc >>= \x -> if x == 0 then return () else retry
    print "OK."
  where
    run tc f = forkIO (f `finally` atomReplace (\x -> x - 1) tc)

atomReplace fn x = atomically $ readTVar x >>= writeTVar x . fn

Regards
brad
--
www.scoodi.com
Recycle is good: Reuse is better

I was just watching top while executing this and noticed that it
really only used one core (I am using GHC 6.8.1 on a MacBook). Does
anyone know why?
On Nov 28, 2007 10:34 AM, Brad Clow wrote:
If you would like to wait on multiple threads, you can use STM like so:
(...)

On Tuesday 27 November 2007 18:46:00 Brad Clow wrote:
I was just watching top while executing this and noticed that it really only used one core (I am using GHC 6.8.1 on a MacBook). Does anyone know why?
Did you compile with -threaded, and run with +RTS -N2?

Cheers,
Spencer Janssen

Silly mistake. I had compiled with -threaded, but forgot the +RTS -N2.
However, I have a more complex app, where I haven't forgotten to use
the right flags :-) and the utilisation of cores is very poor. I am
thinking it is due to laziness. I am currently wondering how GHC
handles the case where the function that is being forked uses lazy
arguments?
On Nov 28, 2007 10:54 AM, Spencer Janssen wrote:
Did you compile with -threaded, and run with +RTS -N2?
Regards brad -- www.scoodi.com Recycle is good: Reuse is better

Silly or not, if I compile with -threaded, I always link in the one-liner C file:

char *ghc_rts_opts = "-N2";

so I don't have to remember at runtime whether it should run with 2 cores or not. This just changes the default to 2 cores, so I am still free to run on only one core with the runtime flags +RTS -N1, though I rarely need to.

http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html...

Dan

Brad Clow wrote:
Silly mistake. I had compiled with -threaded, but forgot the +RTS -N2.
However, I have a more complex app, where I haven't forgotten to use the right flags :-) and the utilisation of cores is very poor. I am thinking it is due to laziness. I am currently wondering how GHC handles the case where the function that is being forked uses lazy arguments?
On Nov 28, 2007 10:54 AM, Spencer Janssen wrote:
Did you compile with -threaded, and run with +RTS -N2?
Regards
brad

Dan Weston wrote:
Silly or not, if I compile with -threaded, I always link in the one-liner C file:
char *ghc_rts_opts = "-N2";
so I don't have to remember at runtime whether it should run with 2 cores or not. This just changes the default to 2 cores, so I am still free to run on only one core with the runtime flags +RTS -N1, though I rarely need to.
http://www.haskell.org/ghc/docs/latest/html/users_guide/runtime-control.html...
Ah... you learn something useful every day! I was going to ask on IRC whether there's any way to do this - but now I don't need to bother. :-)

Andrew Coppin wrote:
Dan Weston wrote:
Silly or not, if I compile with -threaded, I always link in the one-liner C file:
char *ghc_rts_opts = "-N2";
Ah... you learn something useful every day! I was going to ask on IRC whether there's any way to do this - but now I don't need to bother. :-)
But wait, there's more! If you're using the threaded RTS, you often need to know how many threads you can run concurrently, for example to explicitly split up a compute-bound task. This value is exposed at runtime by the numCapabilities variable in the GHC.Conc module.

http://www.haskell.org/ghc/docs/latest/html/libraries/base-3.0.0.0/GHC-Conc....

This variable is new in GHC 6.8.1 (thanks, Simon!), so don't try to use it with an older release.
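For instance, a minimal sketch using nothing beyond the documented numCapabilities :: Int:

import GHC.Conc (numCapabilities)

main :: IO ()
main = putStrLn ("This process can run " ++ show numCapabilities
                 ++ " Haskell threads truly simultaneously (+RTS -N<n> sets it)")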

Bryan O'Sullivan wrote:
But wait, there's more! If you're using the threaded RTS, you often need to know how many threads you can run concurrently, for example to explicitly split up a compute-bound task. This value is exposed at runtime by the numCapabilities variable in the GHC.Conc module.
http://www.haskell.org/ghc/docs/latest/html/libraries/base-3.0.0.0/GHC-Conc....
This variable is new in GHC 6.8.1 (thanks, Simon!), so don't try to use it with an older release.
Hmm... I was *sure* this was exposed in Control.Concurrent already... but, apparently, no. It seems you can only get at it from GHC.Conc. That's kind of a pity... oh well! ;-)

On Nov 29, 2007, at 13:38, Andrew Coppin wrote:
Bryan O'Sullivan wrote:
But wait, there's more! If you're using the threaded RTS, you often need to know how many threads you can run concurrently, for example to explicitly split up a compute-bound task. This value is exposed at runtime by the numCapabilities variable in the GHC.Conc module.
http://www.haskell.org/ghc/docs/latest/html/libraries/base-3.0.0.0/GHC-Conc.html#v%3AnumCapabilities
This variable is new in GHC 6.8.1 (thanks, Simon!), so don't try to use it with an older release.
Hmm... I was *sure* this was exposed in Control.Concurrent already... but, apparently, no. It seems you can only get at it from GHC.Conc. That's kind of a pity... oh well! ;-)
It's internal implementation foo; why would it be part of an interface intended to be reasonably portable?

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

Brad Clow:
However, I have a more complex app, where I haven't forgotten to use the right flags :-) and the utilisation of cores is very poor. I am thinking it is due to laziness. I am currently wondering how GHC handles the case where the function that is being forked uses lazy arguments?
Even with threads, results are evaluated only when they are needed (or when forced by a strictness annotation). So the thread that needs a result (or forces it) first will be the one to evaluate it.

Did you see Don's strict-concurrency announcement yesterday?

http://www.haskell.org/pipermail/haskell-cafe/2007-November/035292.html
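To illustrate the first point, here is a tiny made-up sketch: the forked thread may hand back nothing but an unevaluated thunk, so the main thread ends up doing the summing at the print:

import Control.Concurrent

main :: IO ()
main = do
    v <- newEmptyMVar
    -- the child may write just an unevaluated thunk into the MVar...
    forkIO (putMVar v (sum [1 .. 10000000 :: Integer]))
    r <- takeMVar v
    -- ...so it is the main thread that does the actual work, right here
    print r
    -- forcing before the write would move the work into the child:
    --   forkIO (let s = sum [1 .. 10000000] in s `seq` putMVar v s)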

On Nov 28, 2007 11:30 AM, Matthew Brecknell wrote:
Even with threads, results are evaluated only when they are needed (or when forced by a strictness annotation). So the thread that needs a result (or forces it) first will be the one to evaluate it.
So does GHC implement some synchronisation, given that a mutation is occurring under the covers, i.e. the thunk is being replaced by the result?
Did you see Don's strict-concurrency announcement yesterday?
http://www.haskell.org/pipermail/haskell-cafe/2007-November/035292.html
Yes. I am using a TVar to build results of forked functions in. I had a quick go at changing to channels so I could use Don's library, but kept getting blocking exceptions, so I have left it as is for the moment.

Regards
brad
--
www.scoodi.com
Recycle is good: Reuse is better

Brad Clow:
So does GHC implement some synchronisation, given that a mutation is occurring under the covers, i.e. the thunk is being replaced by the result?
I believe so, but I have no idea of the details.
I am using a TVar to build results of forked functions in. I had a quick go at changing to channels so I could use Don's library, but kept getting blocking exceptions, so I have left it as is for the moment.
Don's library is fairly simple. It adds a strictness annotation to force each value you write to a MVar or Chan, so for example, (Control.Concurrent.MVar.Strict.putMVar v x) is basically equivalent to (Control.Concurrent.MVar.putMVar v $! x). This is useful for returning results from worker threads, because it makes it more likely that the worker thread actually does the work. I say, "more likely", because the strictness annotation only forces the value to WHNF. If you have a deep structure, you might need a more sophisticated forcing function. Since you're using STM, Don's library doesn't (yet) help you, though that ought to be easy to fix. In the meantime, you can at least apply the essential idea, which means using (writeTVar v $! x) instead of (writeTVar v x) when returning results from a worker thread.
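A small made-up sketch of the WHNF limitation (($!) forces only the outermost constructor):

import Control.Concurrent

main :: IO ()
main = do
    v <- newEmptyMVar
    -- ($!) evaluates the pair only to its (,) constructor, so the two
    -- sums below are still unevaluated thunks when the main thread reads them
    forkIO (putMVar v $! (sum [1 .. 1000000 :: Int], sum [1 .. 2000000 :: Int]))
    (a, b) <- takeMVar v
    print (a + b)
    -- to push the arithmetic into the worker, force each component first:
    --   let x = sum [...]; y = sum [...] in x `seq` y `seq` putMVar v (x, y)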

On Nov 28, 2007 2:39 PM, Matthew Brecknell wrote:
Don's library is fairly simple. It adds a strictness annotation to force each value you write to a MVar or Chan, so for example, (Control.Concurrent.MVar.Strict.putMVar v x) is basically equivalent to (Control.Concurrent.MVar.putMVar v $! x).
This is useful for returning results from worker threads, because it makes it more likely that the worker thread actually does the work. I say, "more likely", because the strictness annotation only forces the value to WHNF. If you have a deep structure, you might need a more sophisticated forcing function.
When I (deeply) force the worker thread's results to be strict, I observe both cores working, but the execution time (elapsed) is slower. As much as I like the Haskell type system, sometimes butting your head up against a wall is less painful than trying to optimise for speed.

Regards
brad
--
www.scoodi.com
Recycle is good: Reuse is better

Brad Clow:
When I (deeply) force the worker thread's results to be strict, I observe both cores working, but the execution time (elapsed) is slower.
I can only speculate, but since you emphasise deep forcing, I wonder how deep is the structure returned from the worker thread? Could it be deep enough to hurt CPU cache performance (compared to an unforced version where production and consumption are interleaved)?

Brad Clow wrote:
On Nov 28, 2007 11:30 AM, Matthew Brecknell wrote:
Even with threads, results are evaluated only when they are needed (or when forced by a strictness annotation). So the thread that needs a result (or forces it) first will be the one to evaluate it.
So does GHC implement some synchronisation, given that a mutation is occurring under the covers, i.e. the thunk is being replaced by the result?
Yes, see http://haskell.org/~simonmar/bib/multiproc05_abstract.html

We use lock-free synchronisation, with a slight possibility that two threads might evaluate the same thunk. But since they'll produce the same result, nothing goes wrong.

Cheers,
Simon

Brad Clow:
If you would like to wait on multiple threads, you can use STM like so:
(...)
Nice! Although, to wait for all of a set of threads, you really only need to wait for each in turn, so you could do this with plain MVars. The real power of STM becomes apparent when you need to wait for any of a set of results, for example:
import Control.Arrow
import Control.Concurrent
import Control.Concurrent.STM
import Control.Concurrent.STM.TVar

newtype Wait a = Wait (TVar (Maybe a))

fork :: IO a -> IO (Wait a)
fork m = do
    w <- atomically (newTVar Nothing)
    forkIO (m >>= atomically . writeTVar w . Just)
    return (Wait w)

wait :: Wait a -> IO a
wait (Wait w) = atomically $ do
    r <- readTVar w
    case r of
        Just a -> return a
        Nothing -> retry

wait_all :: [Wait a] -> IO [a]
wait_all [] = return []
wait_all (w:ws) = do
    r <- wait w
    t <- wait_all ws
    return (r:t)

wait_first :: [Wait a] -> IO (a, [Wait a])
wait_first [] = error "wait_first: nothing to wait for"
wait_first ws = atomically (do_wait ws)
  where
    do_wait [] = retry
    do_wait (Wait w : ws) = do
        r <- readTVar w
        case r of
            Nothing -> fmap (second (Wait w:)) (do_wait ws)
            Just s  -> return (s, ws)

main = do
    w1 <- fork (test 50000000)
    w2 <- fork (test 10000000)
    w3 <- fork (test 100000000)
    (r, ws) <- wait_first [w1, w2, w3]
    putStrLn ("First result: " ++ show r)
    rs <- wait_all ws
    putStrLn ("Remaining results: " ++ show rs)

test :: Integer -> IO Integer
test i = do
    let r = last [1..i]
    putStrLn ("Result " ++ show r)
    return r
You might recognise the Wait type as being identical to TMVar, although I use a slightly different set of operations. Throw Control.Concurrent.STM.TChan into the mix, and you have some very rich possibilities indeed.
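For comparison, a minimal sketch of fork/wait written directly against TMVar (forkTM and waitTM are made-up names; readTMVar, unlike takeTMVar, leaves the result in place, so several callers can wait on it):

import Control.Concurrent
import Control.Concurrent.STM

forkTM :: IO a -> IO (TMVar a)
forkTM m = do
    w <- atomically newEmptyTMVar
    forkIO (m >>= atomically . putTMVar w)
    return w

waitTM :: TMVar a -> IO a
waitTM = atomically . readTMVar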

On 11/27/07, Matthew Brecknell wrote:
wait_first :: [Wait a] -> IO (a, [Wait a])
wait_first [] = error "wait_first: nothing to wait for"
wait_first ws = atomically (do_wait ws)
  where
    do_wait [] = retry
    do_wait (Wait w : ws) = do
        r <- readTVar w
        case r of
            Nothing -> fmap (second (Wait w:)) (do_wait ws)
            Just s  -> return (s, ws)
Interesting, although this seems like a perfect use for "orElse":
wait_stm :: Wait a -> STM a
wait_stm (Wait w) = readTVar w >>= maybe retry return

wait :: Wait a -> IO a
wait w = atomically $ wait_stm w

wait_first :: [Wait a] -> IO (a, [Wait a])
wait_first [] = error "wait_first: nothing to wait for"
wait_first ws = atomically (do_wait ws)
  where
    do_wait [] = retry
    do_wait (w : ws) =
        (do r <- wait_stm w
            return (r, ws))
        `orElse` fmap (second (w:)) (do_wait ws)

Ryan Ingram said:
Interesting, although this seems like a perfect use for "orElse":
(...)
Indeed, that is very nice. I see now that orElse allows wait_stm to compose easily, so you don't need to keep opening up the insides of the Wait variable.

Don Stewart wrote:
Beautiful code can be very efficient.
Indeed, as I recently had the opportunity to demonstrate to my co-workers by re-implementing a script written in Perl that turned out to be just a bit too slow to be useful in practice. My version was not only one tenth the code size and about 20 times faster, it also revealed bugs in the original implementation as well as in the specification. It was fun, too.

For instance, the OP's question reminded me of a little generic wrapper I wrote -- more or less for my own amusement -- during the course of this project. It outputs dots during an operation that might take a little longer to finish (a database query in my case)... just so the user doesn't get nervous ;-) And because I enjoyed it so much (and to show off) I threw in the timing measurement...

module Gimmick where

import Control.Concurrent
import Control.Exception
import System.CPUTime
import System.IO

tickWhileDoing :: String -> IO a -> IO a
tickWhileDoing msg act = do
    hPutStr stderr msg >> hPutChar stderr ' ' >> hFlush stderr
    start_time <- getCPUTime
    tickerId <- forkIO ticker
    res <- act `finally` killThread tickerId
    stop_time <- getCPUTime
    let time_diff = realToFrac (stop_time - start_time) / 1e12
    hPutStrLn stderr $ " done (took us " ++ show time_diff ++ " seconds)"
    return res
  where
    ticker = do
        hPutChar stderr '.' >> hFlush stderr
        threadDelay 100000 {- microseconds -}
        ticker

I think nobody in his right mind would even try to do something like that in C or Perl or whatever, at least not if it wasn't strictly a requirement and correct operation is important (the script gets executed as part of our build process, and a subtle concurrency bug could lead to a wrong configuration for the target control system). In Haskell it was so easy to do that I just couldn't resist.

Cheers
Ben

PS (completely off-topic, sorry): I've been using the collections library throughout the project & I must say it is a lot nicer to work with than the base library mumble-jumble of duplicate interfaces, qualified imports and whatnot. The only disadvantages are that the API is not yet as complete as e.g. Data.Map, and that I have to manually hide name-clashing Prelude functions in almost every module. Many thanks to Jean-Philippe Bernardy for pioneering this work.
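A hypothetical use of the wrapper, just to show the shape (the slow action is made up; evaluate keeps the work inside the timed region):

import Control.Exception (evaluate)
import Gimmick (tickWhileDoing)

main :: IO ()
main = do
    r <- tickWhileDoing "crunching numbers" (evaluate (sum [1 .. 50000000 :: Integer]))
    print r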

Ben Franksen writes:
PS (completely off-topic, sorry): I've been using the collections library throughout the project & I must say it is a lot nicer to work with
I tried to Google for this, and ended up at

http://hackage.haskell.org/trac/ghc/wiki/CollectionClassFramework

The only link that seems to work is the one that is marked as outdated. I've replaced them with a link to Hackage, but somebody who knows more about this might want to recheck the facts on the page.

-k
--
If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde wrote:
Ben Franksen writes:
PS (completely off-topic, sorry): I've been using the collections library throughout the project & I must say it is a lot nicer to work with
I tried to Google for this, and ended up at
http://hackage.haskell.org/trac/ghc/wiki/CollectionClassFramework
The only link that seems to work is the one that is marked as outdated. I've replaced them with a link to Hackage, but somebody who knows more about this might want to recheck the facts on the page.
It currently lives here as a darcs repo:

http://code.haskell.org/collections/collections-ghc6.8

and is in the process of being 6.8ified and split up into separate smaller packages for hackage. I think one of the problems with it as one package (apart from its size) is that different bits of it were in different states of real-world readiness. Some of it was quite stable (e.g. all the AVL tree stuff and the Data.Map/Set clones) and some was still actively being worked on (the Data.Trie.General stuff), and this kinda stopped a stable hackage package for everything.

I recently withdrew from this project and offered up the libs I'd been working on as they are for a new owner. Didn't get any takers though (no surprises there!). I've always found the lack of apparent interest in all this somewhat puzzling myself. It's not as if there's no latent demand for efficient collections. (Data.Map is probably the most regularly whined about of all the "standard" libs.)

Regards
--
Adrian Hey

[snip]
I recently withdrew from this project and offered up the libs I'd been working on as they are for a new owner. Didn't get any takers though (no surprises there!). I've always found the lack of apparent interest in all this somewhat puzzling myself. It's not as if there's no latent demand for efficient collections. (Data.Map is probably the most regularly whined about of all the "standard" libs.)
FWIW, I find the same phenomenon with Edison. I get very little feedback about it, positive or negative; I really have no idea how many people are using it. I guess people are more willing to roll their own data structures or use the standard libs. It might be from a desire to limit dependencies. If that's the case, perhaps continuing cabal developments will change that.

On Wed, 28 Nov 2007, Robert Dockins wrote:
FWIW, I find the same phenomenon with Edison. I get very little feedback about it, positive or negative; I really have no idea how many people are using it. I guess people are more willing to roll their own data structures or use the standard libs.
It might be from a desire to limit dependencies.
For me this is a strong reason, yes. It's not only the immediate dependency, but the dependent library might rely on other libraries or compiler features and thus decreases portability.
If that's the case, perhaps continuing cabal developments will change that.
I hope that this will solve the problem. I recently learned that Edison contains EnumSet, which I think is useful for many FFIs, so I will certainly use Edison in future.

A safer gimmick...

Ben Franksen wrote:
tickWhileDoing :: String -> IO a -> IO a
tickWhileDoing msg act = do
    hPutStr stderr msg >> hPutChar stderr ' ' >> hFlush stderr
    start_time <- getCPUTime
    tickerId <- forkIO ticker
... an async exception here will leave the ticker running....
res <- act `finally` killThread tickerId
The best way to make this safe that I know of is:
res <- block $ do
    tickerId <- forkIO ticker
    unblock act `finally` killThread tickerId
    stop_time <- getCPUTime
    let time_diff = realToFrac (stop_time - start_time) / 1e12
    hPutStrLn stderr $ " done (took us " ++ show time_diff ++ " seconds)"
    return res
  where
    ticker = do
        hPutChar stderr '.' >> hFlush stderr
        threadDelay 100000 {- microseconds -}
        ticker
(...)

Belatedly I realized that this answer should have gone to the list:

---------------

On Wednesday, 28 November 2007, ChrisK wrote:
A safer gimmick...
Ben Franksen wrote:
tickWhileDoing msg act = do
    (...)
    tickerId <- forkIO ticker
... an async exception here will leave the ticker runnning....
res <- act `finally` killThread tickerId
Thanks for spotting this loophole. I keep forgetting that people tend to hit Ctrl-C whenever they feel like it... ;-) Thinking some more about this, I realise that the async exception could also come from somewhere inside the Haskell program (e.g. from a killThread, like I did myself in the next line). So the fix below makes this whole thing more robust indeed.
The best way to make this safe that I know of is:
res <- block $ do
    tickerId <- forkIO ticker
    unblock act `finally` killThread tickerId
Yes.

Cheers
Ben

ChrisK wrote:
A safer gimmick...
Ben Franksen wrote:
tickWhileDoing msg act = do
    (...)
    tickerId <- forkIO ticker
... an async exception here will leave the ticker running....
res <- act `finally` killThread tickerId
The best way to make this safe that I know of is:
res <- block $ do
    tickerId <- forkIO ticker
    unblock act `finally` killThread tickerId
...but with a change that Simon M just checked in to GHC head, this will now spawn 'ticker' in blocked state, so you won't be able to kill it. You would therefore want

unblock $ forkIO ticker

or

forkIO $ unblock ticker

I'm not sure if there is a strong reason to prefer one over the other.

Jules

Jules Bean wrote:
ChrisK wrote:
A safer gimmick...
Ben Franksen wrote:
tickWhileDoing msg act = do
    (...)
    tickerId <- forkIO ticker
... an async exception here will leave the ticker running....
res <- act `finally` killThread tickerId
The best way to make this safe that I know of is:
res <- block $ do
    tickerId <- forkIO ticker
    unblock act `finally` killThread tickerId
...but with a change that Simon M just checked in to GHC head, this will now spawn 'ticker' in blocked state, so you won't be able to kill it. You would therefore want

unblock $ forkIO ticker

or

forkIO $ unblock ticker
I'm not sure if there is a strong reason to prefer one over the other.
Jules
That is new. Ah, I see GHC.Conc.forkIO now has a note:
GHC note: the new thread inherits the /blocked/ state of the parent (see 'Control.Exception.block').
BUT... doesn't this change some of the semantics of old code that used forkIO? I wanted a way to control the blocked status of new threads, since this makes it easier to be _sure_ some race conditions will never happen. And so my new preferred way of writing this is now:
-- we are in the parent's blocked state, so unblock the ticker explicitly:
res <- bracket (forkIO (unblock ticker)) killThread (const act)
-- act runs in the parent's blocked state
-- Chris

ChrisK wrote:
That is new. Ah, I see GHC.Conc.forkIO now has a note:
GHC note: the new thread inherits the /blocked/ state of the parent (see 'Control.Exception.block').
BUT...doesn't this change some of the semantics of old code that used forkIO ?
Yes, it is a change to the semantics. I assumed (naively) that virtually nobody would be using forkIO inside block, and so the change would be benign. It is (another) departure from the semantics in the Asynchronous Exceptions paper. Still, I think this is the right thing.
I wanted a way to control the blocked status of new threads, since this makes it easier to be _sure_ some race conditions will never happen.
And so my new preferred way of writing this is now:
-- we are in the parent's blocked state, so unblock the ticker explicitly:
res <- bracket (forkIO (unblock ticker)) killThread (const act)
-- act runs in the parent's blocked state
In this case the unblock isn't strictly necessary, because the ticker thread spends most of its time in threadDelay, which is interruptible anyway.

Cheers,
Simon

On 28/11/2007, Ben Franksen wrote:
It was fun, too. For instance, the OP's question reminded me of a little generic wrapper I wrote -- more or less for my own amusement -- during the course of this project. It outputs dots during an operation that might take a little longer to finish (a database query in my case)... just so the user doesn't get nervous ;-) And because I enjoyed it so much (and to show off) I threw in the timing measurement... [...] I think nobody in his right mind would even try to do something like that in C or Perl or whatever, at least not if it wasn't strictly a requirement and correct operation is important (the script gets executed as part of our build process and a subtle concurrency bug could lead to a wrong configuration for the target control system). In Haskell it was so easy to do that I just couldn't resist.
That's a neat idea. Just (a) because I like the idea, and (b) because I'm contrary :-) I coded up the equivalent in Python. It also looks beautifully clean:

from __future__ import with_statement
import threading
import sys

# Implementation of Ticker class
class Ticker(threading.Thread):
    def __init__(self, msg):
        threading.Thread.__init__(self)
        self.msg = msg
        self.event = threading.Event()
    def __enter__(self):
        self.start()
    def __exit__(self, ex_type, ex_value, ex_traceback):
        self.event.set()
        self.join()
    def run(self):
        sys.stdout.write(self.msg)
        while not self.event.isSet():
            sys.stdout.write(".")
            sys.stdout.flush()
            self.event.wait(1)

# Here's how we use it...
if __name__ == '__main__':
    import time
    with Ticker("A test"):
        time.sleep(10)
    with Ticker("Second test"):
        time.sleep(5)
        raise Exception("Bang!")

Paul.

Paul Moore wrote:
On 28/11/2007, Ben Franksen wrote:
(...)
That's a neat idea. Just (a) because I like the idea, and (b) because I'm contrary :-) I coded up the equivalent in Python. It also looks beautifully clean: [snipped python code]
Looks good to me. I'll walk back on the "or whatever" with regard to Python... ;-)

Cheers
Ben

After I have spawned a thread with 'forkIO', how can I check whether that thread's work has finished already? Or wait for it?
The best way to do this is using Control.Exception.finally: (...)
Changing ugly code for bad performance is not that usual in Haskell code :(
I think you misunderstand Chris' remark. He's saying that MVars and forkIO give you both clean control and high performance.
Sorry if I sound rude. I just saw a place for a small joke, and used it. Chris's code is pretty elegant for what it is supposed to do. However, knowing if a thread has finished is just 1 bit of information. There's probably a reason why that would hurt performance, but I don't understand it. For most situations, I believe you want to know when a thread has finished, and having that in the implementation would probably be more efficient than creating an MVar for each one. Please understand that I'm not criticising anyone's work, I just want to understand it better. Threads are a deep problem with many issues involved, and I have no proper knowledge of any of them.
This code seems quite elegant, for the job you were asking:
import Control.Concurrent
import Control.Exception
main = do
    done <- run (print (last [1..100000000]))
    print "Waiting...."
    takeMVar done
    print "OK."
  where (...)
Sorry, I don't agree. I try to write things in a way that when you read it you can get an intuition on why it's doing what it's doing; even when the code is for my reading only (which, in Haskell, is almost always the case). For instance: in the code I'm writing now, I need to know if threads have finished, since only then can I use the files they generate. So, instead of checking if threads have finished, I decided to check if files exist and are available for writing. When I read 'takeMVar done', it's difficult to see why you would want to read a value you are never going to use. But, of course, maybe this is just my prejudice, and if I understood anything about threads I would have a different feeling about it.
And the lovely thread-ring benchmark, is also very nice:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=threadring&lang=all
(...)
Sorry, I don't think that's nice either. I read the description of the task, and it's one where in the real world you would never use threads, except for benchmarking threads. Of course, that is important for many people, like those who study threads and their implementation. But do you know of a benchmark where the task is some kind of situation where you actually get a result faster by using threads than by using a single thread?

Thanks,
Maurício

On Nov 28, 2007 5:07 PM, Maurício wrote:
Sorry if I sound rude. I just saw a place for a small joke, and used it. Chris's code is pretty elegant for what it is supposed to do. However, knowing if a thread has finished is just 1 bit of information. There's probably a reason why that would hurt performance, but I don't understand it.
Most threads either communicate some result (and you'll care about setting up a channel for that) or run forever. Some threads run on different computation engines. There's nothing in the Haskell spec that says I have to run the threads on a shared-memory machine. If the threads are distributed, then the channel to communicate back that one has finished may be very expensive.

-Brian

--
Brian T. Sniffen bts@alum.mit.edu or brian.sniffen@gmail.com
http://www.evenmere.org/~bts

On Nov 28, 2007 11:07 PM, Maurício wrote:
Sorry, I don't agree. I try to write things in a way that when you read it you can get an intuition on why it's doing what it's doing; even when the
That's what comments are for :)
generate. So, instead of checking if threads have finished, I decided to check if files exist and are available for writing. When I read 'takeMVar
Checking a file is non-blocking, right? So you have to loop until the file becomes available. Taking an MVar, on the other hand, blocks your thread until the other one finishes, without using CPU time, etc.
know of a benchmark where the task is some kind of situation where you actually get a result faster by using threads than by using a single thread?
Threads won't give you a speedup unless you run the program on a multi-core/multi-proc machine. They help make the program simpler, IMHO.

Sorry, I don't agree. I try to write things in a way that when you read it you can get an intuition on why it's doing what it's doing; (...)
That's what comments are for :)
You're right. I should stop my habit of using a script to remove comments before reading other people's code :)
generate. So, instead of checking if threads have finished, I decided to check if files exist and are available for writing. When I read 'takeMVar (...)
Checking a file is non-blocking, right? (...)
Yes. That was a completely wrong idea. I just decided not to use threads this time.
Threads won't give you a speedup unless you run the program on a multi-core/multi-proc machine. They help make the program simpler, IMHO.
I also agree. Properly threaded programs are a LOT easier to read.

Maurício

david48 wrote:
Threads won't give you a speedup unless you run the program on a multi-core/multi-proc machine.
That's actually not true. Threads also let you manage your blocking IO better, and not letting IO block your whole program can certainly speed it up by a couple of orders of magnitude (see the sketch below).
They help make the program simpler, IMHO.
They can. They can make it more complex, too :)

Jules
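A minimal sketch of that point, with threadDelay standing in for a blocking call such as a network read:

import Control.Concurrent

main :: IO ()
main = do
    d1 <- newEmptyMVar
    d2 <- newEmptyMVar
    forkIO (threadDelay 1000000 >> putMVar d1 "first")
    forkIO (threadDelay 1000000 >> putMVar d2 "second")
    -- the two one-second waits overlap, so this takes about one second
    -- rather than two, even on a single core
    takeMVar d1 >>= putStrLn
    takeMVar d2 >>= putStrLn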
participants (21)
- Adrian Hey
- Andrew Coppin
- Ben Franksen
- Brad Clow
- Brandon S. Allbery KF8NH
- Brian Sniffen
- Bryan O'Sullivan
- ChrisK
- Dan Weston
- david48
- Don Stewart
- Henning Thielemann
- Jules Bean
- Ketil Malde
- Matthew Brecknell
- Maurício
- Paul Moore
- Robert Dockins
- Ryan Ingram
- Simon Marlow
- Spencer Janssen