
Jesper Louis Andersen wrote:
On Thu, Mar 4, 2010 at 8:35 PM, Neil Brown
wrote: CML is indeed the library that has the most markedly different behaviour. In Haskell, the CML package manages to produce timings like this for fairly simple benchmarks:
%GC time 96.3% (96.0% elapsed)
I knew from reading the code that CML's implementation would do something like this, although I do wonder if it triggers some pathological case in the GC.
That result is peculiar. What are you doing to the library, and what do you expect happens? Since I have some code invested on top of CML, I'd like to gain a little insight if possible.
In trying to simplify my code, the added time has moved from GC time to EXIT time (and increased!). This shift isn't too surprising -- I believe the time is really spent trying to kill lots of threads. Here's my very simple benchmark; the main thread repeatedly chooses between receiving from two threads that are sending to it: ==== import Control.Concurrent import Control.Concurrent.CML import Control.Monad main :: IO () main = do let numChoices = 2 cs <- replicateM numChoices channel mapM_ forkIO [replicateM_ (100000 `div` numChoices) $ sync $ transmit c () | c <- cs] replicateM_ 100000 $ sync $ choose [receive c (const True) | c <- cs] ==== Compiling with -threaded, and running with +RTS -s, I get: INIT time 0.00s ( 0.00s elapsed) MUT time 2.68s ( 3.56s elapsed) GC time 1.84s ( 1.90s elapsed) EXIT time 89.30s ( 90.71s elapsed) Total time 93.82s ( 96.15s elapsed) I think the issue with the CML library is that it spawns a lot of threads (search the source for forkIO: http://hackage.haskell.org/packages/archive/cml/0.1.3/doc/html/src/Control-C...). Presumably the Haskell RTS isn't optimised for this approach (maybe the ML RTS was, from what you said?), and at the end of the program it spends a lot of time reaping the threads. The problem isn't nearly as bad if you don't use choose, though: ==== import Control.Concurrent import Control.Concurrent.CML import Control.Monad main :: IO () main = do c <- channel forkIO $ replicateM_ 100000 $ sync $ transmit c () replicateM_ 100000 $ sync $ receive c (const True) ==== I get: INIT time 0.00s ( 0.00s elapsed) MUT time 1.92s ( 2.65s elapsed) GC time 0.92s ( 0.93s elapsed) EXIT time 0.00s ( 0.02s elapsed) Total time 2.65s ( 3.59s elapsed) %GC time 34.6% (25.8% elapsed) Hope that helps, Neil.