Performance of parallel mergesort

For some reason all traffic from this beginners list stopped reaching me after the 13th of December. Thanks for the response, Antoine. I've tried to install more recent versions of GHC (6.10 and 6.12) but I cannot get either of them to work at all. GHC 6.8 generates code that segfaults and the GHC 6.10 from Debian testing cannot compile anything for me. So I thought I'd try GHC 6.12. That isn't in any repo so I decided to build it from source. I uninstalled all other GHCs to give it a fresh start. The GHC page says "Stop, install the Haskell Platform" but the Haskell Platform says you must install GHC first, so I've got a nice circular dependency right from the start! So I tried installing GHC 6.12 from source but that requires GHC to be installed. So I installed GHC 6.8 (the one that segfaults) using apt again from Debian. That seems to have built a GHC 6.12 that I can run from the command line but it cannot compile that program because it doesn't have parallel stuff. So I thought I'd install the Haskell Platform. Unfortunately, the Haskell Platform configure script gives the nonsensical error that I must "upgrade" from GHC 6.12 down to GHC 6.10. I was getting pretty run down by this point so I just told it to sod off using the --enable-unsupported-ghc-version command line option. The Haskell Platform's configure script then completed but building it failed with: [21 of 21] Compiling Graphics.UI.GLUT ( Graphics/UI/GLUT.hs, dist/build/Graphics/UI/GLUT.p_o ) Registering GLUT-2.1.1.2... Setup: GLUT-2.1.1.2: dependency "OpenGL-2.2.1.1-182b091280ce0de861295bc592bae77c" doesn't exist (use --force to override) Error: Building the GLUT-2.1.1.2 package failed make: *** [build.stamp] Error 2 I have glut and use it all the time from OCaml without a hitch so I've no idea what the problem is here. Oh well, I just discovered that installing GHC 6.12 has at least fixed GHC 6.10 so it can now compile and run that mergesort. Oh FFS, spoke too soon: $ time ./mergesort +RTS -N8 Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it. real 0m38.801s user 4m0.851s sys 0m1.308s -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e

On Thu, Dec 24, 2009 at 2:24 AM, Jon Harrop
For some reason all traffic from this beginners list stopped reaching me after the 13th of December.
Thanks for the response, Antoine. I've tried to install more recent versions of GHC (6.10 and 6.12) but I cannot get either of them to work at all.
GHC 6.8 generates code that segfaults and the GHC 6.10 from Debian testing cannot compile anything for me. So I thought I'd try GHC 6.12. That isn't in any repo so I decided to build it from source. I uninstalled all other GHCs to give it a fresh start. The GHC page says "Stop, install the Haskell Platform" but the Haskell Platform says you must install GHC first, so I've got a nice circular dependency right from the start!
So I tried installing GHC 6.12 from source but that requires GHC to be installed. So I installed GHC 6.8 (the one that segfaults) using apt again from Debian. That seems to have built a GHC 6.12 that I can run from the command line but it cannot compile that program because it doesn't have parallel stuff. So I thought I'd install the Haskell Platform.
The Haskell Platform is currently the recommended way to go, but it won't be up on GHC 6.12 until some time in the new year (I don't know their timelines). For me, I didn't have a working GHC 6.12 setup until this morning. The compiler is ready, but we're still working on moving the libraries over to the new major version. Maybe compiling your own 6.10.4 would be the best choice until the next platform release. If you're willing to dive in, the version of cabal-install that works for GHC 6.12.1 is out now, but it does require some boot-strapping to get going, as GHC no longer ships libraries that aren't required to build GHC. But back to the GHC panic, it might be worth posting that error to the GHC-users list I mentioned earlier. Antoine

On Thursday 24 December 2009 02:55:33 you wrote:
The Haskell Platform is currently the recommended way to go, but it won't be up on GHC 6.12 until some time in the new year (I don't know their timelines).
I see, thanks.
For me, I didn't have a working GHC 6.12 setup until this morning. The compiler is ready, but we're still working on moving the libraries over to the new major version.
Maybe compiling your own 6.10.4 would be the best choice until the next platform release.
Is that better than using the 6.10.4 from Debian (and the haskell-platform package)?
If you're willing to dive in, the version of cabal-install that works for GHC 6.12.1 is out now, but it does require some boot-strapping to get going, as GHC no longer ships libraries that aren't required to build GHC.
Yeah, I tried installing cabal from source as well but couldn't get it to work either.
But back to the GHC panic, it might be worth posting that error to the GHC-users list I mentioned earlier.
I cannot reproduce it with GHC 6.10 so it has probably long since been fixed (6.8 is two years old!). -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e

The mergesort example given here has a rather nasty problem – it creates
sparks for mergeing the two lists at the same time as it creates the lists,
so there's always at least one task blocking waiting for it's producers.
This variation on a theme goes roughly twice as fast for me with -N1, and
*does* get a speedup from -N<somethinghigher>:
mergesort [] = []
mergesort [x] = [x]
mergesort [x,y] = if x < y then [x,y] else [y,x]
mergesort xs = (forceList sleft) `par`
(forceList sright) `pseq`
merge sleft sright
where
(left,right) = split (length xs `div` 2) xs
sleft = mergesort left
sright = mergesort right
Note the `pseq` not `par` before the merge. I also added one more
short-circuit case – we want to avoid spawning threads for tiny tasks like
sorting 1 element lists.
Here's my results on a dual core running OS X:
LappyBob:~ tatd2$ time ./test +RTS -N1
-0.117109518058233
real 0m4.608s
user 0m4.400s
sys 0m0.189s
LappyBob:~ tatd2$ time ./test +RTS -N2
-0.117109518058233
real 0m3.648s
user 0m6.360s
sys 0m0.220s
LappyBob:~ tatd2$ time ./test +RTS -N3
-0.117109518058233
real 0m50.679s
user 1m24.235s
sys 0m0.620s
The last result is still off the wall, but it's a lot better.
Bob
On Thu, Dec 24, 2009 at 2:24 AM, Jon Harrop
For some reason all traffic from this beginners list stopped reaching me after the 13th of December.
Thanks for the response, Antoine. I've tried to install more recent versions of GHC (6.10 and 6.12) but I cannot get either of them to work at all.
GHC 6.8 generates code that segfaults and the GHC 6.10 from Debian testing cannot compile anything for me. So I thought I'd try GHC 6.12. That isn't in any repo so I decided to build it from source. I uninstalled all other GHCs to give it a fresh start. The GHC page says "Stop, install the Haskell Platform" but the Haskell Platform says you must install GHC first, so I've got a nice circular dependency right from the start!
So I tried installing GHC 6.12 from source but that requires GHC to be installed. So I installed GHC 6.8 (the one that segfaults) using apt again from Debian. That seems to have built a GHC 6.12 that I can run from the command line but it cannot compile that program because it doesn't have parallel stuff. So I thought I'd install the Haskell Platform.
Unfortunately, the Haskell Platform configure script gives the nonsensical error that I must "upgrade" from GHC 6.12 down to GHC 6.10. I was getting pretty run down by this point so I just told it to sod off using the --enable-unsupported-ghc-version command line option. The Haskell Platform's configure script then completed but building it failed with:
[21 of 21] Compiling Graphics.UI.GLUT ( Graphics/UI/GLUT.hs, dist/build/Graphics/UI/GLUT.p_o ) Registering GLUT-2.1.1.2... Setup: GLUT-2.1.1.2: dependency "OpenGL-2.2.1.1-182b091280ce0de861295bc592bae77c" doesn't exist (use --force to override)
Error: Building the GLUT-2.1.1.2 package failed make: *** [build.stamp] Error 2
I have glut and use it all the time from OCaml without a hitch so I've no idea what the problem is here.
Oh well, I just discovered that installing GHC 6.12 has at least fixed GHC 6.10 so it can now compile and run that mergesort. Oh FFS, spoke too soon:
$ time ./mergesort +RTS -N8 Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize' to increase it.
real 0m38.801s user 4m0.851s sys 0m1.308s
-- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

On Thu, Dec 24, 2009 at 10:09 AM, Tom Davie
The mergesort example given here has a rather nasty problem – it creates sparks for mergeing the two lists at the same time as it creates the lists, so there's always at least one task blocking waiting for it's producers.
Don't worry too much about this thread, here is why : - Jon Harrop is definitely not a beginner in Haskell land and has no business asking this question on this list - Jon Harrop is very well known on established functional programming mailing lists and forum as a very disagreeable troll (not only on Haskell). He love to trash whatever language hasn't his favours at the moment (I think right now he likes F#, which is a very interesting language nonetheless). - He apparently made an effort to include approximately all elements that could hurt Haskell reputation in this thoughtful gift to the community in this Christmas period : ** As you pointed the initial code he chose doesn't parallelize very well basically ** He tried to compile it with GHC 6.8 which was an important step towards better internals to handle parallelism in GHC but was also a time of big upheaval in this field and so not as stable as one could wish. ** He then installed GHC 6.10 and did not even clean up his previous compilations attempts before declaring that it didn't work (the error message he cited is usually observed when GHC encounters an interface file created by a previous version of GHC or on another architecture) at all without testing on some other clean sources. ** He then apparently decided that using a binary package for the 6.12 version was beneath its dignity and went on to install it from source (generally not recommended except for GHC developers that needs to work on the code or for port to unusual architectures) and claimed that there was a "circularity problem" since its 6.10 could not compile it (which it obviously could, as he discovered and qualified as "miraculous", not even considering that he had a working 6.8 before). I could choose to believe that Harrop really is as incompetent as he sounds and really did encounter all those problems (which are all possibles if unlikely to be met in quick succession) but the simple fact that he decided to post this message on the Haskell-Beginner list instead of the Haskell-cafe list where he's already well known and an occasional poster makes it that much more unlikely. The other interpretation is that JdH is up to his usual tricks, just sinking to a new low by posting in a venue where he's less likely to be recognized and more likely to worry unwary would-be-haskell-beginners. Merry Christmas ! -- Jedaï

On Thursday 24 December 2009 13:24:30 you wrote:
** As you pointed the initial code he chose doesn't parallelize very well basically
I Googled for "parallel mergesort haskell" and found that code in lecture notes by an associate university professor who has pages of published papers on his specialist topic of parallel programming in Haskell. I had no idea it was bad code.
** He tried to compile it with GHC 6.8 which was an important step towards better internals to handle parallelism in GHC but was also a time of big upheaval in this field and so not as stable as one could wish.
I didn't know that.
** He then installed GHC 6.10 and did not even clean up his previous compilations attempts before declaring that it didn't work (the error message he cited is usually observed when GHC encounters an interface file created by a previous version of GHC or on another architecture) at all without testing on some other clean sources.
Looks like 6.12 fixed that problem.
** He then apparently decided that using a binary package for the 6.12 version was beneath its dignity and went on to install it from source (generally not recommended except for GHC developers that needs to work on the code or for port to unusual architectures)
I had no idea that building GHC from source was such a big deal. I'll use the binaries.
and claimed that there was a "circularity problem" since its 6.10 could not compile it (which it obviously could, as he discovered and qualified as "miraculous", not even considering that he had a working 6.8 before).
The circularity problem is that the GHC and Haskell Platform download pages both say that you must install the other first: "Stop! For most users, we recommend installing the Haskell Platform instead of GHC." - http://www.haskell.org/ghc/download_ghc_6_12_1.html "You only need GHC installed to get started:" - http://hackage.haskell.org/platform/ I wasn't sure how to proceed so I guess and apparently got it wrong. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e

On Thursday 24 December 2009 09:09:41 you wrote:
The mergesort example given here has a rather nasty problem – it creates sparks for mergeing the two lists at the same time as it creates the lists, so there's always at least one task blocking waiting for it's producers.
I see. So the sparks either contend for locks on the thunks to force that list or they repeat the same work.
This variation on a theme goes roughly twice as fast for me with -N1, and *does* get a speedup from -N<somethinghigher>:
mergesort [] = [] mergesort [x] = [x] mergesort [x,y] = if x < y then [x,y] else [y,x] mergesort xs = (forceList sleft) `par` (forceList sright) `pseq` merge sleft sright where (left,right) = split (length xs `div` 2) xs sleft = mergesort left sright = mergesort right
Note the `pseq` not `par` before the merge.
Can you explain why that helps?
I also added one more short-circuit case – we want to avoid spawning threads for tiny tasks like sorting 1 element lists.
This is something that concerns me. Lots of discussions of parallelism, including the Haskell literature I have read, neglect this critical problem of making sure that more time is spent doing useful work than spawning tasks (sparks). How is this solved in Haskell? Do you just put magic numbers in that work on the machine you're currently using? Also, how much work does a Haskell thread usually have to do to make it worth sparking?
Here's my results on a dual core running OS X:
LappyBob:~ tatd2$ time ./test +RTS -N1 -0.117109518058233
real 0m4.608s user 0m4.400s sys 0m0.189s LappyBob:~ tatd2$ time ./test +RTS -N2 -0.117109518058233
real 0m3.648s user 0m6.360s sys 0m0.220s LappyBob:~ tatd2$ time ./test +RTS -N3 -0.117109518058233
real 0m50.679s user 1m24.235s sys 0m0.620s
The last result is still off the wall, but it's a lot better.
That's another question: why did you get such awful performance for N=3 and I get similar results for N=8 on this 8 core? -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e

On Thursday 24 December 2009 09:09:41 Tom Davie wrote:
Here's my results on a dual core running OS X:
LappyBob:~ tatd2$ time ./test +RTS -N1 -0.117109518058233
real 0m4.608s user 0m4.400s sys 0m0.189s LappyBob:~ tatd2$ time ./test +RTS -N2 -0.117109518058233
real 0m3.648s user 0m6.360s sys 0m0.220s LappyBob:~ tatd2$ time ./test +RTS -N3 -0.117109518058233
real 0m50.679s user 1m24.235s sys 0m0.620s
Here are my results on a dual quad-core: $ ghc-6.10.4 --make -O2 -threaded mergesort.hs -o mergesort [1 of 1] Compiling Main ( mergesort.hs, mergesort.o ) Linking mergesort ... jdh30@leper:~/Programs/mine/haskell/mergesort$ time ./mergesort +RTS -K100000000 +RTS -N1 -0.117109518058233 real 0m7.664s user 0m7.528s sys 0m0.136s $ time ./mergesort +RTS -K100000000 +RTS -N2 -0.117109518058233 real 0m7.224s user 0m9.773s sys 0m0.108s $ time ./mergesort +RTS -K100000000 +RTS -N3 -0.117109518058233 real 0m7.293s user 0m11.637s sys 0m0.168s $ time ./mergesort +RTS -K100000000 +RTS -N4 -0.117109518058233 real 0m7.333s user 0m12.689s sys 0m0.360s $ time ./mergesort +RTS -K100000000 +RTS -N5 -0.117109518058233 real 0m7.449s user 0m14.757s sys 0m0.548s $ time ./mergesort +RTS -K100000000 +RTS -N6 -0.117109518058233 real 0m7.043s user 0m14.985s sys 0m0.252s $ time ./mergesort +RTS -K100000000 +RTS -N7 -0.117109518058233 real 0m8.018s user 0m19.325s sys 0m1.012s $ time ./mergesort +RTS -K100000000 +RTS -N8 -0.117109518058233 real 0m55.963s user 5m55.654s sys 0m1.492s As you can see, there is still no significant speedup from parallelism. -- Dr Jon Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/?e
participants (4)
-
Antoine Latter
-
Chaddaï Fouché
-
Jon Harrop
-
Tom Davie