
Hello Andrew,

Saturday, April 19, 2008, 6:56:10 PM, you wrote:
> OK, so just for fun, I decided to try implementing a parallel merge sort.
Quite a coincidence: I'm currently writing a parallel compression algorithm, very much like parallel bzip2, but using GHC, of course.
> Weird thing #1: The first time you sort the data, it takes a few seconds. The other 7 times, it takes a split second - roughly 100x faster. Wuh?
This looks like a disk caching effect. If the data are read from disk on the first run and from the disk cache on subsequent runs, it only means that your algorithm runs faster than the disk can deliver its input.
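One way to separate disk time from sort time is to force the input fully into memory before starting the clock. A minimal sketch, assuming the `deepseq` and `time` packages (the helper name `timeSort` is my own invention):

```haskell
import Control.DeepSeq (deepseq)   -- from the deepseq package
import Data.Time.Clock (getCurrentTime, diffUTCTime)

-- Time a list-processing function, forcing the input completely
-- beforehand so that disk reads and lazy I/O are excluded from
-- the measurement.
timeSort :: ([Int] -> [Int]) -> [Int] -> IO Double
timeSort sortFn xs = do
  xs `deepseq` return ()           -- fully evaluate the input first
  t0 <- getCurrentTime
  let ys = sortFn xs
  ys `deepseq` return ()           -- fully evaluate the result
  t1 <- getCurrentTime
  return (realToFrac (diffUTCTime t1 t0))
```

With the input forced up front, repeated runs should then report comparable times rather than a 100x difference.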
> Weird thing #2: The parallel version runs *faster* than the sequential one in all cases - even with SMP disabled! (We're only talking a few percent faster, but still.)
> Weird thing #3: Adding the "-threaded" compiler option makes *everything* run a few percent faster. Even with only 1 OS thread.
There are plenty of possible reasons. First, -threaded makes I/O overlap with computation. Second, the parallel version may exhibit better CPU cache behavior, such as processing all the data while it is still in cache before sending it back to memory.
> Weird thing #4: Adding "-N2" makes *everything* slow down a few percent. In particular, Task Manager shows only one CPU core in use.
That's easy: your algorithm isn't really parallel, and with -N2 you have just allowed GHC to move it from one core to another. What you are seeing is the overhead of moving data around. :)
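For contrast, here is a minimal sketch of a merge sort that actually creates parallel sparks, using `par` and `pseq` from Control.Parallel (the names `parSort` and `merge` are my own; compile with -threaded and run with +RTS -N2 to see both cores used):

```haskell
import Control.Parallel (par, pseq)

-- Sort the two halves as separate sparks, then merge.
parSort :: Ord a => [a] -> [a]
parSort []  = []
parSort [x] = [x]
parSort xs  = force left `par` (force right `pseq` merge left right)
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    left     = parSort as
    right    = parSort bs
    -- Force the spine of the list so the spark does real work
    -- instead of just evaluating the outermost constructor.
    force ys = length ys `pseq` ys

-- Standard sequential merge of two sorted lists.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys
```

Note that for small inputs the sparking overhead dominates, so a realistic version would fall back to a sequential sort below some threshold.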
> Can anybody explain any of this behaviour? I have no idea what I'm benchmarking, but it certainly doesn't appear to be the performance of a parallel merge sort!
There are many subtle effects that make optimization much more interesting than applying simple schemas ;) That's why I like it so much :))

--
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com