2009/8/21 Don Stewart <dons@galois.com>

paolo.veronelli:

> Hi, reading a previous thread I got interested.
> I simplified the example pointed by dons in
>
> import Control.Parallel
>
> main = a `par` b `pseq` print (a + b)
>   where
>     a = ack 3 11
>     b = ack 3 11
>
> ack 0 n = n+1
> ack m 0 = ack (m-1) 1
> ack m n = ack (m-1) (ack m (n-1))
>
> compiled with
> ghc --make prova -O2 -threaded
>
> timings
> paolino@paolino-casa:~$ time ./prova +RTS -N1
> 32762
>
> real 0m7.031s
> user 0m6.304s
> sys 0m0.004s
> paolino@paolino-casa:~$ time ./prova +RTS -N2
> 32762
>
> real 0m6.997s
> user 0m6.728s
> sys 0m0.020s
> paolino@paolino-casa:~$
>
> without optimizations it gets worse
>
> paolino@paolino-casa:~$ time ./prova +RTS -N1
> 32762
>
> real 1m20.706s
> user 1m18.197s
> sys 0m0.104s
> paolino@paolino-casa:~$ time ./prova +RTS -N2
> 32762
>
> real 1m38.927s
> user 1m45.039s
> sys 0m0.536s
> paolino@paolino-casa:~$
>
> staring at the resource usage graph I can see it does use 2 cores when told to
> do it, but with -N1 the used cpu goes 100% and with -N2 they both run just over
> 50%
>
> thanks for comments

Firstly, a and b are identical expressions, so GHC commons them up
(common subexpression elimination). The compiler effectively transforms
the program into:

a `par` a `seq` print (a + a)

So you essentially create a spark to evaluate 'a', and then have the main
thread evaluate that same 'a' as well. Whichever finishes first wins, and
the result is added to itself. The runtime may also choose not to convert
your spark into a thread at all.

Running with a 2009 GHC HEAD snapshot, +RTS -sstderr reports:

SPARKS: 1 (0 converted, 0 pruned)

So indeed, the runtime doesn't convert your `par` spark into a real thread.

Whereas, for example, the hello world example on the wiki:

http://haskell.org/haskellwiki/Haskell_in_5_steps

converts 2 sparks to 2 threads:

SPARKS: 2 (2 converted, 0 pruned)
./B +RTS -threaded -N2 -sstderr 2.13s user 0.04s system 137% cpu 1.570 total
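
For what it's worth, here is a minimal sketch of the same idea applied to
your program. It assumes that giving 'a' and 'b' different arguments is
enough to stop them being commoned up, since they are no longer the same
expression. The arguments (9 and 10) and the Int signature are my own
choices to keep the run short, and I haven't timed it, so whether the spark
is converted and you see a speedup is something to check with
+RTS -N2 -sstderr.

```haskell
import Control.Parallel (par, pseq)

-- Ackermann function, as in the original post.
-- The Int signature is an assumption; the original left the type to defaulting.
ack :: Int -> Int -> Int
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))

main :: IO ()
main = a `par` (b `pseq` print (a + b))  -- prints 12282
  where
    -- Two different expressions, so they cannot be merged into one binding.
    a = ack 3 9
    b = ack 3 10
```

Here 'a' is sparked while the main thread evaluates 'b', and only then is
the sum demanded, so the two evaluations at least have the chance to
overlap.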

-- Don