
Hi all, In "DEFUN 2009: Multicore Programming in Haskell Now!" (http://donsbot.wordpress.com/2009/09/05/defun-2009-multicore-programming-in-...), slide 30 I see: Don't “accidentally parallelize”: – f `par` f + e and that the correct way of achieving parallelism is: – f `par` e `pseq` f + e Actually I don't understand the difference between these two forms. Could any brave soul explain it to me, please? As a bonus question: what is the difference between `seq` and `pseq`? -- Gracjan

On Sat, Sep 05, 2009 at 11:18:24AM +0000, Gracjan Polak wrote:
Hi all,
In "DEFUN 2009: Multicore Programming in Haskell Now!" (http://donsbot.wordpress.com/2009/09/05/defun-2009-multicore-programming-in-...), slide 30 I see:
Don't “accidentally parallelize”: – f `par` f + e
This creates a spark (potential speculative execution) to evaluate 'f', but whether this actually gets instantiated in another thread depends on the order in which the main thread evaluates (f + e): if we get lucky and it decides to work on evaluating 'e' first, then another thread may get a chance to evaluate 'f' in parallel. But if the main thread decides to work on 'f' first then the spark to evaluate 'f' will never get run and we end up with a sequential computation.
and that the correct way of achieving parallelism is: – f `par` e `pseq` f + e
This means: create a spark to evaluate 'f', then evaluate 'e', and then finally evaluate f + e. This ensures that the main thread will work on 'e' first so that the spark for 'f' has a chance to run in parallel.
As a bonus question: what is the difference between `seq` and `pseq`?
x `pseq` y guarantees to evaluate x before y. There is no such guarantee with x `seq` y; the only guarantee with `seq` is that x `seq` y will be _|_ if x is. -Brent
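
To make the pattern above concrete, here is a minimal sketch of the usual parallel Fibonacci example built around f `par` e `pseq` f + e. The function names and the cutoff of 25 are made up for illustration; it assumes par and pseq from Control.Parallel, a -threaded build, and something like +RTS -N2 at run time.

    import Control.Parallel (par, pseq)

    -- Plain sequential Fibonacci, used below the cutoff.
    fib :: Int -> Integer
    fib 0 = 0
    fib 1 = 1
    fib n = fib (n - 1) + fib (n - 2)

    -- Parallel version using the  f `par` e `pseq` f + e  pattern: spark 'f',
    -- make the main thread work on 'e', and only then demand f + e, so the
    -- spark for 'f' has a chance to be picked up by another capability.
    pfib :: Int -> Integer
    pfib n
      | n < 25    = fib n
      | otherwise = f `par` (e `pseq` f + e)
      where
        f = pfib (n - 1)
        e = pfib (n - 2)

    main :: IO ()
    main = print (pfib 35)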

This could be a way to parallelize code (which would prevent such mistakes):

    newtype Par a = Par { doPar :: a }

    instance Functor Par where
      fmap = liftA

    instance Applicative Par where
      pure = Par
      Par f <*> Par x = Par $ f `par` x `pseq` f x

then instead of:

    fun `par` arg1 `par` arg2 `pseq` fun arg1 arg2

you can write:

    doPar $ fun <$> pure arg1 <*> pure arg2

Sjoerd
-- Sjoerd Visscher sjoerd@w3future.com
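
For anyone who wants to try Sjoerd's instance out, here is a self-contained sketch with a toy workload; the imports, sumTo, and main are assumptions added for the example, not part of the original message.

    import Control.Applicative (liftA)
    import Control.Parallel (par, pseq)

    newtype Par a = Par { doPar :: a }

    instance Functor Par where
      fmap = liftA

    instance Applicative Par where
      pure = Par
      Par f <*> Par x = Par $ f `par` x `pseq` f x

    -- A placeholder for some real work worth doing in parallel.
    sumTo :: Integer -> Integer
    sumTo n = sum [1 .. n]

    -- Each <*> sparks the function side while the main thread evaluates the
    -- argument side, so the two sums below can proceed in parallel.
    main :: IO ()
    main = print (doPar $ (+) <$> pure (sumTo 10000000) <*> pure (sumTo 20000000))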

Thanks for the great response!
Brent Yorgey wrote:
x `pseq` y guarantees to evaluate x before y. There is no such guarantee with x `seq` y; the only guarantee with `seq` is that x `seq` y will be _|_ if x is.
I found an old thread here http://www.mail-archive.com/glasgow-haskell-users@haskell.org/msg11022.html where Simon states:

[quote]
Indeed, if GHC was in the habit of causing the second argument of seq to be evaluated before the first, then a lot of people would probably be surprised. eg. imagine what happens to foldl':

    foldl' f a []     = a
    foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs

It wouldn't do what you want at all.
[/quote]

So... it seems foldl' relies on `seq` having an unstated evaluation order in GHC. What guarantees does foldl' have in turn? Semantic only, or operational? Shouldn't it be written using `pseq`?

It seems I have always used (this `seq` that) when I meant (this `before` that). Is it time to revisit my code and use `pseq` more? What does Haskell' say about this?

-- Gracjan

On Saturday 05 September 2009 9:13:50 am Gracjan Polak wrote:
[quote] Indeed, if GHC was in the habit of causing the second argument of seq to be evaluated before the first, then a lot of people would probably be surprised. eg. imagine what happens to foldl':
    foldl' f a []     = a
    foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs
It wouldn't do what you want at all. [/quote]
So... it seems foldl' relies on `seq` having an unstated evaluation order in GHC. What guarantees does foldl' have in turn? Semantic only, or operational? Shouldn't it be written using `pseq`?
It seems I have always used (this `seq` that) when I meant (this `before` that). Is it time to revisit my code and use `pseq` more? What does Haskell' say about this?
I suppose technically, what foldl' has over foldl is that it is more readily subject to optimization. Each recursive call is artificially made strict in the accumulator, so it is legal for GHC to optimize the function by keeping the accumulator evaluated, instead of delaying it. When GHC is run with optimizations on, it does analysis on all code that tries to determine such things, and seq can be seen as making such analysis easier for the compiler.

This is, of course, not what really happens in GHC. What really happens is that the first argument to seq is evaluated before the second (which is why it even has the intended effect when optimizations aren't on). But that doesn't have to be the case, strictly speaking.

-- Dan
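
A quick way to see the practical difference between foldl and foldl' (just a sketch; the input size is arbitrary and the exact behaviour depends on GHC version and optimisation flags):

    import Data.List (foldl')

    -- foldl delays every (+), building a deep chain of thunks that is only
    -- collapsed at the end; foldl' uses seq on the accumulator at each step,
    -- so the sum is kept evaluated and the fold runs in constant space.
    lazySum, strictSum :: [Integer] -> Integer
    lazySum   = foldl  (+) 0
    strictSum = foldl' (+) 0

    main :: IO ()
    main = do
      print (strictSum [1 .. 10000000])   -- fine
      print (lazySum   [1 .. 10000000])   -- may overflow the stack without -O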

On Sat, Sep 5, 2009 at 7:57 PM, Dan Doel wrote:
I suppose technically, what foldl' has over foldl is that it is more readily subject to optimization. Each recursive call is artificially made strict in the accumulator, so it is legal for GHC to optimize the function by keeping the accumulator evaluated, instead of delaying it. When GHC is run with optimizations on, it does analysis on all code that tries to determine such things, and seq can be seen as making such analysis easier for the compiler.
It turns out, pseq limits the effectiveness of strictness analysis, because it forces the order of evaluation. John Meacham described this pretty well last week in the Haskell' list http://www.haskell.org/pipermail/haskell-prime/2009-August/003006.html.
This is, of course, not what really happens in GHC. What really happens is that the first argument to seq is evaluated before the second (which is why it even has the intended effect when optimizations aren't on). But that doesn't have to be the case, strictly speaking.
It's entirely possible for optimized code to end up evaluating the second argument to seq before the first.
--
Dave Menendez
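
A tiny example of the kind of thing John's message is about (an illustration of the idea, not code from his post): both functions below are semantically strict in both arguments, but the pseq version also pins down the evaluation order, leaving the compiler less freedom at call sites.

    import Control.Parallel (pseq)

    -- seq only says "both x and y are needed"; the strictness analyser may
    -- arrange to evaluate y before x at a call site if that is convenient.
    withSeq :: Int -> Int -> Int
    withSeq x y = x `seq` (x + y)

    -- pseq additionally promises that x is evaluated before y, which rules
    -- out that kind of reordering.
    withPseq :: Int -> Int -> Int
    withPseq x y = x `pseq` (x + y)

    main :: IO ()
    main = print (withSeq 2 3, withPseq 2 3)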

On Sunday 06 September 2009 2:18:31 am David Menendez wrote:
On Sat, Sep 5, 2009 at 7:57 PM, Dan Doel wrote:
I suppose technically, what foldl' has over foldl is that it is more readily subject to optimization. Each recursive call is artificially made strict in the accumulator, so it is legal for GHC to optimize the function by keeping the accumulator evaluated, instead of delaying it. When GHC is run with optimizations on, it does analysis on all code that tries to determine such things, and seq can be seen as making such analysis easier for the compiler.
It turns out, pseq limits the effectiveness of strictness analysis, because it forces the order of evaluation. John Meacham described this pretty well last week in the Haskell' list http://www.haskell.org/pipermail/haskell-prime/2009-August/003006.html.
Interesting. I hadn't thought of this before, but it certainly makes sense.
This is, of course, not what really happens in GHC. What really happens is that the first argument to seq is evaluated before the second (which is why it even has the intended effect when optimizations aren't on). But that doesn't have to be the case, strictly speaking.
It's entirely possible for optimized code to end up evaluating the second argument to seq before the first.
Yeah, I suppose I should have been more precise. In the absence of optimizations, I assume seq translates to core something like:

    seq x y = case x of { z -> y }

where the core version of case evaluates things regardless of patterns and such. That explains why foldl' works even if you don't compile it with optimizations. But yes, it wouldn't surprise me if the optimizer rearranged the above, since the case statement is a no-op other than forcing x, and it might see ways to better evaluate things without altering the results. By contrast, pseq would be like the above, but the optimizer would be unable to rewrite it.

Perhaps that's still misleading, though. The difference between them is rather like the difference between:

    xor False False = False
    xor False True  = True
    xor True  False = True
    xor True  True  = False

(verbose definition meant to emphasize the symmetry of the arguments) and

    or True  _ = True
    or False b = b

xor is strict in both its arguments, whereas or is strict in its first argument, and only strict in its second argument if its first argument is False. Similarly, seq is meant to be strict in both its arguments, whereas pseq needs to be strict in its first argument, and only strict in its second argument if its first argument is non-bottom.

-- Dan
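
For the curious: the way GHC keeps the optimizer from rewriting pseq is, roughly, the lazy identity function from GHC.Exts, which the strictness analyser does not look through. A sketch of the idea, not a copy of the library source:

    import GHC.Exts (lazy)

    -- 'lazy' is semantically the identity, but the analyser does not look
    -- through it, so the compiler never concludes that y can be evaluated
    -- before x here.  Something along these lines is how pseq gets its
    -- ordering guarantee on top of seq.
    myPseq :: a -> b -> b
    myPseq x y = x `seq` lazy y

    main :: IO ()
    main = print (myPseq (1 :: Int) "ok")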

Dan Doel wrote:
On Sunday 06 September 2009 2:18:31 am David Menendez wrote:
It turns out, pseq limits the effectiveness of strictness analysis, because it forces the order of evaluation. John Meacham described this pretty well last week in the Haskell' list http://www.haskell.org/pipermail/haskell-prime/2009-August/003006.html.
Interesting. I hadn't thought of this before, but it certainly makes sense.
Thanks to all of you! This thread is fascinating! :) I used `seq` to duct-tape my space leaks and stack-overflow issues, and now went looking for `pseq` for parallelism. I think I understand enough to use both reasonably. Thanks!

-- Gracjan
participants (5)
- Brent Yorgey
- Dan Doel
- David Menendez
- Gracjan Polak
- Sjoerd Visscher