To give an idea of why CPS transform can sometimes be more efficient, consider the following two approaches to a "safe divide" function, which check if the denominator is 0:
divMay :: Double -> Double -> Maybe Double
divMay x y
| y == 0 = Nothing
| otherwise = Just (x / y)
divCPS :: Double -> Double -> a -> (Double -> a) -> a
divCPS x y onFail onSuccess
| y == 0 = onFail
| otherwise = onSuccess (x / y)
Presumably, divMay will always force the allocation of a new object when y is not 0, through its usage of Just. divCPS, on the other hand, does not need to perform any allocation. I say presumably, because often times the compiler will be able to optimize this. To see evidence of that, I've put together a small benchmark[1], If I compile with optimizations turned of (-O0), I get results showing that CPS is faster. And looking at the core[2], we can see that divMay really does result in an extra allocation:
$ ghc-core --no-cast -- -O0 foo.hs
divMay'_r454 :: Double -> Double
divMay'_r454 =
\ (y_a4hf :: Double) ->
Data.Maybe.fromMaybe
@ Double
(D# 0.0)
(case ==
@ Double $fEqDouble y_a4hf (D# 0.0)
of _ [Occ=Dead] {
False ->
Data.Maybe.Just
@ Double
(/
@ Double
$fFractionalDouble
(D# 10.0)
y_a4hf);
True -> Data.Maybe.Nothing @ Double
})
divCPS'_r455 :: Double -> Double
divCPS'_r455 =
\ (y_a4hg :: Double) ->
case ==
@ Double $fEqDouble y_a4hg (D# 0.0)
of _ [Occ=Dead] {
False ->
id
@ Double
(/
@ Double
$fFractionalDouble
(D# 10.0)
y_a4hg);
True -> D# 0.0
}
Yes, core has a lot of stuff going on due to the explicit type signatures, the "magic hash" unboxed types showing up, etc. But as you can see, there's a call to Data.Maybe.Just in divMay' that is not present in divCPS'.
Now let's explain that "presumably" comment. GHC is really smart, and if you let it, it will often times optimize away these kinds of things, especially in small examples like this. In this case, the benchmark results show up as almost identical between the two versions, which is hardly surprising when you look at the core:
$ ghc-core --no-cast -- -O2 foo.hs
divMay'_r4ci :: Double -> Double
divMay'_r4ci =
\ (y_a4oE :: Double) ->
case y_a4oE of _ [Occ=Dead] { D# x_a4Ml ->
case x_a4Ml of wild1_X10 {
__DEFAULT ->
case /## 10.0 wild1_X10 of wild2_a4Rj { __DEFAULT ->
D# wild2_a4Rj
};
0.0 -> main9
}
}
divCPS'_r4cj :: Double -> Double
divCPS'_r4cj =
\ (y_a4oF :: Double) ->
case y_a4oF of _ [Occ=Dead] { D# x_a4Ml ->
case x_a4Ml of wild1_X15 {
__DEFAULT ->
case /## 10.0 wild1_X15 of wild2_a4Rj { __DEFAULT ->
D# wild2_a4Rj
};
0.0 -> main9
}
}
If you look past the different variable names, you'll see that the two functions are in fact identical. So to sum up this (longer than expected) email:
* CPS allows you to avoid extra allocations,
* but often times GHC can perform that optimization for you itself.
You can look at the attoparsec codebase to see its usage of CPS, which is a more complicated usage of this same concept. Continuation passing web frameworks are a totally different story though, and have been covered pretty well elsewhere (probably best by Alberto[3], though others may have different links).
[2] Core is a lower-level form that GHC compiles code to before generating machine code. It gives a very good idea of what optimizations have been applied. I used the ghc-core tool for this. Gabriel Gonzalez wrote a nice blog post about this a few years back:
http://www.haskellforall.com/2012/10/hello-core.html
On Thu Jan 01 2015 at 6:00:12 AM Cary Cherng <
ccherng@gmail.com> wrote:
I read (http://en.wikibooks.org/wiki/Haskell/Continuation_passing_style)
that in some circumstances, CPS can be used to improve performance by
eliminating certain construction-pattern matching sequences (i.e. a
function returns a complex structure which the caller will at some
point deconstruct). And that attoparsec is an example of this.
I don't see exactly how CPS gives rise to concrete examples of
performance gains. Moreover how does this arise in parsing for example
with attoparsec as mentioned.
I also encountered various web frameworks such as mflow that are based
on continuations. How a typical http restful system is made into
something based around continuations is not something that is obvious
to me. And what is gained from that?
_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners