Experiments with defunctionalization, church-encoding and CPS

I took a toy problem - find the first node satisfying a predicate in a binary tree, started with a naive Maybe-based implementation - and experimented with 3 ways of changing the program: - Church-encode the Maybe - Convert the program into CPS - Defunctionalize the Church-encoded or CPS-transformed program http://hpaste.org/fastcgi/hpaste.fcgi/view?id=10686 The link points to code, a benchmark and conclusion. Conclusion: - Haskell implements Maybe well enough that it is not possible to do better - Defunctionalization and consequent optimization yields same performance as the one with Maybe - Non-simplified CPS and Church-encoded representations do bad -- Eugene Kirpichov Web IR developer, market.yandex.ru

Eugene Kirpichov wrote:
I took a toy problem - find the first node satisfying a predicate in a binary tree, started with a naive Maybe-based implementation - and experimented with 3 ways of changing the program: - Church-encode the Maybe - Convert the program into CPS - Defunctionalize the Church-encoded or CPS-transformed program http://hpaste.org/fastcgi/hpaste.fcgi/view?id=10686
The link points to code, a benchmark and conclusion.
Conclusion: - Haskell implements Maybe well enough that it is not possible to do better - Defunctionalization and consequent optimization yields same performance as the one with Maybe - Non-simplified CPS and Church-encoded representations do bad
I'm not sure your example program is enough to justify the conclusions. The main advantage of the Church-encoding of Maybe type Maybe' a = forall b . b -> (a -> b) -> b is that the latter has the pattern match for failure / success built-in when used as a monad m >>= g = \nothing just -> m nothing (\x -> g x nothing just) VS m >>= g = case m of Nothing -> Nothing Just x -> g x The algebraic data type has to check that the result was indeed Just x whereas the Church-encoding can jump right to the success continuation. The problem is that your tree doesn't really make use of this advantage. For that, you need a completely left-biased tree, like for example Fork 1 (Fork 2 (Fork 3 ... Leaf) Leaf) Leaf testTree = fromList [1..100000] fromList = foldr (\x t -> Fork x t Leaf) Leaf I've implemented this and posted it below your version at http://hpaste.org/fastcgi/hpaste.fcgi/view?id=10686 I've used the criterion benchmarking package, so you can run this straight out of the box. Even then, the results are mixed. The Church-encoding shines in GHCi as it should, but loses its advantage when the code is being compiled. I guess we have to look at the core if we want to know what exactly is going on. PS: I'm not entirely sure I'm using criterion correctly, in particular concerning strictness. For instance, dropping the fromJust will reduce the run-time of the "data Maybe" benchmark by 75%. Comments? Regards, apfelmus -- http://apfelmus.nfshost.com

On Sun, Nov 1, 2009 at 7:12 AM, Heinrich Apfelmus
Even then, the results are mixed. The Church-encoding shines in GHCi as it should, but loses its advantage when the code is being compiled. I guess we have to look at the core if we want to know what exactly is going on.
What optimization level did you compile with?
--
Dave Menendez

David Menendez wrote:
On Sun, Nov 1, 2009 at 7:12 AM, Heinrich Apfelmus
wrote: Even then, the results are mixed. The Church-encoding shines in GHCi as it should, but loses its advantage when the code is being compiled. I guess we have to look at the core if we want to know what exactly is going on.
What optimization level did you compile with?
No optimization. From the bottom of the paste http://hpaste.org/fastcgi/hpaste.fcgi/view?id=10686#a11420 Results: In ghci, the church encodings clearly win. *Main> force testTree () (1.63 secs, 13948748 bytes) *Main> test findMaybe 0 100000 (1.70 secs, 15026356 bytes) *Main> test findChurch 0 100000 (0.72 secs, 15553668 bytes) *Main> test findChurch' 0 100000 (0.71 secs, 13456600 bytes) But when compiling, the algebraic data types wins. ghc --make Test.hs data Maybe mean: 90.08407 ms, lb 86.36974 ms, ub 94.13646 ms Church encoding mean: 152.0198 ms, lb 144.0916 ms, ub 161.0063 ms Church encoding optimised mean: 114.0715 ms, lb 107.3498 ms, ub 122.6881 ms I am a bit surprised. Then again, I probably shouldn't be surprised that the cost model is not like I imagine it to be. Regards, apfelmus -- http://apfelmus.nfshost.com

On Sun, Nov 1, 2009 at 12:31 PM, Heinrich Apfelmus
David Menendez wrote:
On Sun, Nov 1, 2009 at 7:12 AM, Heinrich Apfelmus
wrote: Even then, the results are mixed. The Church-encoding shines in GHCi as it should, but loses its advantage when the code is being compiled. I guess we have to look at the core if we want to know what exactly is going on.
What optimization level did you compile with?
No optimization.
Do you get the same results if you compile with -O2? Or does that
screw up criterion somehow?
--
Dave Menendez

David Menendez wrote:
Heinrich Apfelmus wrote:
David Menendez wrote:
Heinrich Apfelmus wrote:
Even then, the results are mixed. The Church-encoding shines in GHCi as it should, but loses its advantage when the code is being compiled. I guess we have to look at the core if we want to know what exactly is going on.
What optimization level did you compile with? No optimization.
Do you get the same results if you compile with -O2? Or does that screw up criterion somehow?
The results for the Church encodings are mostly the same, but now the algebraic data type is much faster. touch Test.hs; ghc --make Test.hs -o Test-O2 -O2 ./Test-O2 data Maybe mean: 23.99268 ms, lb 23.64836 ms, ub 24.43737 ms, ci 0.950 Church encoding mean: 146.0500 ms, lb 138.6242 ms, ub 154.0508 ms, ci 0.950 Church encoding optimised mean: 93.31060 ms, lb 90.34235 ms, ub 96.25145 ms, ci 0.950 Regards, apfelmus -- http://apfelmus.nfshost.com

On Tue, Oct 13, 2009 at 4:44 AM, Eugene Kirpichov
I took a toy problem - find the first node satisfying a predicate in a binary tree, started with a naive Maybe-based implementation - and experimented with 3 ways of changing the program: - Church-encode the Maybe - Convert the program into CPS - Defunctionalize the Church-encoded or CPS-transformed program http://hpaste.org/fastcgi/hpaste.fcgi/view?id=10686
The link points to code, a benchmark and conclusion.
Conclusion: - Haskell implements Maybe well enough that it is not possible to do better - Defunctionalization and consequent optimization yields same performance as the one with Maybe - Non-simplified CPS and Church-encoded representations do bad
You may find this collection of related papers interesting if you have not already seen them: http://lambda-the-ultimate.org/node/2423#comment-38384
participants (4)
-
David Menendez
-
Derek Elkins
-
Eugene Kirpichov
-
Heinrich Apfelmus