Does/How does GHC/Haskell optimize this ('almost identical associative expressions', pattern matching reordering)

19 Dec 2014

      Hi all,

a) I've learned that Haskell uses graphs (graph reduction) to do
optimizations / 'sharing'. Calculation example:

--    (1 + 2) + (1 + 2)
-- ~> 3 + 3
-- ~> 6

in contrast to

--    (1 + 2) + (3 + 0)
-- ~> 3 + (3 + 0)
-- ~> 3 + 3
-- ~> 6

What does it do in a case like:

... min 0 1 ... min 1 0 ...,

or: (1 + 2) + (2 + 1)

where a function definition is used which has the same result
in both cases, but arguments are flipped. This is a simple example
for a very general problem.
What does the underlying system do about it?
(a1) Can it detect and optimize it? In the example it would be easy to 
reduce
it to the relations that define the min function. Is GHC able to detect
it? (a2) Is there a super language to Haskell, like an speci. assembler in
other paradigms, that could split code into even smaller pieces?
(a3) What happens at run-time, for example: ... min a b ... min b a ...?

b) About 'pattern matching': Does Haskell reorder
patterns for optimization? How/what technique does it use there? Example:

...
f [] = ...
f (a : as) = ... f ( b(a) ) ...
...

VS.

...
f (a : as) = ... f ( b(a) ) ...
f [] = ...
...

If Haskell does pattern matching like I've been told: top down, the first
definition would statistically be inefficient, because recursive 
functions are
normally called with the intension to do some (recursive) calculations. 
I tested (*)
it (without optimization). What do I need to know to understand this 
behavior?

* Tests were (+RTS -<test> -RTS):

-------------------- snip --------------------
module Test where

f1 :: String -> String
f1 [] = []
f1 [a1] = [a1]
f1 [a1,a2] = [a1,a2]
f1 [a1,a2,a3] = [a1,a2,a3]
f1 [a1,a2,a3,a4] = [a1,a2,a3,a4]
f1 [a1,a2,a3,a4,a5] = [a1,a2,a3,a4,a5]
f1 [a1,a2,a3,a4,a5,a6] = [a1,a2,a3,a4,a5,a6]
f1 (a1 : a2 : a3 : a4 : a5 : a6 : a7 : as) = f1 as

f2 :: String -> String
f2 (a1 : a2 : a3 : a4 : a5 : a6 : a7 : as) = f2 as
f2 [a1,a2,a3,a4,a5,a6] = [a1,a2,a3,a4,a5,a6]
f2 [a1,a2,a3,a4,a5] = [a1,a2,a3,a4,a5]
f2 [a1,a2,a3,a4] = [a1,a2,a3,a4]
f2 [a1,a2,a3] = [a1,a2,a3]
f2 [a1,a2] = [a1,a2]
f2 [a1] = [a1]
f2 [] = []

p1 :: IO ()
p1 =  print $ f1 [' ' | _ <-[1 .. 10^9] ]   -- 63.87 secs

p2 :: IO ()
p2 =  print $ f2 [' ' | _ <-[1 .. 10^9] ]   -- 62.68 secs
-------------------- snip --------------------

Pascal Knodel

tags

participants (1)