Re: [Haskell-cafe] GHC magic optimization ?

5 Dec 2009


      Fixing my errors:
...
x = sum [1..10^6] + product [1..10^6]
x' = let l = [1..10^6] in sum l + product l
-- Define:
bar m n = foo (enumFromTo m n)
foo xs  = sum xs + prod xs

-- We're given:
sum = foldl (+) 0
product = foldl (*) 1
foldl f z xs =
  case xs of
    [] -> []
    x:xs -> foldl f (f z x) xs
enumFromTo m n =
  case n < m of
    True -> []
    False -> m : enumFromTo (m+1) n

-- The fused loop becomes:
foo xs = go0 0 1 xs
  where go0 a b xs =
          case xs of
            [] -> a+b
            x:xs -> go0 (a+x) (b*x) xs

-- Now inline foo in bar:
bar m n = go2 0 1 m n
  where go2 a b m n = go0 a b (go1 m n)
        go0 a b xs =
          case xs of
            [] -> a+b
            x:xs -> go0 (a+x) (b*x) xs
        go1 m n =
          case m < n of
            True -> []
            False -> m : go1 (m+1) n

-- considering go2
go2 a b m n = go0 a b (go1 m n)

    ==> case (go1 m n) of
          [] -> a+b
           x:xs -> go0 (a+x) (b*x) xs

    ==> case (case n < m of
               True -> []
               False -> m : go1 (m+1) n) of
          [] -> a+b
          x:xs -> go0 (a+x) (b*x) xs

    ==> case n < m of
          True -> case [] of
                    [] -> a+b
                    x:xs -> go0 (a+x) (b*x) xs

          False -> case (m : go1 (m+1) n) of
                     [] -> a+b
                     x:xs -> go0 (a+x) (b*x) xs

    ==> case n < m of
          True -> a+b
          False -> go0 (a+m) (b*m) (go1 (m+1) n)

-- So,
go2 a b m n =
  case n < m of
    True -> a+b
    False -> go0 (a+m) (b*m) (go1 (m+1) n)

-- And by the original def of go2
go2 a b m n = go0 a b (go1 m n)

-- We get
go2 a b m n =
  case m < n of
    True -> a+b
    False -> go2 (a+m) (b*m) (m+1) n

-- go0 and go1 and now dead in bar
bar m n = go2 0 1 m n
  where go2 a b m n =
          case n < m of
            True -> a+b
            False -> go2 (a+m) (b*m) (m+1) n

-- (furthermore, if (+) here is for Int/Double etc,
-- we can reduce go2 further to operate on machine
-- ints/doubles and be a register-only non-allocating loop)

-- So now finally returning to our original code:
...
x = sum [1..10^6] + product [1..10^6]
x' = let l = [1..10^6] in sum l + product l
-- We get:
x' = bar 1 (10^6)

Matt


On 12/4/09, Matt Morrow  wrote:
...
Although, in Luke's example,
...
x = sum [1..10^6] + product [1..10^6]
x' = let l = [1..10^6] in sum l + product l
We can do much much better, if we're sufficiently smart.
-- Define:
bar m n = foo (enumFromTo m n)
foo xs  = sum xs + prod xs
-- We're given:
sum = foldl (+) 0
product = foldl (*) 1
foldl f z xs =
  case xs of
    [] -> []
    x:xs -> foldl f (f z x) xs
enumFromTo m n =
  case m < n of
    True -> []
    False -> m : enumFromTo (m+1) n
-- The fused loop becomes:
foo xs = go0 0 1 xs
  where go0 a b xs =
          case xs of
            [] -> a+b
            x:xs -> go0 (a+x) (b*x) xs
-- Now inline foo in bar:
bar m n = go2 0 1 m n
  where go2 = go0 a b (go1 m n)
        go0 a b xs =
          case xs of
            [] -> a+b
            x:xs -> go0 (a+x) (b*x) xs
        go1 m n =
          case m < n of
            True -> []
            False -> m : go1 (m+1) n
-- considering go2
go2 = go0 a b (go1 m n)
==> case (go1 m n) of
          [] -> a+b
           x:xs -> go0 (a+x) (b*x) xs
==> case (case m < n of
               True -> []
               False -> m : go1 (m+1) n) of
          [] -> a+b
          x:xs -> go0 (a+x) (b*x) xs
==> case m < n of
          True -> case [] of
                    [] -> a+b
                    x:xs -> go0 (a+x) (b*x) xs
False -> case (m : go1 (m+1) n) of
                     [] -> a+b
                     x:xs -> go0 (a+x) (b*x) xs
==> case m < n of
          True -> a+b
          False -> go0 (a+m) (b*m) (go1 (m+1) n)
-- So,
go2 = case m < n of
        True -> a+b
        False -> go0 (a+m) (b*m) (go1 (m+1) n)
-- And by the original def of go2
go2 = go0 a b (go1 m n)
-- We get
go2 = case m < n of
        True -> a+b
        False -> go2 (a+m) (b*m) (m+1) n
-- go0 and go1 and now dead in bar
bar m n = go2 0 1 m n
  where go2 = case m < n of
                True -> a+b
                False -> go2 (a+m) (b*m) (m+1) n
-- (furthermore, if (+) here is for Int/Double etc,
-- we can reduce go2 further to operate on machine
-- ints/doubles and be a register-only non-allocating loop)
-- So now finally returning to our original code:
...
x = sum [1..10^6] + product [1..10^6]
x' = let l = [1..10^6] in sum l + product l
-- We get:
x' = bar 1 (10^6)
And the intermediate list never exists at all.
Matt
On 12/4/09, Luke Palmer  wrote:
...
On Fri, Dec 4, 2009 at 3:36 AM, Neil Brown  wrote:
...
But let's say you have:
g x y = f x y * f x y
Now the compiler (i.e. at compile-time) can do some magic.  It can spot
the
common expression and know the result of f x y must be the same both
times,
so it can convert to:
g x y = let z = f x y in z * z
GHC does *not* do this by default, quite intentionally, even when
optimizations are enabled.  The reason is because it can cause major
changes in the space complexity of a program.  Eg.
x = sum [1..10^6] + product [1..10^6]
x' = let l = [1..10^6] in sum l + product l
x runs in constant space, but x' keeps the whole list in memory.  The
CSE here has actually wasted both time and space, since it is harder
to save [1..10^6] than to recompute it!  (Memory vs. arithmetic ops)
So GHC leaves it to the user to specify sharing.  If you want an
expression shared, let bind it and reuse.
Luke
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe