Optimizations for list comprehension

19 Aug 2019

      Hello Cafe,

While investigating a performance problem I stumbled upon what I eventually
reduced to the example below:

module Main where

import Data.Time.Clock

outside :: Int -> Int
outside n =
  sum [i + j | i <- range, j <- range]
  where
    range = [0..n-1]

inside :: Int -> Int
inside n =
  sum [i + j | i <- [0..n-1], j <- [0..n-1]]

main :: IO ()
main = do
  t0 <- getCurrentTime
  print $ inside 10000
  t1 <- getCurrentTime
  print $ outside 10000
  t2 <-getCurrentTime

  print (diffUTCTime t1 t0)
  print (diffUTCTime t2 t1)

Compiling with -O2, up to GHC 8.2.2, both `inside` and `outside` functions
would take the same amount of time to execute. Somewhere between GHC 8.2.2
and 8.6.4 something changed (possibly some new optimization) making
`inside` run ~4x faster on my machine. With LLVM the difference is even
bigger.

It is not that `outside` got slower, but that `inside` got much faster. I'm
curious to what optimizations might be happening to the `inside` function
that would not fire on the outside function.

Any hints?

Best regards,

Emilio

Emilio Francesquini

William Yager

Emilio Francesquini

Niklas Hambüchen

Claude Heiland-Allen

Simon Peyton Jones

tags

participants (5)