Hello Cafe,

While investigating a performance problem I stumbled upon what I eventually reduced to the example below:

module Main where

import Data.Time.Clock

outside :: Int -> Int
outside n =
  sum [i + j | i <- range, j <- range]
  where
    range = [0..n-1]

inside :: Int -> Int
inside n =
  sum [i + j | i <- [0..n-1], j <- [0..n-1]]

main :: IO ()
main = do
  t0 <- getCurrentTime
  print $ inside 10000
  t1 <- getCurrentTime
  print $ outside 10000
  t2 <-getCurrentTime

  print (diffUTCTime t1 t0)
  print (diffUTCTime t2 t1)


Compiling with -O2, up to GHC 8.2.2, both `inside` and `outside` functions would take the same amount of time to execute. Somewhere between GHC 8.2.2 and 8.6.4 something changed (possibly some new optimization) making `inside` run ~4x faster on my machine. With LLVM the difference is even bigger.

It is not that `outside` got slower, but that `inside` got much faster. I'm curious to what optimizations might be happening to the `inside` function that would not fire on the outside function.

Any hints?

Best regards,

Emilio