Hello Cafe,
While investigating a performance problem I stumbled upon what I eventually reduced to the example below:
module Main where
import Data.Time.Clock
outside :: Int -> Int
outside n =
  sum [i + j | i <- range, j <- range]
  where
    range = [0..n-1]
inside :: Int -> Int
inside n =
  sum [i + j | i <- [0..n-1], j <- [0..n-1]]
main :: IO ()
main = do
  t0 <- getCurrentTime
  print $ inside 10000
  t1 <- getCurrentTime
  print $ outside 10000
  t2 <-getCurrentTime
  print (diffUTCTime t1 t0)
  print (diffUTCTime t2 t1)
Compiling with -O2, up to GHC 8.2.2, both `inside` and `outside` functions would take the same amount of time to execute. Somewhere between GHC 8.2.2 and 8.6.4 something changed (possibly some new optimization) making `inside` run ~4x faster on my machine. With LLVM the difference is even bigger.
It is not that `outside` got slower, but that `inside` got much faster. I'm curious to what optimizations might be happening to the `inside` function that would not fire on the outside function.
Any hints?
Best regards,
Emilio