
I'm not sure which exact optimizations are responsible, but based on
--ddump-simple,
* "inside" is not allocating any lists at all. It's just a couple loops
over unboxed ints
* "outside" is actually allocating a (single) list data structure and has
an inner loop and an outer loop, both of which traverse the list
GHC seems to be too aggressive about sharing "range" in "outside". Adding a
unit argument to "range" makes both functions go fast.
On Mon, Aug 19, 2019 at 8:14 PM Emilio Francesquini
Hello Cafe,
While investigating a performance problem I stumbled upon what I eventually reduced to the example below:
module Main where
import Data.Time.Clock
outside :: Int -> Int outside n = sum [i + j | i <- range, j <- range] where range = [0..n-1]
inside :: Int -> Int inside n = sum [i + j | i <- [0..n-1], j <- [0..n-1]]
main :: IO () main = do t0 <- getCurrentTime print $ inside 10000 t1 <- getCurrentTime print $ outside 10000 t2 <-getCurrentTime
print (diffUTCTime t1 t0) print (diffUTCTime t2 t1)
Compiling with -O2, up to GHC 8.2.2, both `inside` and `outside` functions would take the same amount of time to execute. Somewhere between GHC 8.2.2 and 8.6.4 something changed (possibly some new optimization) making `inside` run ~4x faster on my machine. With LLVM the difference is even bigger.
It is not that `outside` got slower, but that `inside` got much faster. I'm curious to what optimizations might be happening to the `inside` function that would not fire on the outside function.
Any hints?
Best regards,
Emilio
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.