
#13001: EnumFromThenTo is is not a good producer -------------------------------------+------------------------------------- Reporter: George | Owner: Type: bug | Status: new Priority: low | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): It's true that doing more inlining will generally improve things, but we could be a bit more forensic about this. Just sprinkling more inline pragmas, some of which may do no good, is a bit of a blunt instrument. I checked the code for `testFromThenTo`, compiled with -O, and got {{{ Foo.$wtestFromThenTo = \ (ww_s2vE :: GHC.Prim.Int#) -> case GHC.Prim.tagToEnum# @ Bool (GHC.Prim.<# ww_s2vE 0#) of { False -> case ww_s2vE of wild2_a2uk { __DEFAULT -> case GHC.Real.$wf1 10# wild2_a2uk of ww4_a2uE { __DEFAULT -> GHC.Enum.efdtIntUpFB @ (Int -> Int) (GHC.List.lengthFB @ Int) (id @ Int) 0# 2# ww4_a2uE Foo.testFromThenTo2 }; 0# -> Foo.testFromThenTo1 }; True -> GHC.Real.^2 } }}} Fusion has happened (there is a use of the `foldr/build` rule), but the call to `edftIntUpFB` remains. That's not necessarily bad. As you'll see, it's not a tiny function: {{{ efdtIntUpFB :: (Int -> r -> r) -> r -> Int# -> Int# -> Int# -> r efdtIntUpFB c n x1 x2 y -- Be careful about overflow! | isTrue# (y <# x2) = if isTrue# (y <# x1) then n else I# x1 `c` n | otherwise = -- Common case: x1 <= x2 <= y let !delta = x2 -# x1 -- >= 0 !y' = y -# delta -- x1 <= y' <= y; hence y' is representable -- Invariant: x <= y -- Note that: z <= y' => z + delta won't overflow -- so we are guaranteed not to overflow if/when we recurse go_up x | isTrue# (x ># y') = I# x `c` n | otherwise = I# x `c` go_up (x +# delta) in I# x1 `c` go_up x2 }}} BUT the bad thing is that it allocated loads of `Int` values! Why does it do that? Becuase the function passed to it (`c` in the above definition) takes an `Int` as its argument. So if `efdtIntUpFB` isn't inlined, it ''must'' allocate an `Int` box for every iteration. Bad! But this is an internal function. Suppose we gave it this signature: {{{ efdtIntUpFB :: (Int# -> r -> r) -> r -> Int# -> Int# -> Int# -> r -- ^^^ NB Int# not Int }}} Now it won't allocate! Can we make the rest of the pieces fit together now? Would you like to have a try? I also looked at the call to `lengthFB` in the above optimised code. It's defined like this: {{{ lengthFB :: x -> (Int -> Int) -> Int -> Int lengthFB _ r = \ !a -> r (a + 1) }}} Uh-ho! More `Int` boxes! So I tried rewriting the `length` moving parts (in `GHC.List`) like this: {{{ {-# INLINE xlength #-} xlength :: [a] -> Int xlength xs = I# (xlenAcc xs 0#) xlenAcc :: [a] -> Int# -> Int# xlenAcc [] n = n xlenAcc (_:ys) n = xlenAcc ys (n +# 1#) {-# RULES "xlength" [~1] forall xs . xlenAcc xs = foldr xlengthFB idXlength xs "xlengthList" [1] foldr xlengthFB idXlength = xlenAcc #-} -- The lambda form turns out to be necessary to make this inline -- when we need it to and give good performance. {-# INLINE [0] xlengthFB #-} xlengthFB :: x -> (Int# -> Int#) -> Int# -> Int# xlengthFB _ r = \ a -> r (a +# 1#) {-# INLINE [0] idXlength #-} idXlength :: Int# -> Int# idXlength x = x }}} That compiles fine. Even if it generates the same code as before, GHC will have to do less work to optimise it, so it's a win. Would you like to try that change to `length` and see if it is an improvement? Maybe you can do the same for `efdtIntUpFB`? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13001#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler