
#16040: Unboxing-Related Performance Issue with Polymorphic Functions -------------------------------------+------------------------------------- Reporter: _recursion | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.6.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Old description:
My team has observed a 2x performance degradation in code that makes use of `StateT` that appears to be related to strictness and unboxing, even when built with `-O2`. Our code makes heavy use of the state monad, and when GHC fails to optimise this performance
We've managed to minimise the behaviour to the following reproducer (that ignores state entirely), consisting of two files. It depends on `criterion`
`Lib.hs`:
{{{#!hs {-# LANGUAGE BangPatterns #-}
module Lib where
-- A type to take the place of state data X a = X { runX :: !a }
test1 :: Int -> Int test1 = \(!i) -> go i where go = \(!i) -> if i > 0 then go $! i - 1 else i {-# NOINLINE test1 #-}
test2 :: Int -> Int test2 = \(!i) -> runX (go i) where go = \(!i) -> if i > 0 then go $! i - 1 else X i {-# NOINLINE test2 #-} }}}
`Main.hs`:
{{{#!hs {-# LANGUAGE Strict #-} module Main where
import Lib import Criterion import Criterion.Main
main :: IO () main = defaultMain [ bgroup "main" [ bgroup "small" [ bench "without state" $ whnf test1 100000000 , bench "with state" $ whnf test2 100000000 ] ] ] }}}
Run as above, the code takes twice as long to execute `test2` as it does `test1`. However, when the signature for `runX` is changed to `runX :: !Int`, the code in `test2` exhibits identical performance to `test1`.
It has been reproduced across multiple linux64 machines, but not tested on any other architecture or operating system.
Please find the full (stack - yes, I know) project as an attachment. You can simply `stack run` to observe the issue. If you have any further queries please let me know.
New description: My team has observed a 2x performance degradation in code that makes use of `StateT` that appears to be related to strictness and unboxing, even when built with `-O2`. Our code makes heavy use of the state monad, and when GHC fails to optimise this usage, the performance becomes untenably slow. We've managed to minimise the behaviour to the following reproducer (that ignores state entirely), consisting of two files. It depends on `criterion` `Lib.hs`: {{{#!hs {-# LANGUAGE BangPatterns #-} module Lib where -- A type to take the place of state data X a = X { runX :: !a } test1 :: Int -> Int test1 = \(!i) -> go i where go = \(!i) -> if i > 0 then go $! i - 1 else i {-# NOINLINE test1 #-} test2 :: Int -> Int test2 = \(!i) -> runX (go i) where go = \(!i) -> if i > 0 then go $! i - 1 else X i {-# NOINLINE test2 #-} }}} `Main.hs`: {{{#!hs {-# LANGUAGE Strict #-} module Main where import Lib import Criterion import Criterion.Main main :: IO () main = defaultMain [ bgroup "main" [ bgroup "small" [ bench "without state" $ whnf test1 100000000 , bench "with state" $ whnf test2 100000000 ] ] ] }}} Run as above, the code takes twice as long to execute `test2` as it does `test1`. However, when the signature for `runX` is changed to `runX :: !Int`, the code in `test2` exhibits identical performance to `test1`. It has been reproduced across multiple linux64 machines, but not tested on any other architecture or operating system. Please find the full (stack - yes, I know) project as an attachment. You can simply `stack run` to observe the issue. If you have any further queries please let me know. -- Comment (by _recursion): Unfinished sentence in the report because I am sometimes unobservant. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/16040#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler