[GHC] #13193: Integer (gmp) performance regression?

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Keywords: | Operating System: Linux Architecture: x86_64 | Type of failure: Runtime (amd64) | performance bug Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- For a simple program that uses Integer, but actual data would fit in an Int, ghc-8.0.2 seems to have more runtime overhead than ghc-6.12.3 had: http://www.imn.htwk-leipzig.de/~waldmann/etc/mob/ I am not really sure what to make of it - an answer could be: "if you know that your Integer fits in an Int, then just use Int". -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): That's odd * Can you explain how to reproduce the performance change you see? * Can you say how big the perf change is? * Can you say why you think it's tied up with "does your Integer fit into an Int"? Thanks -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by j.waldmann): Hi. * how to reproduce: add the type declaration that prevents the defaulting {{{ a :: Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int -> ST s Int` }}} * how big the change is {{{ ghc-6.12.3: Int: 1.28 sec Integer: 1.56 sec ghc-8.0.2 : Int: 1.30 sec Integer: 2.30 sec }}} * "Integer fit in Int" - when I replace Integer by Int, the program produces the exact same result, so I assume there were no overflows. When I replace the numerals 1, -1 by something really large (+/- 1000000000000000000), the execution path should be identical (affine transformation). The Int result does no longer agree with the Integer result, and the Integer runtime goes up slightly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): So to paraphase: the difference between Int and Integer perf for 6.10 is smaller than 8.0. Thanks. Does anyone feel able to take a few minutes to dig into why this is the case? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by mpickering): * keywords: => newcomer -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): The allocations go up between 7.8 and 7.10, so I'm sure it has to do with the new integer-gmp library in 7.10. Will take a look. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Yes, the additional allocations do have to do with the new integer-gmp, though they aren't entirely the integer-gmp library's fault. The additional allocations occur in subtraction (in `pred`). {{{ -- | Subtract one 'Integer' from another. minusInteger :: Integer -> Integer -> Integer minusInteger x (S# 0#) = x minusInteger (S# 0#) (S# INT_MINBOUND#) = Jp# (wordToBigNat ABS_INT_MINBOUND##) minusInteger (S# 0#) (S# y#) = S# (negateInt# y#) minusInteger (S# x#) (S# y#) = case subIntC# x# y# of (# z#, 0# #) -> S# z# (# 0#, _ #) -> Jn# (wordToBigNat2 1## 0##) (# z#, _ #) | isTrue# (z# ># 0#) -> Jn# (wordToBigNat ( (int2Word# (negateInt# z#)))) | True -> Jp# (wordToBigNat ( (int2Word# z#))) -- more cases, that aren't (S# _) (S# _) }}} Based on looking at the assembly, this seems to have been compiled in 7.10.1 to something like {{{ minusInteger x y = case y of { S# y# -> case y# of { 0# -> ... INT_MINBOUND# -> ... __DEFAULT__ -> let minusY = S# (negateInt# y#) in case x of { S# x# -> ... Jp# {} -> ... Jn# {} -> ... } } Jp# {} -> ... Jn# {} -> ... } }}} Note that we allocate `S# (negateInt# y#)` even though we're unlikely to actually need it, and will never need it more than once. I see two issues here: * There are many special cases in `minusInteger`, that are almost never helpful, and harmful in the common case of small integers (i.e., `S#`). The old code which began {{{ minusInteger (S# i) (S# j) = case subIntC# i j of ... }}} was much better. But, by itself this shouldn't cause additional ''allocations'', just additional work in the form of conditionals and branches. * GHC floated out `S# (negateInt# y#)`, which was a bad choice: increased allocations, and never any gain, as far as I can see. Aha, now that I look at the rest of `minusInteger`, `(negateInt# y#)` does appear again in some branches like {{{ minusInteger (Jp# x) (S# y#) | isTrue# (y# >=# 0#) = bigNatToInteger (minusBigNatWord x (int2Word# y#)) | True = Jp# (plusBigNatWord x (int2Word# (negateInt# y#))) }}} But it's never used more than once per branch, and the `S#` box that we allocated is still not used. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Aha! the source code I was looking at changed since the version of the object code I was examining. So disregard the second point above. Phab:D2278 is new in HEAD; the old implementation was basically `x - y = x + (-y)`, which has to allocate `-y`. I think the code could still be made somewhat shorter/better, but that would be a relatively minor improvement. j.waldmann, could you test with a perf build of HEAD? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Since I spent all that time looking at the wrong version of `minusInteger` anyways, I went ahead and simlified it slightly: Phab:D3034. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Thanks for working on this. If you think that HEAD is compiling code that is sub-optimal, can you boil it out to an example? But if it is just that the original Haskell forces GHC to generate bad code, then yes let's just improve the implementation of `minusInteger`. I couldn't quite figure out which is the case from the notes above. I did try to compile the code from comment:6. For the record, here's what I compiled; and it seemed to give good code (no allocation) {{{ {-# LANGUAGE MagicHash, UnboxedTuples #-} module Foo where import Prelude () import GHC.Classes import GHC.Magic import GHC.Prim import GHC.Types #if WORD_SIZE_IN_BITS < 64 import GHC.IntWord64 #endif #define INT_MINBOUND -300 #define ABS_INT_MINBOUND 300 wordToBigNat :: Word# -> BigNat wordToBigNat x = 7 wordToBigNat2 :: Word# -> Word# -> BigNat wordToBigNat2 _ lw# = 3 data Integer = S# !Int# -- ^ iff value in @[minBound::'Int', maxBound::'Int']@ range | Jp# {-# UNPACK #-} !BigNat -- ^ iff value in @]maxBound::'Int', +inf[@ range | Jn# {-# UNPACK #-} !BigNat -- ^ iff value in @]-inf, minBound::'Int'[@ range type BigNat = Int -- | Subtract one 'Integer' from another. minusInteger :: Integer -> Integer -> Integer minusInteger x (S# 0#) = x minusInteger (S# 0#) (S# INT_MINBOUND#) = Jp# (wordToBigNat ABS_INT_MINBOUND##) minusInteger (S# 0#) (S# y#) = S# (negateInt# y#) minusInteger (S# x#) (S# y#) = case subIntC# x# y# of (# z#, 0# #) -> S# z# (# 0#, _ #) -> Jn# (wordToBigNat2 1## 0##) (# z#, _ #) | isTrue# (z# ># 0#) -> Jn# (wordToBigNat ( (int2Word# (negateInt# z#)))) | True -> Jp# (wordToBigNat ( (int2Word# z#))) -- more cases, that aren't (S# _) (S# _) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Thanks for working on this. If you think that HEAD is compiling code
#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Replying to [comment:9 simonpj]: that is sub-optimal, can you boil it out to an example? But if it is just that the original Haskell forces GHC to generate bad code, then yes let's just improve the implementation of `minusInteger`.
I couldn't quite figure out which is the case from the notes above.
Please disregard the second bullet point of comment:6. I was comparing compiled code from 7.10 with the source code from HEAD, but there was a change that I didn't know about in the source between those versions. My hope is that this change (Phab:D2278 from this past May) will bring HEAD's performance back in line with 7.8's, but I don't have a suitable setup to test that at the moment. If it does so then we can close this ticket (optionally: after landing Phab:D3034, improving the implementation of `minusInteger` slightly). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by j.waldmann):
test with perf build of HEAD
{{{ ghc-8.1.20170128 : 2.14 sec }}} so this is roughly the performance of 7.10, better than 8.0, but still not 7.8 or earlier. I also updated the data at http://www.imn.htwk-leipzig.de/~waldmann/etc/mob/ -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Interesting. Looks like your test was before Phab:D3034 landed, so maybe that will help some. There was another difference unrelated to integer-gmp when I compared the Core for 7.8 and 7.10, with something getting marked as one-shot in 7.10. I didn't investigate that closely (and I also found it very hard to understand what this benchmark program does). I may take another look later, but unlikely to be in the immediate future. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13193: Integer (gmp) performance regression? -------------------------------------+------------------------------------- Reporter: j.waldmann | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: newcomer Operating System: Linux | Architecture: x86_64 Type of failure: Runtime | (amd64) performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by sjakobi): * cc: sjakobi (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13193#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC