[GHC] #9013: addWord2# is buggy

#9013: addWord2# is buggy ------------------------------------+------------------------------------- Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Keywords: | Operating System: Unknown/Multiple Architecture: Unknown/Multiple | Type of failure: None/Unknown Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | ------------------------------------+------------------------------------- Looks like its behavior differs between optimization levels: {{{#!haskell {-# LANGUAGE MagicHash, UnboxedTuples #-} import GHC.Prim import GHC.Word big :: Word big = maxBound good = let x = case big of W# w -> w in case plusWord2# x x of (# a, b #) -> (W# a, W# b) main = do putStrLn $ case good of (0, z) -> "Such optimal, much sad" (1, z) -> "No optimization here" }}} I thought I was going crazy! Try the above code with and without -O2 and you'll get different output. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------ Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------ Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by pumpkin): I'm attempting to write a fast native Haskell multiprecision natural library, so those *2 variants of the primops are quite handy. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------ Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by pumpkin): Reid Barton tracked down the culprit on IRC: https://github.com/ghc/ghc/blob/master/compiler/nativeGen/X86/Ppr.hs#L553 So it looks like: 1. it's not specific to `plusWord2#` 2. it only happens on compile-time literals (so is admittedly pretty rare) 3. it's not always sound to switch an `add` to a `dec` because the latter doesn't affect the carry flag and the former does -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------ Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by simonpj): The offending code you point to is {{{ pprInstr (ADD size (OpImm (ImmInt (-1))) dst) = pprSizeOp (sLit "dec") size dst }}} Interestingly, this line dates back at least before 2005; see `7d61cb61` for example. It's surprising to me that * it's the ''pretty-printer'' that is doing some peephole-style optimisations * there is no case for optimising `SUB src 1` to `dec src`, or `SUB src (-1)`. Moreover, it's clearly too late in the pipeline, because by this point the difference between `plusWord#` and `plusWord2#` has disappeared. So fixing the problem by deleting these two lines would lose a perfectly good optimisation for `plusWord#`. Surely it'd be better for some earlier phase to optimise `(plusWord# x (-1))` to `(minusWord# x 1)`; but of course not to do so for `plusWord2#` since (as you say) the carry is affected differently. Moreover, the transformation would then be platform-independent, rather than per- platform. Although, now I think about it, the ''reason'' for doing the optimisation is to expose the possibility for subsequent, platform- specific `inc/dec` optimisations. Maybe that would be OK if documented. Or maybe we could simply do it in passage from Cmm to native code (when we know what primop is being used) rather than in the pretty printer? It looks to me as if `plusWord2#` is the primop `WordAdd2Op`, which in turn is translated to the `CallishMachOp` called `MO_Add2`, which in turn is translated (on X86) to `ADC`. So now I'm confused how the pretty- printer's optimisation of `ADD` affected this program. Would someone like to dig a little deeper and propose a fix (other than the sticking-plaster of deleting above two lines)? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------ Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by pumpkin): Thanks! It was a lucky (or unlucky) test case. Good point about the `ADD` vs. `ADC`, and I'm not sure. I don't see myself having much time in the near future to dig in but if nobody gets to it first in the next two or three weeks I'll take a look. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------ Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by rwbarton): Simon: `MO_Add2` is translated to an `ADD` (which is expected to either set or clear the carry flag as appropriate) followed by an `ADC`. Printing the `ADD` as (in this case) `dec` rather than `add` loses the effect on the carry flag, so the subsequent `ADC` will just use whatever garbage happens to be in the carry flag. Like Simon, I'm not totally thrilled with the organization of the code here, but one simple fix would be to add an `ADD'` instruction which means the same as `ADD` except it indicates that the side effect of setting the carry flag is important. Then the pretty printer can produce `dec` for an `ADD` of `-1` but not for an `ADD'` of `-1`. GHC doesn't expose the carry flag to Haskell directly so we would only need to use `ADD'` in the translation of `MO_Add2` and other similar operations. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy
-------------------------------------+------------------------------------
Reporter: pumpkin | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.2
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by Reid Barton

#9013: plusWord2# is buggy -------------------------------------+------------------------------------- Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by rwbarton): In fact it's even more involved than I originally thought. `MO_Add2` invokes the code generator for `MO_Add` to do the first addition. `MO_Add` sees that the second argument is constant so it can produce an `LEA` instruction (which wouldn't set the carry flag). The pretty-printer notices that the source and destination registers of the `LEA` instruction are the same, so it rewrites it to an `ADD` (which would set the carry flag); and then it notices that the addend is -1, so it finally emits `dec` (which doesn't set the carry flag). All of what the code generator does for `MO_Add` is fine, since the semantics of `MO_Add` don't include any effect (or lack thereof) on the condition register. The error is in the code generation for `MO_Add2`: it should not use `MO_Add`, for this reason. I will put up a patch for review shortly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------- Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: Phab:D137 | -------------------------------------+------------------------------------- Changes (by rwbarton): * differential: => Phab:D137 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy -------------------------------------+------------------------------------- Reporter: pumpkin | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: Phab:D137 | -------------------------------------+------------------------------------- Comment (by simonmar): I've accepted @rwbarton's diff.
Surely it'd be better for some earlier phase to optimise (plusWord# x (-1)) to (minusWord# x 1)
The problem is that this would only catch instructions that arise from those particular primops, not any other instructions. Add instructions arise for lots of reasons, even during native code generation itself. The reason the peephole is done very late is * We don't need explicit inc/dec instructions in the data type * We catch all the opportunities. Anything else would run the risk of any subsequent optimisation passes leaving opportunities behind. A smart constructor would work but also runs the risk that you might forget to use it somewhere. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9013: plusWord2# is buggy
-------------------------------------+-------------------------------------
Reporter: pumpkin | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.2
Resolution: | Keywords:
Operating System: | Architecture: Unknown/Multiple
Unknown/Multiple | Difficulty: Unknown
Type of failure: | Blocked By:
None/Unknown | Related Tickets:
Test Case: |
Blocking: |
Differential Revisions: Phab:D137 |
-------------------------------------+-------------------------------------
Comment (by Reid Barton

#9013: plusWord2# is buggy -------------------------------------+------------------------------------- Reporter: pumpkin | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: fixed | Keywords: Operating System: | Architecture: Unknown/Multiple Unknown/Multiple | Difficulty: Unknown Type of failure: | Blocked By: None/Unknown | Related Tickets: Test Case: | Blocking: | Differential Revisions: Phab:D137 | -------------------------------------+------------------------------------- Changes (by rwbarton): * status: new => closed * resolution: => fixed -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9013#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC