[GHC] #13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1-rc2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Github user jmoy documents an issue with lack of specialization happening for INLINABLE functions in GHC 8.0 here: https://github.com/jmoy/testing- specialize . Testing on 8.2.1 it seems that specialization happened (although I didn't verify this in the core) as the result was 10x faster. But the odd thing to me was uncommenting the `SPECIALIZE` pragma at the callsite actually resulted in a significant regression: https://github.com/jmoy/testing-specialize/issues/1#issuecomment-310868360 Maybe GHC is choosing the worse manual partial specialization for some reason. I'm sorry I can't produce a better ticket for this at the moment. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * failure: None/Unknown => Runtime performance bug * milestone: => 8.2.2 Comment: Quite suspicious indeed. I'll have a look. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => Specialise Comment: As I understand it, you are saying that * adding a SPECIALISE pragma with 8.2 made the the code runs 2x slower Whereas adding the same pragma with 8.0 made the code run 3x faster. (But still not as fast as the slowest version with 8.2.) But it's the bulleted point that is mysterious. Ben please do investigate. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): I looked at this. Firstly, 8.2.1 is much faster than 8.0.2. So I looked at the core for both versions. The key function is `wcJH` which in 8.0.2 calls `$winsertWith` with a dictionary argument `$fPrimMonadST`. In 8.2.1, this function does get specialised (see `$s$winsertWith`) which accounts for the difference. When the specialisation happens, the type of the specialised function is {{{#!hs $w$sinsertWith :: MutVar# (PrimState (ST RealWorld)) (HashTable_ (PrimState (ST RealWorld)) ByteString v) -> v -> v -> v -> ByteString -> v -> State# RealWorld -> (# State# RealWorld, () #) }}} The specialisation produced by the specialise pragma is instead, {{{#!hs $w$sinsertWith :: MutVar# (PrimState (ST s)) (HashTable_ (PrimState (ST s)) ByteString v) -> v -> v -> v -> ByteString -> v -> State# s -> (# State# s, () #) }}} as such the `s` parameter is not specialised to `RealWorld`. Changing the specialise pragma to {{{#!hs {-# SPECIALIZE J.insertWith::(Hashable k,Eq k) => J.HashTable RealWorld k v -> (v->v->v) -> k -> v -> ST RealWorld () #-} }}} makes the performance the same. Digging much deeper, deep in the definition of the specialised insertWith function there is a call to `checkResize`. Without the pragma, this is specialised but with the pragma, it is not specialised and passed three dictionary arguments. Fixing `s` in the specialise pragma to `RealWorld`. also causes `checkResize` to be specialised which is why the performance improves. So it seems that the specialisation is not as good with the manual pragma as GHC is specialising more than the pragma in order to also remove a type argument. This means that other functions called by `insertWith` can also be specialised, if we do not specialise on `s` as well then their type will mention this type argument `s` which means that calls to this function are removed by `dumpBindUDs` at the top level. GHC itself will not remove the type argument `s` as we only specialise on dictionary arguments - it does seem like we might be able to do better here. If we were to further specialise `insertWith` in order to also remove the type argument then it would cause `checkResize` to specialise as well. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jberryman): Thanks so much for looking into this. I maybe wasn't clear in the body that the reason I filed the ticket was I assumed that GHC would try to choose the best specialization from those in scope (since my intuition was specialize pragmas behave sort of like type classes in the way they leak via `import` statements) and from whatever specializations it might perform in context. But maybe this is expected behavior, and a manual specialization will always take precedence? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I don't have bandwidth to reproduce right now, but I think Matthew is saying we have {{{ f :: forall s. State# s -> blah f = /\s. \(x::State# s). ....(g @s d1 d2 d3).... }}} where `g` is an overloaded function, called three dictionary arguments. But note that `f` is not overloaded, and so will not be auto-specialised. Somehow, if we specialised `f`, we could specialise `g`. I don't quite see how this happens. Can you fill in a bit more detail about what `g` is, and what dictionaries `d1`-`d3` are? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): That's right Simon. How this happens is that the `SPECIALISE` pragma creates a rule which looks like {{{ "SPEC insertWith" forall (@ k) (@ s) (@ v) ($dHashable :: Hashable k) ($dEq :: Eq k) ($dPrimMonad :: PrimMonad (ST s)). insertWith @ k @ (ST s) @ v $dHashable $dEq $dPrimMonad = $sinsertWith @ k @ s @ v $dHashable $dEq }}} which fires at quite an early stage in optimisation. Then any function in `$sinsertWith` which mentions `s` won't be specialised as it is lambda bound. If you remove the pragma you just get one specialisation rule which probably arises after a lot of inlining has happened. {{{ "SPEC/Main insert @ ByteString @ (ST RealWorld) _" forall (@ v) ($dPrimMonad :: PrimMonad (ST RealWorld)) ($dEq :: Eq ByteString) ($dHashable :: Hashable ByteString). insert @ ByteString @ (ST RealWorld) @ v $dHashable $dEq $dPrimMonad = $scheckResize_$sinsert @ v }}} and if you fix `s = RealWorld` with the specialisation pragma then both occur. {{{ "SPEC insertWith" forall (@ k) (@ v) ($dHashable :: Hashable k) ($dEq :: Eq k) ($dPrimMonad :: PrimMonad (ST RealWorld)). insertWith @ k @ (ST RealWorld) @ v $dHashable $dEq $dPrimMonad = $sinsertWith @ k @ v $dHashable $dEq "SPEC/Main insert @ ByteString @ (ST RealWorld) _" forall (@ v) ($dPrimMonad :: PrimMonad (ST RealWorld)) ($dEq :: Eq ByteString) ($dHashable :: Hashable ByteString). insert @ ByteString @ (ST RealWorld) @ v $dHashable $dEq $dPrimMonad = $scheckResize_$sinsert @ v }}} Seeing as you also ask for the type of `g`, it is defined in another module as is marked `INLINABLE`. {{{ checkResize::(Hashable k,Eq k,PrimMonad m) => HashTable_ (PrimState m) k v -> m (Maybe (HashTable (PrimState m) k v)) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Thanks so much for looking into this. I maybe wasn't clear in the body
#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by mpickering): Replying to [comment:4 jberryman]: that the reason I filed the ticket was I assumed that GHC would try to choose the best specialization from those in scope (since my intuition was specialize pragmas behave sort of like type classes in the way they leak via `import` statements) and from whatever specializations it might perform in context.
But maybe this is expected behavior, and a manual specialization will
always take precedence? The specialiser doesn't choose which specialisation to apply, it merely looks for specialisation opportunities and then creates a new definition along with a RULE which does the replacement. If you have conflicting specialisations then the choice about which one applies is left up to he rule selection mechanism which I am not as familiar with. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.2 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by crockeea): * cc: crockeea (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression -------------------------------------+------------------------------------- Reporter: jberryman | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.2.3 Component: Compiler | Version: 8.2.1-rc2 Resolution: | Keywords: Specialise Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * milestone: 8.2.2 => 8.2.3 Comment:
If you have conflicting specialisations then the choice about which one applies is left up to the rule selection mechanism which I am not as familiar with.
Unfortunately this pretty much means that the choice comes down to chance (e.g. which rule happens to fire first). Regardless, this won't be fixed for 8.2.2. Bumping to 8.2.3. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC