
#13873: Adding a SPECIALIZE at a callsite in Main.hs is causing a regression
-------------------------------------+-------------------------------------
        Reporter:  jberryman         |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.2.2
       Component:  Compiler          |              Version:  8.2.1-rc2
      Resolution:                    |             Keywords:  Specialise
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by mpickering):

 I looked at this. Firstly, 8.2.1 is much faster than 8.0.2, so I looked at
 the core for both versions.

 The key function is `wcJH`, which in 8.0.2 calls `$winsertWith` with a
 dictionary argument `$fPrimMonadST`. In 8.2.1 this function does get
 specialised (see `$s$winsertWith`), which accounts for the difference.

 When the specialisation happens, the type of the specialised function is

 {{{#!hs
 $w$sinsertWith :: MutVar# (PrimState (ST RealWorld))
                           (HashTable_ (PrimState (ST RealWorld)) ByteString v)
                -> v -> v -> v -> ByteString -> v
                -> State# RealWorld -> (# State# RealWorld, () #)
 }}}

 The specialisation produced by the specialise pragma instead has the type

 {{{#!hs
 $w$sinsertWith :: MutVar# (PrimState (ST s))
                           (HashTable_ (PrimState (ST s)) ByteString v)
                -> v -> v -> v -> ByteString -> v
                -> State# s -> (# State# s, () #)
 }}}

 so the `s` parameter is not specialised to `RealWorld`. Changing the
 specialise pragma to

 {{{#!hs
 {-# SPECIALIZE J.insertWith :: (Hashable k, Eq k) => J.HashTable RealWorld k v -> (v -> v -> v) -> k -> v -> ST RealWorld () #-}
 }}}

 makes the performance the same.

 Digging deeper: deep inside the definition of the specialised `insertWith`
 function there is a call to `checkResize`. Without the pragma, this call is
 specialised; with the pragma, it is not specialised and is passed three
 dictionary arguments.
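
 To make the two pragma shapes concrete, here is a minimal, self-contained
 sketch. It is not the code from the ticket: `insertWithSketch` is a
 hypothetical stand-in for `J.insertWith` that uses an association list in an
 `STRef` rather than a hash table, and it is overloaded only in the key type
 (there is no `PrimMonad` constraint), so it only illustrates how the two
 pragmas differ in their treatment of `s`. It should not be expected to
 reproduce the performance difference.

 {{{#!hs
 module Main (main) where

 import Control.Monad.ST (ST, RealWorld, stToIO)
 import Data.STRef (STRef, modifySTRef', newSTRef, readSTRef)

 -- Hypothetical stand-in for J.insertWith (an assumption, not the real API):
 -- combine the new value with the old one if the key is already present,
 -- otherwise append the new pair.
 insertWithSketch :: Eq k => STRef s [(k, v)] -> (v -> v -> v) -> k -> v -> ST s ()
 insertWithSketch ref f k new = modifySTRef' ref go
   where
     go [] = [(k, new)]
     go ((k', old) : rest)
       | k' == k   = (k', f new old) : rest
       | otherwise = (k', old) : go rest
 {-# INLINABLE insertWithSketch #-}

 -- Shape of the original pragma: the state type s stays polymorphic.
 {-# SPECIALIZE insertWithSketch
       :: STRef s [(String, Int)] -> (Int -> Int -> Int) -> String -> Int
       -> ST s () #-}

 -- Shape of the changed pragma: s is fixed to RealWorld as well.
 {-# SPECIALIZE insertWithSketch
       :: STRef RealWorld [(String, Int)] -> (Int -> Int -> Int) -> String -> Int
       -> ST RealWorld () #-}

 main :: IO ()
 main = do
   -- stToIO runs the action at s ~ RealWorld, so the call below has the
   -- type named by the second pragma.
   counts <- stToIO $ do
     ref <- newSTRef []
     mapM_ (\w -> insertWithSketch ref (+) w (1 :: Int)) (words "a b a c a")
     readSTRef ref
   print counts
 }}}

 Compiling a module like this with `-O2 -ddump-simpl` is one way to compare
 which specialised copies are generated and which calls still receive
 dictionary arguments.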

 Fixing `s` to `RealWorld` in the specialise pragma also causes `checkResize`
 to be specialised, which is why the performance improves.

 So it seems that the specialisation is not as good with the manual pragma,
 because left to itself GHC specialises more than the pragma asks for: it
 also removes a type argument. This means that other functions called by
 `insertWith` can be specialised as well. If we do not specialise on `s` too,
 then their types will mention this type argument `s`, which means that calls
 to this function are removed by `dumpBindUDs` at the top level.

 GHC itself will not remove the type argument `s`, as we only specialise on
 dictionary arguments. It does seem like we might be able to do better here:
 if we were to further specialise `insertWith` so that the type argument is
 removed too, it would cause `checkResize` to specialise as well.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13873#comment:3
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler