[GHC] #8793: Improve GHC.Event.IntTable performance

16 Feb 2014

      #8793: Improve GHC.Event.IntTable performance
------------------------------------+-------------------------------------
       Reporter:  cdk               |             Owner:
           Type:  task              |            Status:  new
       Priority:  normal            |         Milestone:  7.8.1
      Component:  libraries/base    |           Version:  7.6.3
       Keywords:                    |  Operating System:  Unknown/Multiple
   Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
     Difficulty:  Unknown           |         Test Case:
     Blocked By:                    |          Blocking:
Related Tickets:                    |
------------------------------------+-------------------------------------
 The performance of `GHC.Event.IntTable` can be improved. I've managed to
 get some nice increases across the board. Benchmarking using `criterion`
 shows:

 function, % faster than current impl.
 `insert`: 4%
 `lookup`: 26%
 `update`: 11%
 `delete`: 5%

 There is one strange thing I noted. In `updateWith`, there is an inner
 loop that looks like this:

 {{{
 data Bucket a = Empty | Bucket Int a (Bucket a)

 go _ Empty = (False, Nothing, Empty)
 go cont (Bucket key val next)
     | key == k = case f val of
         Nothing -> (True, Just val, cont next)
         Just v  -> (False, Just val, cont (Bucket key v next))
     | otherwise = go (\x -> cont (Bucket key val x)) next
 }}}

 which returns a tuple that is immediately consumed like so:

 {{{
 (delete_occurred, old_val, new_bkt) <- go id ...
 when (isJust old_val) $ do
     <updateIntTable>
     when delete_occurred <decIntTableSize>
 return old_val
 }}}

 I expected that inlining the `<updateIntTable>` and `<decIntTableSize>`
 code blocks directly into `go` would result in better code than creating a
 tuple and then pattern matching on it afterwards. ie.

 {{{
 go _ Empty = return Nothing
 go cont (Bucket key val next)
     | key == k = do
         case f val of
             Nothing -> <updateIntTable> (cont next) >> <decIntTableSize>
             Just v  -> <updateIntTable> (cont (Bucket key v next)
         return (Just val)
     | otherwise = go (\x -> cont (Bucket key val x)) next
 }}}

 which has the exact same semantics. To my suprise, this code is almost 2x
 slower! The core generated in both cases is exactly what I'd expect; if
 anything, the second version seems tighter. I'm not sure why the first
 version is faster, but perhaps the original author, Bryan O'Sullivan, can
 shed some light as he used the tupled method in the first version.

 I'll attach my patch, `criterion`'s html output for the benchmarks as well
 as the benchmarking code, and the core for the oddity I discussed above.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler