[GHC] #8793: Improve GHC.Event.IntTable performance

#8793: Improve GHC.Event.IntTable performance ------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: new Priority: normal | Milestone: 7.8.1 Component: libraries/base | Version: 7.6.3 Keywords: | Operating System: Unknown/Multiple Architecture: Unknown/Multiple | Type of failure: None/Unknown Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | ------------------------------------+------------------------------------- The performance of `GHC.Event.IntTable` can be improved. I've managed to get some nice increases across the board. Benchmarking using `criterion` shows: function, % faster than current impl. `insert`: 4% `lookup`: 26% `update`: 11% `delete`: 5% There is one strange thing I noted. In `updateWith`, there is an inner loop that looks like this: {{{ data Bucket a = Empty | Bucket Int a (Bucket a) go _ Empty = (False, Nothing, Empty) go cont (Bucket key val next) | key == k = case f val of Nothing -> (True, Just val, cont next) Just v -> (False, Just val, cont (Bucket key v next)) | otherwise = go (\x -> cont (Bucket key val x)) next }}} which returns a tuple that is immediately consumed like so: {{{ (delete_occurred, old_val, new_bkt) <- go id ... when (isJust old_val) $ do <updateIntTable> when delete_occurred <decIntTableSize> return old_val }}} I expected that inlining the `<updateIntTable>` and `<decIntTableSize>` code blocks directly into `go` would result in better code than creating a tuple and then pattern matching on it afterwards. ie. {{{ go _ Empty = return Nothing go cont (Bucket key val next) | key == k = do case f val of Nothing -> <updateIntTable> (cont next) >> <decIntTableSize> Just v -> <updateIntTable> (cont (Bucket key v next) return (Just val) | otherwise = go (\x -> cont (Bucket key val x)) next }}} which has the exact same semantics. To my suprise, this code is almost 2x slower! The core generated in both cases is exactly what I'd expect; if anything, the second version seems tighter. I'm not sure why the first version is faster, but perhaps the original author, Bryan O'Sullivan, can shed some light as he used the tupled method in the first version. I'll attach my patch, `criterion`'s html output for the benchmarks as well as the benchmarking code, and the core for the oddity I discussed above. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------ Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 7.8.1 Component: libraries/base | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Changes (by cdk): * status: new => patch -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance --------------------------------------------+------------------------------ Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 7.10.1 Component: libraries/base | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime performance bug | Unknown/Multiple Test Case: | Difficulty: Unknown Blocking: | Blocked By: | Related Tickets: --------------------------------------------+------------------------------ Changes (by hvr): * cc: hvr (added) * failure: None/Unknown => Runtime performance bug * milestone: 7.8.1 => 7.10.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance --------------------------------------------+------------------------------ Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 7.10.1 Component: libraries/base | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime performance bug | Unknown/Multiple Test Case: | Difficulty: Unknown Blocking: | Blocked By: | Related Tickets: --------------------------------------------+------------------------------ Comment (by tibbe): cdk, thanks for the patch. It would be much easier to review the patch if it didn't also reformat all the code. Please try to respect the original author's style. We don't want "edit wars" where people keep reformatting code back and forth to fit their style. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance --------------------------------------------+------------------------------ Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 7.10.1 Component: libraries/base | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime performance bug | Unknown/Multiple Test Case: | Difficulty: Unknown Blocking: | Blocked By: | Related Tickets: --------------------------------------------+------------------------------ Comment (by bos): Hi, cdk, I'll echo Johan's feedback: your patch is not reviewable. I have read it for several minutes, and I cannot tell what or where your real changes are, as opposed to all of the gratuitous reformatting you did. (As Johan mentions, the reformatting is discourteous, as well as obscuring the substance of the change.) If you would like this change to go in, please resubmit it as one or two minimal patches that do not reformat code except as necessary for semantic purposes, and that make it easy for a reviewer to tell what is going on. Thanks. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: infoneeded Priority: normal | Milestone: 7.10.1 Component: | Version: 7.6.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Unknown Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by thoughtpolice): * cc: ekmett (added) * status: patch => infoneeded -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: infoneeded Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jscholl): I tried to collect the relevant changes from the patch, but in most cases could not reproduce the speedups cdk reported. Only for the {{{lookup}}} function the runtime changed measurable (it improved even more, around 78% faster than the current implementation). Pulling the {{{ForeignPtr}}} out of the {{{IORef}}} could make sense, but I could not measure an impact on performance. A thing I find surprising is the first argument in {{{updateWith.go}}}, which seems to be totally unused. Is this the correct behavior? Anyway, I created a second patch which removes the unused argument (and changes the first element of the return value to a {{{Bool}}}, as it is just tested for being {{{Nothing}}}), if it is just a leftover from an earlier version of the function. The change has no impact on performance, though. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * status: infoneeded => patch Comment: Okay, I just get {{{IndexError: pop from empty list}}} if I try to attach a file, so I put it here... Improving {{{lookup}}}: {{{ --- a/GHC/Event/IntTable.hs +++ b/GHC/Event/IntTable.hs @@ -45,11 +45,12 @@ lookup :: Int -> IntTable a -> IO (Maybe a) lookup k (IntTable ref) = do let go Bucket{..} - | bucketKey == k = return (Just bucketValue) + | bucketKey == k = Just bucketValue | otherwise = go bucketNext - go _ = return Nothing + go _ = Nothing it@IT{..} <- readIORef ref - go =<< Arr.read tabArr (indexOf k it) + bkt <- Arr.read tabArr (indexOf k it) + return (go bkt) new :: Int -> IO (IntTable a) new capacity = IntTable `liftM` (newIORef =<< new_ capacity) }}} Cleaning up {{{updateWith}}}: {{{ --- a/GHC/Event/IntTable.hs +++ b/GHC/Event/IntTable.hs @@ -13,7 +13,7 @@ import Data.Bits ((.&.), shiftL, shiftR) import Data.IORef (IORef, newIORef, readIORef, writeIORef) -import Data.Maybe (Maybe(..), isJust, isNothing) +import Data.Maybe (Maybe(..), isJust) import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtr, withForeignPtr) import Foreign.Storable (peek, poke) import GHC.Base (Monad(..), (=<<), ($), const, liftM, otherwise, when) @@ -123,20 +123,17 @@ updateWith f k (IntTable ref) = do it@IT{..} <- readIORef ref let idx = indexOf k it - go changed bkt@Bucket{..} - | bucketKey == k = - let fbv = f bucketValue - !nb = case fbv of - Just val -> bkt { bucketValue = val } - Nothing -> bucketNext - in (fbv, Just bucketValue, nb) - | otherwise = case go changed bucketNext of + go bkt@Bucket{..} + | bucketKey == k = case f bucketValue of + Just val -> let !nb = bkt { bucketValue = val } in (False, Just bucketValue, nb) + Nothing -> (True, Just bucketValue, bucketNext) + | otherwise = case go bucketNext of (fbv, ov, nb) -> (fbv, ov, bkt { bucketNext = nb }) - go _ e = (Nothing, Nothing, e) - (fbv, oldVal, newBucket) <- go False `liftM` Arr.read tabArr idx + go e = (True, Nothing, e) + (del, oldVal, newBucket) <- go `liftM` Arr.read tabArr idx when (isJust oldVal) $ do Arr.write tabArr idx newBucket - when (isNothing fbv) $ + when del $ withForeignPtr tabSize $ \ptr -> do size <- peek ptr poke ptr (size - 1) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by thomie): Thanks for picking this up. Could you maybe try submitting your patches to [wiki:Phabricator], so the build bot can validate them.
A thing I find surprising is the first argument in updateWith.go, which seems to be totally unused. Is this the correct behavior? Maybe you could tell us? This code was added in 28cf2e004da0fc809ce9efff0802b125b3501e91 (#8158).
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): * Do we have any (reproducible) evidence showing that this change makes things go faster? Perhaps a benchmark test of the relevant functions themselves? That would be helpful. Once we have that evidence, we can go ahead and commit. * Secondly, are the changes something that a reasonable person might expect the optimiser should be able to do by itself? If so, why doesn't it? Let's not close the ticket until we know about this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jscholl): I did take another look at the {{{lookup}}} function and compared the core and the generated cmm in both cases. There was nothing special to see in the core as both versions compile to simple loops without any unexpected things like a dictionary floating around. The IO version only packs the result into an unboxed tuple with the state token while the pure version lacks this (of course). The cmm of both versions sheds more light on the reason for the speedup: GHC compiles the pure version to a nice loop which does not jump back to the beginning of the function, but behind the stack check (the stack is needed to evaluate unevaluated buckets), while the IO version just calls itself recursively (i.e. jumps before the stack check). Otherwise they seemed pretty identical as far as I could tell. And sadly my measurement of the speedup was wrong as I only took the original benchmark from cdk and modified it as necessary. The original version consumes all my memory as criterion runs benchmarks multiple times and mutable data structures are somewhat sensitive to such a behavior, especially if the insert operation extends already existing elements. So the first thing I did was replacing the lists with sets, which are less sensitive if an existing element is inserted a second time. Another important thing was the order in which benchmarks are run: If we first insert in all competing implementations, the second implementation suffers from a heap with more live data, meaning longer GC delays. So I changed the order in such a way that first one implementation is run, then another one while the first one is already no longer reachable and thus will be GCed. A third thing I noticed (but only after looking at the core) was that my change actually increases the laziness of {{{lookup}}}. The original implementation would determine whether the result is a {{{Just}}} or {{{Nothing}}} while my implementation returns a thunk. I though I was safe by using {{{nfIO}}} in the benchmark to evaluate my result, but actually I did not read that code carefully and first missed the actual place the result is evaluated - or not. Fixing this yields a more reasonable speedup of something around 10-15% (which also fits the observed differences in the cmm much better). I can (try to) attach the modified benchmark, if this helps. I did also take another look at the unused argument in {{{updateWith.go}}} and am now quite sure that it is something leftover from the development of the function which has no actual function anymore and can safely be removed. I guess it would have never been in the original commit, but it seems hard or impossible to correctly identity such unused arguments (I mean, it is used, but only in a function which does not use it...). Especially, if such an argument would be used to pass a type to a recursive function, which just passes it on and uses ScopedTypeVariables to access the type, so it never touches the argument directly... And yes, I can hopefully try to submit the changes to Phabricator tomorrow. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

The cmm of both versions sheds more light on the reason for the speedup: GHC compiles the pure version to a nice loop which does not jump back to
#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Thank you for taking the time to get more insight. Insight is what we need to take sensible action! the beginning of the function, but behind the stack check (the stack is needed to evaluate unevaluated buckets), while the IO version just calls itself recursively (i.e. jumps before the stack check). Interesting, though I don't yet understand the details. Could you boil out a standalone example that demonstrates just this single issue? I.e. two versions of a function, one of which repeats the stack check and one of which doesn't, and show the code side by side?
but it seems hard or impossible to correctly identity such unused arguments (I mean, it is used, but only in a function which does not use it...).
Well GHC's strictness analyser should find exactly this case. I'm puzzled why it does not. Again, could you spare a moment to make a standalone reproducer for just this issue? Or at least a smallish function I can compile in isolation to see this argument not disappearing. Thanks Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * Attachment "IOLoop.hs" added. Example loop in IO -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * Attachment "IOLoop.simpl" added. Simplifier output (no suprises here) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * Attachment "PureLoop.cmm" added. Generated cmm for pure version -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * Attachment "IOLoop.cmm" added. Generated cmm for IO version -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jscholl):
Interesting, though I don't yet understand the details. Could you boil out a standalone example that demonstrates just this single issue? I.e. two versions of a function, one of which repeats the stack check and one of which doesn't, and show the code side by side?
but it seems hard or impossible to correctly identity such unused arguments (I mean, it is used, but only in a function which does not use it...).
Well GHC's strictness analyser should find exactly this case. I'm
If I understand cmm correctly, we jump to the start of $wa in line 34, so the IO version repeats the stack check. The corresponding instruction in the pure version is in line 33, here we jump to cCT, so behind our stack check. This should also be possible in the IO version as the function only uses constant stack space and if it was available once, it should stay available until we deallocate it, right? puzzled why it does not. Again, could you spare a moment to make a standalone reproducer for just this issue? Or at least a smallish function I can compile in isolation to see this argument not disappearing.
No, I did not mean the optimizer in this case. The unused argument is optimized out, I was just thinking whether GHC could warn if an argument is never used (as it does with unused variables), as this sometimes indicated unfinished code which should either be removed (or documented as such) or finished. But I think this comes with too many cases where you want an unused argument, either to pass in a type via proxy or to satisfy another functions expectations. And if a function is recursive, unused arguments have to be passed on. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * Attachment "bench.tar.gz" added. Benchmark with original implementation, my changes and cdks changes -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jscholl): I attached the modified benchmark from cdk which compares the current implementation found in base with cdks and my changes. Note that the {{{loopV_}}} function does not evaluate the returned results, so if someone wants to use that and makes {{{insertWith}}} somehow return its result lazy instead of in WHNF, these will be thrown away. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * cc: jstolarek (added) Comment: OK I've had a look. * I have not looked at, or tried to reproduce, the benchmark. Probably we can just trust you. Let's just commit your improved code to the library. Ben? * That leaves the question of pure vs IO loop, which I think you are saying is the cause of the performance difference. Have you benchmarked that separately? (Your `IOLoop.hs` program, that is.) * It is indeed very odd that the "IO loop" does not generate as good code as the "pure loop". I believe that for the pure loop we are getting a C-- optimisation that turns a tail call into a jump to a label, so called "loopification". But this isn't happening for IO loop. I've collected info about loopification here: [wiki:Commentary/Compiler/Loopification]. I'm copying Jan Stolarek who was the last person to work on this. It'd be great to return to it. It'd be good to pull out this particular loopification issue into a new ticket, if you could. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: patch Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D1742 Wiki Page: | -------------------------------------+------------------------------------- Changes (by jscholl): * differential: => Phab:D1742 Comment: I opened #11372 for the loopification issue. And added the differential to this ticket (which I think I should have done earlier, but somehow I thought this would be done automatically if I mention this ticket on phabricator). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance
-------------------------------------+-------------------------------------
Reporter: cdk | Owner:
Type: task | Status: patch
Priority: normal | Milestone: 8.0.1
Component: Core Libraries | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D1742
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: closed Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D1742 Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: patch => closed * resolution: => fixed -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8793: Improve GHC.Event.IntTable performance -------------------------------------+------------------------------------- Reporter: cdk | Owner: Type: task | Status: closed Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.6.3 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D1742 Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): This has been merged into both `master` and `ghc-8.0`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8793#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC