[GHC] #9398: Data.List.cycle is not a good producer

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: libraries/base | Version: 7.8.3 Keywords: | Operating System: Architecture: Unknown/Multiple | Unknown/Multiple Difficulty: Easy (less than 1 | Type of failure: Runtime hour) | performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Data.List.cycle is not a good producer. I'm not at all sure this solution is the best, but it allocates much less when mapped over and then folded. If we could make it a good consumer cheaply that would be nice too, but I imagine it's probably mostly applied to short lists, so that is probably not a priority. {{{#!hs {-# INLINE cycle #-} cycle :: [a] -> [a] cycle [] = error "Prelude.cycle: empty list" cycle xs = concat $ repeat xs }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: closed Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: invalid | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: new => closed * resolution: => invalid Comment: No, not ready yet. There are some more subtleties. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: closed => new * resolution: invalid => Old description:
Data.List.cycle is not a good producer. I'm not at all sure this solution is the best, but it allocates much less when mapped over and then folded. If we could make it a good consumer cheaply that would be nice too, but I imagine it's probably mostly applied to short lists, so that is probably not a priority.
{{{#!hs {-# INLINE cycle #-} cycle :: [a] -> [a] cycle [] = error "Prelude.cycle: empty list" cycle xs = concat $ repeat xs }}}
New description: Data.List.cycle is not a good producer. I ''believe'' the following fixes it. The tests I've profiled so far suggest it does so. {{{#!hs {-# INLINE cycle #-} cycle :: [a] -> [a] cycle [] = error "Empty cycle." cycle xs = let cyc = augment cycle' cyc in cyc where cycle' c n = foldr c n xs }}} -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Remember that GHC doesn’t inline recursive things (such as `cyc`). Also, I don’t think your `cycle` is a good consumer: To inline it, you need to know that its argument is a `:`. But to make it a good consumer, the argument needs to be of the shape `build ...`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Remember that GHC doesn’t inline recursive things (such as `cyc`).
Also, I don’t think your `cycle` is a good consumer: To inline it, you need to know that its argument is a `:`. But to make it a good consumer,
#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:3 nomeata]: the argument needs to be of the shape `build ...`. I suppose I should (try to) find a way to rewrite it to another form if it doesn't get eaten by foldr. I'm not understanding your concern about the consumption side. `cyc` isn't inlined, but the enclosing `cycle` is. Shouldn't that be good enough to allow the `foldr` in `cycle'` to fuse with a `build` or `augment` in `xs`? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): But note that you are pattern matching on `xs`! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:5 nomeata]:
But note that you are pattern matching on `xs`!
Ah, I see what you mean. That pattern match, however, is not essential. Suppose we leave it out. Then we get {{{#!hs cycle [] = let cyc = augment cycle' cyc in cyc where cycle' c n = foldr c n [] }}} `foldr/nil` gives {{{#!hs cycle [] = let cyc = augment (\c n -> []) cyc in cyc }}} Then applying `augment`: {{{#!hs cycle [] = let cyc = [] in cyc }}} So we've turned `_|_` into `[]`, whereas a similar omission from the current definition of `cycle` would turn an error into a much-less- desirable non-termination. This is not a disaster, and in fact it makes sense from a monoid perspective. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): I think you meant {{{#!hs cycle [] = let cyc = augment cycle' cyc in cyc where cycle' c n = foldr c n [] cycle [] = let cyc = augment (\c n -> n) cyc in cyc cycle [] = let cyc = cyc in cyc cycle [] = ⊥ }}} so you turned a helpful error message into nontermination -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:7 nomeata]:
I think you meant {{{#!hs cycle [] = let cyc = augment cycle' cyc in cyc where cycle' c n = foldr c n []
cycle [] = let cyc = augment (\c n -> n) cyc in cyc
I don't think so. Substituting `[]` for `xs`, `cycle' c n = foldr c n [] = []`, so `cycle' = \c n -> []`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): `foldr c n [] = n`! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:9 nomeata]:
`foldr c n [] = n`!
I am a fool! That said, we could make an optimization flag, and maybe include it in -O2. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Not sure what you mean. An optimization flag that changes whether `cycle []` is an error or nontermination? I don’t think this is a good idea. Anyways, as you said: It is more important that `cycle` is a good producer. I don’t see how an enclosing `foldr` would get in touch with the `augment` in `let cyc = augment cycle' cyc in cyc`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): I think you need to move the knot-tying into the argument to build, e.g. something like {{{ cycle xs = build $ \c _ -> let cyc = foldr c cyc xs in cyc }}} (ignoring the issue of `cycle []` for now) It seems that this would be both a good produce, and possibly even a good consumer. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:12 nomeata]:
I think you need to move the knot-tying into the argument to build, e.g. something like {{{ cycle xs = build $ \c _ -> let cyc = foldr c cyc xs in cyc }}} (ignoring the issue of `cycle []` for now)
It seems that this would be both a good producer, and possibly even a good consumer.
I won't be able to test anything for some hours, but that does look very promising indeed. I wonder if GHC performed some transformation that turned mine into yours somehow, but yours is definitely much clearer and prettier in any case. I ''think'' it's a good producer, and a good consumer for `build`. It doesn't look like a perfect consumer for `augment` (although it's not a ''bad'' one), but that may be unavoidable. I believe `cycle` is one of the more popular list functions in production code, so personally I think it's probably worth giving up the error message on a null argument to buy a little performance, even on the consumption side, but it's also true that the production side is more important in general. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by isaacdupree): In compiled code, the latter definition throws <<loop>>, which is nice. Try `runghc test.hs` (nontermination) vs `ghc test.hs && ./test` (exception): {{{ import GHC.Exts cycle2 xs = build $ \c _ -> let cyc = foldr c cyc xs in cyc main = print (cycle2 [] !! 0 :: Int) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:12 nomeata]: I wrote some test functions, using the type {{{#!hs type BuildArg a = forall b . (a -> b -> b) -> b -> b }}} Generally things look pretty good (lots of Core to look at below, along with comments you should take with a grain of salt since I'm just a newbie). I found one case where your definition works significantly worse than the Prelude's. The bad case I've found: {{{#!hs main = print $ foldr (+) 0 $ take 30000000 $ map (* 13) $ cycle [1,8,4,0,(5::Int)] }}} For some reason, your `cycle` implementation allocates a lot, but the Prelude one runs in constant space. The basic test translations: {{{#!hs cycleBuild :: BuildArg a -> [a] cycleBuild g = cycle (build g) }}} produces {{{#!hs cycleBuild cycleBuild = \ @ a_a3jG g_a3hO -> letrec { cyc_a2iD cyc_a2iD = g_a3hO (:) cyc_a2iD; } in cyc_a2iD }}} which is obviously good, and obviously better than the result with `Prelude.cycle`, which is {{{#!hs cycleBuild cycleBuild = \ @ a_a1Xy g_a1zF -> case g_a1zF (:) ([]) of wild_a2aX { [] -> cycle1; : ipv_a2b3 ipv1_a2b4 -> letrec { xs'_a2b1 xs'_a2b1 = ++ wild_a2aX xs'_a2b1; } in xs'_a2b1 } }}} {{{#!hs cycleAugment :: BuildArg a -> [a] -> [a] cycleAugment g xs = cycle (augment g xs) }}} produces {{{#!hs cycleAugment cycleAugment = \ @ a_a3j1 g_a3hS xs_a3hT -> letrec { cyc_a2iD cyc_a2iD = g_a3hS (:) (++ xs_a3hT cyc_a2iD); } in cyc_a2iD }}} which I think is likely the best we can do, and clearly better than what the Prelude gives: {{{#!hs cycleAugment cycleAugment = \ @ a_a1WT g_a1zJ xs_a1zK -> case g_a1zJ (:) xs_a1zK of wild_a2aX { [] -> cycle1; : ipv_a2b3 ipv1_a2b4 -> letrec { xs'_a2b1 xs'_a2b1 = ++ wild_a2aX xs'_a2b1; } in xs'_a2b1 } }}} {{{#!hs foldCycleBuild :: (a -> b -> b) -> b -> BuildArg a -> b foldCycleBuild c n g = foldr c n (cycle (build g)) }}} produces {{{#!hs foldCycleBuild foldCycleBuild = \ @ a_a3jl @ b_a3jm c_a3hP _ g_a3hR -> letrec { cyc_a2iD cyc_a2iD = g_a3hR c_a3hP cyc_a2iD; } in cyc_a2iD }}} which is obviously good, and ''incomparably'' better than what the Prelude gives, which I will not paste here because it is very obviously very much inferior. The rest of the tests are a little harder for me to interpret, because the nested letrecs confuse me, but I think they're probably good too. {{{#!hs foldCycle c n xs = foldr c n (cycle xs) }}} produces {{{#!hs $wfoldCycle $wfoldCycle = \ @ a_a3lf @ b_a3lg w_s3mE w1_s3mG -> letrec { cyc_a2iD cyc_a2iD = letrec { go_a2GG go_a2GG = \ ds_a2GH -> case ds_a2GH of _ { [] -> cyc_a2iD; : y_a2GM ys_a2GN -> w_s3mE y_a2GM (go_a2GG ys_a2GN) }; } in go_a2GG w1_s3mG; } in cyc_a2iD foldCycle foldCycle = \ @ a_a3lf @ b_a3lg w_s3mE _ w2_s3mG -> $wfoldCycle w_s3mE w2_s3mG }}} {{{#!hs foldCycleAugment :: (a -> b -> b) -> b -> BuildArg a -> [a] -> b foldCycleAugment c n g xs = foldr c n (cycle (augment g xs)) }}} produces {{{#!hs $wfoldCycleAugment $wfoldCycleAugment = \ @ a_a3hx @ b_a3hy w_s3mD w1_s3mF w2_s3mG -> letrec { cyc_a2iD cyc_a2iD = w1_s3mF w_s3mD (letrec { go_a2GG go_a2GG = \ ds_a2GH -> case ds_a2GH of _ { [] -> cyc_a2iD; : y_a2GM ys_a2GN -> w_s3mD y_a2GM (go_a2GG ys_a2GN) }; } in go_a2GG w2_s3mG); } in cyc_a2iD foldCycleAugment foldCycleAugment = \ @ a_a3hx @ b_a3hy w_s3mD _ w2_s3mF w3_s3mG -> $wfoldCycleAugment w_s3mD w2_s3mF w3_s3mG }}} which I have no idea what to think of, honestly. {{{#!hs mapCycle f xs = map f (cycle xs) }}} produces {{{#!hs mapCycle mapCycle = \ @ a_a3lA @ b_a3lB f_a3hJ xs_a3hK -> letrec { cyc_a2iD cyc_a2iD = letrec { go_a2GG go_a2GG = \ ds_a2GH -> case ds_a2GH of _ { [] -> cyc_a2iD; : y_a2GM ys_a2GN -> : (f_a3hJ y_a2GM) (go_a2GG ys_a2GN) }; } in go_a2GG xs_a3hK; } in cyc_a2iD }}} which looks sane enough—it just maps over `xs` and then cycles the result. I also took a look at {{{#!hs cycleMap f xs = cycle (map f xs) }}} This produced exactly the same Core, which I think is a positive sign. The Prelude implementation, on the other hand, gives a rather terrible result: {{{#!hs mapCycle mapCycle = \ @ a_a2aw @ b_a2ax f_a1x7 xs_a1x8 -> case xs_a1x8 of wild_a2aV { [] -> case cycle1 of wild1_00 { }; : ipv_a2b1 ipv1_a2b2 -> letrec { xs'_a2aZ xs'_a2aZ = ++ wild_a2aV xs'_a2aZ; } in map f_a1x7 xs'_a2aZ } }}} Yes, it actually cycles the argument and then maps over the result—horrible in every way. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata):
Yes, it actually cycles the argument and then maps over the result—horrible in every way.
But this is well-known: You usually cannot “modify” cyclic data structures without breaking them. Of course its cool that with the above definition of `cycle` + fusion, we can suddenly `map` over a cyclic structure without breaking it, but maybe that’s a tad too much magic? Especially as people probably don’t have a good feeling for when this happens and when not? OTOH, I don’t think it hurts either, and nice surprises are – well – nice. Did you investigate why `foldr (+) 0 $ take 30000000 $ map (* 13) $ cycle [1,8,4,0,(5::Int)]` allocates more? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Yes, it actually cycles the argument and then maps over the result—horrible in every way.
But this is well-known: You usually cannot “modify” cyclic data structures without breaking them.
Of course its cool that with the above definition of `cycle` + fusion, we can suddenly `map` over a cyclic structure without breaking it, but maybe that’s a tad too much magic? Especially as people probably don’t have a good feeling for when this happens and when not? OTOH, I don’t
#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:16 nomeata]: think it hurts either, and nice surprises are – well – nice. Well, I agree that magic can be a problem, but if it falls out of something that tends to be good, that's not a major issue.
Did you investigate why `foldr (+) 0 $ take 30000000 $ map (* 13) $ cycle [1,8,4,0,(5::Int)]` allocates more?
I'm not entirely sure, but it looks to me like it's all about unboxing. Somehow, the Prelude version is able to unbox all the `Int`s, whereas this one is not. If I replace the `(+)` with {{{#!hs {-# NOINLINE g #-} g a b = a }}} then both versions run in constant space. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:16 nomeata]: OK, thinking about this more clearly, I think I see some things I did not realize before, and I am less and less confident that any of this was a good idea. Aside from the magical map, and the pleasantness of the form when fused on both sides, I think this whole thing probably does more harm than good. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Don’t be discouraged! I think this is very much worth it, and if people use `cycle` in list comprehensions (which they likely do), there might be real-world benefit in this. We just need to avoid regressions. I tried to reproduce your findings. I do observe the higher allocation, but the accumulator `Int#` is still unboxed. The extra allocations seem to stem from the `ys1` allocated here: {{{ Rec { go :: [GHC.Types.Int] -> GHC.Prim.Int# -> GHC.Types.Int go = \ (ds :: [GHC.Types.Int]) -> case ds of _ { [] -> Main.main_cyc; : y ys -> let { ys1 :: GHC.Prim.Int# -> GHC.Types.Int ys1 = go ys } in \ (eta :: GHC.Prim.Int#) -> case GHC.Prim.tagToEnum# (GHC.Prim.<=# eta 1) of _ { GHC.Types.False -> case y of _ { GHC.Types.I# x -> case ys1 (GHC.Prim.-# eta 1) of _ { GHC.Types.I# y1 -> GHC.Types.I# (GHC.Prim.+# (GHC.Prim.*# x 13) y1) } }; GHC.Types.True -> case y of _ { GHC.Types.I# x -> GHC.Types.I# (GHC.Prim.*# x 13) } } } Main.main_cyc :: GHC.Prim.Int# -> GHC.Types.Int Main.main_cyc = go lvl9 end Rec } }}} This is an interesting case. It seems to be a shortcoming in the call arity analysis: One might think it should be able to infer that `go` is always called with two arguments (and hence move the `\eta` out of the `let` and `case`). But it (rightfully) doesn’t do that because `main_cyc` is a thunk, and eta-expanding it would duplicate the case analysis done by `case ds`. The regular arity analysis also refuses to improve that code, for a similar reason: It considers to move the `\eta` up, but it would escape the `let ys1`, and the arity analysis is unable to determine if that would be expensive or not. Interesting case! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Staring at this case a bit more, I conclude that we will never get good code from fusing a higher order foldr (like `foldl` or `take` or anything with an accumulating parameter) with a cyclic producer. It will sucessfully tie the know, but a knot of type `Int# -> Int` (and we can see this happening here). So it will create 5 function closures of that type that call each other in a round-robin style; the pattern match on the list elements happens only once. And I as there is no way to pull the accumulating argument into the knot (after all, it changes while going round the circle), this can hardly be avoided. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: closed Priority: normal | Milestone: 7.8.4 Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: invalid | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * status: new => closed * resolution: => invalid Comment: Now that we've discussed this to death, I think it's time to close it up as a lesson learned. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9398: Data.List.cycle is not a good producer -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: bug | Status: closed Priority: normal | Milestone: Component: | Version: 7.8.3 libraries/base | Keywords: Resolution: invalid | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by thoughtpolice): * milestone: 7.8.4 => -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9398#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC