[GHC] #12364: Demand analysis for sum types

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- While working on #10613 it crossed my mind that it might be worthwhile to expand the demand type also onto sum types. So instead of {{{ | UProd [ArgUse] -- Product }}} we would have {{{ | UData [[ArgUse]] -- Data type }}} and a function like {{{ fromMaybe :: a -> Maybe a -> b }}} would have a signature of {{{ <1*U><1*U(;1*U)> }}} which indicates that the second argument of `fromMaybe` is evaluated at most once; the first constructor of the result has no arguments, the second (separated by `;`) has one argument which is also used at most once. I could imagine that this gives a lot of useful information with parsers and other code that repeatedly retuns stuff wrapped in a `Maybe` or similar type. Now, sum types are often recursive (`[]`…), and we probably want to be smarter about recursion here. But note that this is not a new problem. If you write {{{ data Stream = Stream Int Stream Stream Stream foo (Stream 0 x y z) = 0 foo (Stream 1 x y z) = foo x foo (Stream 2 x y z) = foo y foo (Stream _ x y z) = foo z }}} you already get huge demand signatures. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): Or maybe simply `UData [ArgUse]` and then have a flat list of all the constructor’s arguments. Implementation detail. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * cc: osa1 (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): Work in progress on branch wip/T12354, just a quick experiment that might be able to tell us if this is worth pursuing. But it is possibly less useful than I thought: {{{ let foo a b = ... either return Left foobar .... or return Right foobaz in ... case foo a b of Left x [Demand=1*U] -> ... ; Right y [Demand=1*U] -> }}} Because the demand analyzer is a backwards analysis, the information on the usage of `foo` does not propagate into the definition of `foo` and into `foobar` and `foobaz`. For functions returning a product, a product demand is simply assumed `[Product demands for function body]` (but even there, not with single-use information). So I doubt that this is a direction worth pursuing. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Expanding demand analysis for sums will get info for {{{ f :: Maybe Int -> Int f Nothing = 3 f (Just x) = x + 1 }}} Currently if a call site looks like `f (Just (p+q))` we'll build a thunk for `p+q`. But `f` always evaluates it, so that thunk is useless; we could use call-by-value. When we have unboxed sums, we can do worker/wrapper for `f` too. In contrast, you seem to be thinking about the benefits for nested CPR perhaps, in comment:3 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata):
Currently if a call site looks like f (Just (p+q)) we'll build a thunk for p+q. But f always evaluates it, so that thunk is useless; we could use call-by-value.
Actually inlining would require strictness analysis, wouldn’t it? Extending that to sums might work, but I was looking into demand analysis right now. There, we would be able to turn the thunk into a non-updateable one. I’m sure I was not confusing it with CPR, but I guess in comment:3 I was thinking about let-up vs. let-down, and that some examples would look better if the analysis would work the other way. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): osa1 mentioned that they actually have a working implementation of this in https://github.com/osa1/ghc/blob/rebase-second- try/compiler/basicTypes/Demand.hs#L188, so I’ll stop working on my copy. My code is still in wip/T12364, but probably not much use. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Yes, we implemented it with @jmct and it worked fine in our tests. Only problem is: We don't have a recursive type check, so for lists etc. it takes really long time (because it iterates 20 times as that's the hard- coded hard-limit for demand analysis iterations, specified in DmdAnal.hs). We also implemented CPR if you need that... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Our branch is big because we do worker/wrapper for sums, CPR, unpacking sum fields in DataCons etc. If you only need demand analysis I suggest just copying Demand.hs and DmdAnal.hs. You may then need to fix worker/wrapper pass so that it won't crash when it sees a sum demand or cpr etc. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jmct): Hey everyone, I hope it's okay if I chime in for a second. I was also originally writting this comment on #12368 but I think it's a bit more relevant here. Because of what Simon says on ticket:12368#comment:1, fixpoint convergence should not be used to perform strictness analysis on recursive sum types. I could probably dig up some examples of where it produces incorrect results if you'd like. The standard way of handling recursive types is to ensure they're regular types, which allows you to assume a 'uniform' demand on the type. The frustrating thing when trying to apply this to Haskell is polymorphic recursion which would allow for non-uniform demands on a recursive type. For the unboxing of sum-types we wanted to get around this by checking whether we were analysing a recursive sum type, but that turns out to be difficult in GHC at the moment: https://mail.haskell.org/pipermail/ghc- devs/2016-March/011526.html However, that only works for us because we only want to unbox non- recursive things anyway. If you definitely want to analyse recursive types then some new theory is going to have to be worked out. The unexplored parts are strictness on nested types and the higher-kinded polymorphism possibly allowing for introduction of loops. I've thought a bit about how to do it but haven't had a serious go at it. During my thesis defense Prof. Mycoft suggested the problem might be solvable using a PER-based analysis, which has been shown to generalize projection-based analyses. I haven't looked at that approach yet, but I did scribble down a particular thesis he told me to read. For some background, Hinze's thesis is (IMO) the best this topic. I've read and re-read his thesis several times in the past few years and I still get new insight from it each time. You can find it on his site here: http://www.cs.ox.ac.uk/ralf.hinze/publications/#D2 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata):
I could probably dig up some examples of where it produces incorrect results if you'd like.
Isn’t it like that similar example will work for recursive product types, and hence be (likely obscure and rare, but still real) bugs in the current code? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jmct): Possibly, but GHC's much better about detecting recursiveness for product types, see the large section titled "Deciding which type constructors are recursive" in compiler/typecheck/TcTyDecls.hs. In particular there's this section: {{{ The "recursive" flag for algebraic data types is irrelevant (never consulted) for types with more than one constructor. An algebraic data type M.T is "recursive" iff it has just one constructor, and (a) it is declared in an hi-boot file (see RdrHsSyn.hsIfaceDecl) (b) it is declared in a source file, but that source file has a companion hi-boot file which declares the type or (c) one can get from its arg types to T via type synonyms, or by non-recursive newtypes or non-recursive product types in M e.g. data T = MkT (T -> Int) Bool }}} I'm going to speculate wildly and say that it just hasn't been that important for GHC to detect recursiveness in sum types in the past, and therefore it just hasn't gotten very much attention. There probably shouldn't be a difference in theory, but GHC as it is today is better at avoiding recursiveness in product types and therefore would be better at avoiding the issue in that case. Though you're right in principle. Tomorrow morning I'll dig up my old notes and comment on #12368 giving an example. I don't want to confuse the two issues too much, though they're intimately related. #12368 is really about recursion in the _functions_ though, which is an important distinction (and I definitely have examples where the current behavior of the 'cunning plan' is necessary). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by nomeata): I was about to say “But this flag is not checked anywhere in the demand analyzer” (because it is) and did some git archeology, and I found commit changeset:3e7e5ba8333d318c38b4cfc538a97fdca0aed5b1/ghc which replaced the use of `isRecursiveTyCon` with one using `RecTcChecker`. From my rough reading is that we look one level deep ''on every iteration'', so we still get deeply nested result. I was able to produce an unsound result this way, but it belongs to #12368, so I added it there as a comment. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12364: Demand analysis for sum types -------------------------------------+------------------------------------- Reporter: nomeata | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: | DemandAnalysis CPRAnalysis Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by sgraf): * keywords: => DemandAnalysis CPRAnalysis -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12364#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC