[GHC] #8598: IO hack in demand analyzer gets in the way of CPR

#8598: IO hack in demand analyzer gets in the way of CPR
------------------------------------+-------------------------------------
Reporter: nomeata | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Difficulty: Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: |
------------------------------------+-------------------------------------
After a lot of staring at code and comparing unexpected nofib results I
found the following:
The IO hack in the demand analyzer (see `dmdAnalAlt` in `StrAnal.lhs` and
#1592 for a good discussion) prevents CPR in any function that uses a C
call. This is a small example, reduced from the `scaleFloat` method for
doubles:
{{{
module Float(fun) where
import GHC.Float (Double(..))
import GHC.Integer (decodeDoubleInteger, encodeDoubleInteger)
fun :: Double -> Double
fun x | isFix = x
| otherwise = case x of
(D# x#) -> case decodeDoubleInteger x# of
(# i, j #) -> D# (encodeDoubleInteger i j)
where
isFix = isDoubleFinite x == 0
foreign import ccall unsafe "isDoubleFinite" isDoubleFinite :: Double ->
Int
}}}
Here, `fun` does current not get the CPR property, and the work gets type
`GHC.Prim.Double# -> GHC.Types.Double`. Why? Because in core, there will
be a
{{{
case {__pkg_ccall main isDoubleFinite GHC.Prim.Double#
-> GHC.Prim.State# GHC.Prim.RealWorld
-> (# GHC.Prim.State# GHC.Prim.RealWorld,
GHC.Prim.Int# #)}_dQD
ds_dQA GHC.Prim.realWorld#
of _ [Occ=Dead, Dmd=] #) ->
...
}}}
where the case body has `DmdType m {dQz->

#8598: IO hack in demand analyzer gets in the way of CPR
-------------------------------------+------------------------------------
Reporter: nomeata | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by nomeata):
A shower later I believe I have a quite decent solution:
Currently, the CPR lattice has these elements:
{{{
NoCPR
/ \
ReProd RetSum
\ /
BotCPR
}}}
where `BotCPR` not only tells us that there is not going to be a result,
but also that the function is ''diverging'' (and hence puts a hyperstrict
demand on free variables). For the hypothesised „clean exit of the IO
function“ we need something in between:
{{{
NoCPR
/ \
ReProd RetSum
\ /
ExitCPR
|
BotCPR
}}}
`ExitCPR` behaves like `BotCPR` when with regard to `lub`ing it with
another result (i.e. whatever the other things knows about the result
stays there), but has a `

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Mail from SPJ: Well spotted. I'm on a train, hence email response. Maybe you can paste this into the ticket? There are two different issues here. '''First''', `isDoubleFinite` is declared as non-side-effecting: {{{ foreign import ccall unsafe "isDoubleFinite" isDoubleFinite :: Double -> Int }}} But (as you can see from the code you give) we currently desugar it into something that looks (to Core) as though it might have a side effect, or raise a (synchronous) exception. That is stupid. How might we fix that? I can think of two ways. * Generate a `FCallId` whose type is `Double -> Int` rather than (as now) `Double -> IO Int`. There would be a few knock-on consequences to make sure they were correctly code-generated. I like this path best, because it reflects the truth. * Currently {{{ IO a = State# RealWorld# -> (# State# RealWorld#, a #) }}} For these non-side-effecting things we could instead generate a `FCallId` with a type involving `SafeIO` instead of `IO`: {{{ SafeIO a = State# SafeWorld# -> (# State# SafeWorld#, a #) }}} The different "world token" would express the idea that the function can't throw an exception. I don't like this as much, but it might in any case be useful for things that ''do'' have side effects but ''don't'' throw exceptions. My preference is for the first. '''Second''', as you point out, consider {{{ f x = do { when (x>3) exit ; return (True, False) } }}} Function `f` might throw an exception or exit rather than returning, but ''if it does return'' it will certainly have the CPR property. So yes, CPR-ness is quite safe. It's not quite so obvious for divergence: {{{ g x = do { when (x>3) exit ; g x } h 0 y = y h x y = g x }}} Is `h` strict in `y`? You might say (reasonably) that we can ignore the possible IO exception/exit in `g` when figuring out that `g` is sure to diverge. If so, we'd say that `h` is strict in `y`. But if `(x>3)` then really `y` is not evaluated... and spotting that is exactly what the IO hack in the demand analyser is spotting. So I think it is ''not'' safe to propagate divergence information. In short, CPR info yes, guaranteed-divergence no. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata):
In short, CPR info yes, guaranteed-divergence no.
Which is precisely what we obtain from adding `ExitCPR` to the lattice, as {{{ExitCPR `lub` BotCPR = ExitCPR}}}. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Nofib believes that this is a good change (runtime not reliable, as I ran other stuff in parallel): {{{ -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- fft2 -0.0% -0.1% 0.10 0.10 +0.0% mandel -0.0% -9.2% 0.10 0.10 +0.0% nucleic2 -0.0% -0.3% 0.10 0.10 +0.0% -------------------------------------------------------------------------------- Min -0.1% -9.2% -3.4% -4.1% +0.0% Max -0.0% +0.0% +18.5% +19.3% +0.0% Geometric Mean -0.0% -0.1% +1.7% +1.6% -0.0% }}} I’ll push it to branch `wip/T8598` for review. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Also note that the current IO hack is very erratic with regard to IO inside `unsafePerformIO` (including non-IO FFI calls): It only zaps strictness demands if it actually sees the `IO` type. If, in the example above, `isDoubleFinite` were not inlined, but a separate function of type `Double -> IO Bool`, it would behave different. Very unsatisfying. This proposal should eliminate the need for special-casing in `StrAnal` completely, and make the analysis more precise. We could elaborate the `DmdResult` lattice some more (and I include the `Converges` from the nested-cpr-branch here, to show how that goes together). Basically we want the `DmdResult` to keep track of: * whether function may or will diverge. * whether the function may or will exit cleanly (this is new) * if it returns, what is the result. So a first approximation of that would be {{{ #!haskell data CPR = NoCRP | RetCon Int [DmdResult] data TriState = Yes | No | Maybe -- Maybe is top data DmdResult = DmdResult { diverges :: TriState, exits :: TriState, cpr :: CPR } -- Product lattice }}} which is easy to understand and handle, but it has some invalid states, i.e. if it definitely exits or diverges, we do not really want a `CPR` field, and also not both `diverges` and `exists` should be `True` at the same time. A format that captures it more precisely would be {{{ #!haskell data DmdResult = Diverges | Exits | DmdResult { mayExit :: True, mayDiverge :: True, cpr :: CPR} }}} which neatly expresses definite convergence with `DmdResult False False cpr`. I like that. So the idea is now that primitive operations have their `DmdResult` annotation manually (and I guess most of them do not exit). FFI calls either get a conservative default of `DmdResult True True NoCPR` for things in the `IO` monad and `DmdResult False True NoCPR` for the others. Then in the demand analyser no special handling of IO is required any more. Instead, in `bothDmdType`, one has to handle the `mayExit` flag of the right argument and possibly zap strictness, just like we do already for diverging things. And even `IO`-infested code will, as long as it does not call anything with `mayExit = True` (e.g. numerical calculation, working with arrays and `IORef`s), will get good strictness demands inferred. How does that sound? Or is it overkill adding this to the demand types, given that (hopefully) almost everything will have `mayExit = False`? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Replying to [comment:4 nomeata]:
I’ll push it to branch `wip/T8598` for review.
The branch validates. Should I push this, independent of the extended approach suggested later? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR
-------------------------------------+------------------------------------
Reporter: nomeata | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by Joachim Breitner

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: T8598 | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Changes (by nomeata): * testcase: => T8598 Comment: My extended approach suggested above has issues. For example, what should be the `DmdResult` of a function parameter be? Lets consider {{{f x y = x `seq` y `seq` ...}}} and look at the `DmdResult` of `x`. Theoretically, we’d want to be conservative, and hence `DmdResult { mayExit = True, mayDiverge = True, cpr = NoCPR }`. But that would loose far too much strictness information, in particular we’d be lazy in `y`, which is undesirable. OTOH, if we assume `DmdResult { mayExit = False, mayDiverge = True, cpr = NoCPR }` for `x` it would be wrong if we pass something that can cleanly exit to `f`. `unsafePerformIO exitSuccess` would be such a thing – but we do not really care about that, right? If we take the stance that clean exists should only really be happening when evaluating `IO` (and any other kind exiting is allowed to be reordered with other evaluations), then a still correct, but more precise analysis that we have right now is to use the elaborated lattice described in comment:5 and move the ''IO hack'' from case expressions to arguments: Use `DmdResult { mayExit = True, mayDiverge = True, cpr = NoCPR }` for arguments of `IO` type, and `DmdResult { mayExit = False, mayDiverge = True, cpr = NoCPR }` for all others. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------ Reporter: nomeata | Owner: Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: T8598 | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by nomeata): Simplified my fix in [http://git.haskell.org/ghc.git/shortlog/refs/heads/wip/T8598 `wip/T8598`] (did not add a new element to the lattice, rather added a simple unary function to defer a !DmdType when it occurs after an IO action). Code is currently validating, and will be pushed afterwards. This branch also contains a patch “Rename topDmdType to nopDmdType”:
because topDmdType is ''not'' the top of the lattice, as it puts an implicit absent demand on free variables, but Abs is the bottom of the Usage lattice.
Why nopDmdType? Becuase it is the demand of doing nothing: Everything lazy, everything absent, no definite divergence.
This ticket stays open to track the '''first''' issue SPJ mentioned in comment:2. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR
-------------------------------------+------------------------------------
Reporter: nomeata | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: T8598 | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by Joachim Breitner

#8598: IO hack in demand analyzer gets in the way of CPR
-------------------------------------+------------------------------------
Reporter: nomeata | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: T8598 | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Comment (by Joachim Breitner

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------- Reporter: nomeata | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: T8598 Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by akio): I have come across an example where adding a new item to the lattice is actually useful: {{{#!hs {-# LANGUAGE MagicHash #-} {-# LANGUAGE UnboxedTuples #-} module Float(fun) where import GHC.Exts import GHC.Integer (decodeDoubleInteger, encodeDoubleInteger) fun :: Double -> Double fun x@(D# x#) | x == 0 = 0 | otherwise = case isDoubleFinite x of I# y# -> case decodeDoubleInteger x# of (# i, j #) -> fun (D# (encodeDoubleInteger i (y# +# j))) foreign import ccall unsafe "isDoubleFinite" isDoubleFinite :: Double -> Int }}} Here `fun` is a tail-recursive function, and I think it can be CPRed just fine. However the current GHC fails to give `fun` a CPR property because it removes too much information in `deferAfterIO`. Specifically: 1. In the first iteration of the fixpoint calculation, the recursive to call to `fun` gets the CPR result of `Diverges`. 2. However, the case expression that scrutinizes `isDoubleFinite x` causes a call to `deferAfterIO`, which turns the CPR result to `Dunno NoCPR`. 3. This in turn means the whole function doesn't get any useful CPR property. If we had `ExitCPR` (or `Exits` to match the current naming convention) in the lattice, then the step (2) would have returned `Exits` instead of `Dunno NoCPR`, and allowed `fun` to get a useful CPR property. I don't know how often this comes up in practice. With nested CPR, this will become a much bigger problem because it essentially means no tail- recursive IO function can get a useful CPR property. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------- Reporter: nomeata | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: T8598 Blocked By: | Blocking: Related Tickets: #1600 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by akio): * cc: akio (added) * related: => #1600 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------- Reporter: nomeata | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: T8598 Blocked By: | Blocking: Related Tickets: #1600 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): For pure ccalls like this one we have no business messing with IO at all. I'd love to implement the first bullet of comment:2. What you say may also be true of tail-recursive functions that ''do'' perform IO, but there would be fewer cases where it mattered. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8598: IO hack in demand analyzer gets in the way of CPR -------------------------------------+------------------------------------- Reporter: nomeata | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.6.3 Resolution: | Keywords: CPRAnalysis Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: T8598 Blocked By: | Blocking: Related Tickets: #1600 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => CPRAnalysis -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8598#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC