[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 4 commits: Refactor eta-expansion in Prep

31 Mar 2026

Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
e9623c94 by Simon Peyton Jones at 2026-03-31T21:01:01+01:00
Refactor eta-expansion in Prep

The Prep pass does eta-expansion but I found cases where it was
doing bad things.  So I refactored and simplified it quite a bit.

In the new design
* There is no distinction between `rhs` and `body`; in particular,
  lambdas can now appear anywhere, rather than just as the RHS of
  a let-binding.

* This change led to a significant simplification of Prep, and
  a more straightforward explanation of eta-expansion. See the new
     Note [Eta expansion]

* The consequences is that CoreToStg needs to handle naked lambdas.
  This is very easy; but it does need a unique supply, which forces
  some simple refactoring.  Having a unique supply to hand is probably
  a good thing anyway.

- - - - -
3a33983b by Simon Peyton Jones at 2026-03-31T21:01:02+01:00
Clarify Note [Interesting dictionary arguments]

Ticket #26831 ended up concluding that the code for
GHC.Core.Opt.Specialise.interestingDict was good, but the
commments were a bit inadequate.

This commit improves the comments slightly.

- - - - -
690acec9 by Simon Peyton Jones at 2026-03-31T21:01:02+01:00
Make inlining a bit more eager for overloaded functions

If we have
   f d = ... (class-op d x y) ...
we should be eager to inline `f`, because that may change the
higher order call (class-op d x y) into a call to a statically
known function.

See the discussion on #26831.

Even though this does a bit /more/ inlining, compile times
decrease by an average of 0.4%.

Compile time changes:

 DsIncompleteRecSel3(normal)     431,786,104  -2.2%
    ManyAlternatives(normal)     670,883,768  -1.6%
    ManyConstructors(normal)   3,758,493,832  -2.6% GOOD
MultilineStringsPerf(normal)      29,900,576  -2.8%
            T14052Type(ghci)   1,047,600,848  -1.2%
              T17836(normal)     392,852,328  -5.2%
              T18478(normal)     442,785,768  -1.4%
             T21839c(normal)     341,536,992 -14.1% GOOD
               T3064(normal)     174,086,152  +5.3%  BAD
               T5631(normal)     506,867,800  +1.0%
      hard_hole_fits(normal)     209,530,736  -1.3%
 info_table_map_perf(normal)  19,523,093,184  -1.2%
          parsing001(normal)     377,810,528  -1.1%
          pmcOrPats(normal)       60,075,264  -0.5%

                                 geo. mean    -0.4%
                                 minimum     -14.1%
                                 maximum      +5.3%

Runtime changes
    haddock.Cabal(normal)   27,351,988,792  -0.7%
     haddock.base(normal)   26,997,212,560  -0.6%
 haddock.compiler(normal)  219,531,332,960  -1.0%

Metric Decrease:
    ManyConstructors
    T17949
    T21839c
    T13035
    TcPlugin_RewritePerf
    hard_hole_fits

Metric Increase:
    T3064

- - - - -
c7ac6d41 by Simon Jakobi at 2026-03-31T19:20:43-04:00
Add perf test for #13960

Closes #13960.

- - - - -


20 changed files:

- compiler/GHC/Builtin/PrimOps.hs
- compiler/GHC/Core/Opt/Arity.hs
- compiler/GHC/Core/Opt/Specialise.hs
- compiler/GHC/Core/Tidy.hs
- compiler/GHC/Core/Unfold.hs
- compiler/GHC/CoreToStg.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Stg/Lint.hs
- compiler/GHC/Types/Id.hs
- compiler/GHC/Types/Id/Info.hs
- testsuite/tests/arityanal/should_compile/Arity01.stderr
- testsuite/tests/arityanal/should_compile/Arity05.stderr
- testsuite/tests/arityanal/should_compile/Arity08.stderr
- testsuite/tests/arityanal/should_compile/Arity11.stderr
- testsuite/tests/arityanal/should_compile/Arity14.stderr
- + testsuite/tests/perf/compiler/T13960.hs
- testsuite/tests/perf/compiler/all.T
- testsuite/tests/simplCore/should_compile/T15205.stderr
- testsuite/tests/wasm/should_run/control-flow/LoadCmmGroup.hs


Changes:

=====================================
compiler/GHC/Builtin/PrimOps.hs
=====================================
@@ -807,16 +807,23 @@ the former has an additional type binder. Hmmm....
 
 Note [Eta expanding primops]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
 STG requires that primop applications be saturated. This makes code generation
 significantly simpler since otherwise we would need to define a calling
 convention for curried applications that can accommodate representation
 polymorphism.
 
-To ensure saturation, CorePrep eta expands all primop applications as
-described in Note [Eta expansion of hasNoBinding things in CorePrep] in
+To ensure saturation, CorePrep eta expands all primop applications
+as described in Note [Eta expansion of unsaturated calls] in
 GHC.Core.Prep.
 
+Side note: this decision is somewhat in flux: see comments with `hasNoBinding`.
+The question is: do we generate a trivial wrapper for each primop
+   (+#) x y = (+#) x y
+and now we can call that wrapper unsaturated.  But in practice we
+might never call it because in practice Prep eta-expands all partial
+applications!
+
+
 Historical Note:
 
 For a short period around GHC 8.8 we rewrote unsaturated primop applications to


=====================================
compiler/GHC/Core/Opt/Arity.hs
=====================================
@@ -2551,9 +2551,6 @@ This reduces clutter, sometimes a lot. See Note [Do not eta-expand PAPs]
 in GHC.Core.Opt.Simplify.Utils, where we are careful not to eta-expand
 a PAP.  If eta-expanding is bad, then eta-reducing is good!
 
-Also the code generator likes eta-reduced PAPs; see GHC.CoreToStg.Prep
-Note [No eta reduction needed in rhsToBody].
-
 But note that we don't want to eta-reduce
      \x y.  f <expensive> x y
 to


=====================================
compiler/GHC/Core/Opt/Specialise.hs
=====================================
@@ -3247,9 +3247,14 @@ case we can clearly specialise. But there are wrinkles:
 
 (ID6) The Main Plan says that it's worth specialising if the argument is an application
    of a dictionary contructor.  But what if the dictionary has no methods?  Then we
-   gain nothing by specialising, unless the /superclasses/ are interesting.   A case
-   in point is constraint tuples (% d1, .., dn %); a constraint N-tuple is a class
-   with N superclasses and no methods.
+   gain nothing by specialising, unless the /superclasses/ are interesting.
+
+   So if there are no methods, we recursively call `interestingDict` on the
+   superclasses.  Why recurse? If we have
+         \d1 d2.  f (CTuple d1 d2)
+   If `d1 and `d2` are uninteresting dictionaries, then so is (CTuple d1 d2).
+   (Remember: a constraint tuple is just a class with N superclasses and no methods.)
+   See discussion on #26831.
 
 (ID7) A unary (single-method) class is currently represented by (meth |> co).  We
    will unwrap the cast (see (ID5)) and then want to reply "yes" if the method


=====================================
compiler/GHC/Core/Tidy.hs
=====================================
@@ -165,6 +165,7 @@ computeCbvInfo fun_id rhs
                 map mkMark val_args
 
     cbv_bndr | any isMarkedCbv cbv_marks
+               -- isMarkedCbv: see (CBV2) in Note [CBV Function Ids: overview]
              = cbv_marks `seqList` setIdCbvMarks fun_id cbv_marks
                -- seqList: avoid retaining the original rhs
 
@@ -176,6 +177,7 @@ computeCbvInfo fun_id rhs
     -- We don't set CBV marks on functions which take unboxed tuples or sums as
     -- arguments.  Doing so would require us to compute the result of unarise
     -- here in order to properly determine argument positions at runtime.
+    -- See (CBV1) in Note [CBV Function Ids: overview]
     --
     -- In practice this doesn't matter much. Most "interesting" functions will
     -- get a W/W split which will eliminate unboxed tuple arguments, and unboxed


=====================================
compiler/GHC/Core/Unfold.hs
=====================================
@@ -779,22 +779,28 @@ litSize _other = 0    -- Must match size of nullary constructors
 
 classOpSize :: UnfoldingOpts -> Class -> [Id] -> [CoreExpr] -> ExprSize
 -- See (IA1) in Note [Interesting arguments] in GHC.Core.Opt.Simplify.Utils
-classOpSize opts cls top_args args
-  | isUnaryClass cls
-  = sizeZero   -- See (UCM4) in Note [Unary class magic] in GHC.Core.TyCon
-  | otherwise
-  = case args of
-       []                -> sizeZero
-       (arg1:other_args) -> SizeIs (size other_args) (arg_discount arg1) 0
+classOpSize _opts _cls _top_args []
+  = sizeZero   -- A non-applied classop
+classOpSize opts cls top_args (dict_arg:other_val_args)
+  = SizeIs size (arg_discount dict_arg) 0
   where
-    size other_args = 20 + (10 * length other_args)
+    size | isUnaryClass cls = 0    -- See (UCM4) in Note [Unary class magic] in GHC.Core.TyCon
+         | otherwise        = 20 + (10 * length other_val_args)
 
     -- If the class op is scrutinising a lambda bound dictionary then
     -- give it a discount, to encourage the inlining of this function
-    -- The actual discount is rather arbitrarily chosen
-    arg_discount (Var dict) | dict `elem` top_args
-                   = unitBag (dict, unfoldingDictDiscount opts)
-    arg_discount _ = emptyBag
+    arg_discount (Cast arg _co)                    = arg_discount arg
+    arg_discount (Var dict) | dict `elem` top_args = unitBag (dict, dict_discount)
+    arg_discount _                                 = emptyBag
+
+    -- If we have (class-op d arg1 .. argn) then it's super-good to inline
+    -- to expose `d`; not only can we do the dictionary selection
+    -- (class-op d), but that will likely expose a lambda which we can then
+    -- apply.  In that case (n > 0), we add `unfoldingFunAppDiscount`.
+    -- See the discussion on #26831, esp "Delicate inlining".
+    dict_discount
+      | null other_val_args = unfoldingDictDiscount opts
+      | otherwise           = unfoldingDictDiscount opts + unfoldingFunAppDiscount opts
 
 -- | The size of a function call
 callSize


=====================================
compiler/GHC/CoreToStg.hs
=====================================
@@ -39,6 +39,8 @@ import GHC.Types.Basic  ( Arity, TypeOrConstraint(..) )
 import GHC.Types.Literal
 import GHC.Types.ForeignCall
 import GHC.Types.IPE
+import GHC.Types.Unique.Supply
+import GHC.Types.Unique
 
 import GHC.Unit.Module
 import GHC.Platform        ( Platform )
@@ -49,297 +51,309 @@ import GHC.Utils.Outputable
 import GHC.Utils.Monad
 import GHC.Utils.Misc (HasDebugCallStack)
 import GHC.Utils.Panic
+import GHC.Data.FastString
 
 import Control.Monad (ap)

--- Note [Live vs free]
--- ~~~~~~~~~~~~~~~~~~~
---
--- The two are not the same. Liveness is an operational property rather
--- than a semantic one. A variable is live at a particular execution
--- point if it can be referred to directly again. In particular, a dead
--- variable's stack slot (if it has one):
---
---           - should be stubbed to avoid space leaks, and
---           - may be reused for something else.
---
--- There ought to be a better way to say this. Here are some examples:
---
---         let v = [q] \[x] -> e
---         in
---         ...v...  (but no q's)
---
--- Just after the `in', v is live, but q is dead. If the whole of that
--- let expression was enclosed in a case expression, thus:
---
---         case (let v = [q] \[x] -> e in ...v...) of
---                 alts[...q...]
---
--- (ie `alts' mention `q'), then `q' is live even after the `in'; because
--- we'll return later to the `alts' and need it.
---
--- Let-no-escapes make this a bit more interesting:
---
---         let-no-escape v = [q] \ [x] -> e
---         in
---         ...v...
---
--- Here, `q' is still live at the `in', because `v' is represented not by
--- a closure but by the current stack state.  In other words, if `v' is
--- live then so is `q'. Furthermore, if `e' mentions an enclosing
--- let-no-escaped variable, then its free variables are also live if `v' is.
+{- Note [Live vs free]
+~~~~~~~~~~~~~~~~~~~~~~
+The two are not the same. Liveness is an operational property rather
+than a semantic one. A variable is live at a particular execution
+point if it can be referred to directly again. In particular, a dead
+variable's stack slot (if it has one):
 
--- Note [What are these SRTs all about?]
--- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
---
--- Consider the Core program,
---
---     fibs = go 1 1
---       where go a b = let c = a + c
---                      in c : go b c
---     add x = map (\y -> x*y) fibs
---
--- In this case we have a CAF, 'fibs', which is quite large after evaluation and
--- has only one possible user, 'add'. Consequently, we want to ensure that when
--- all references to 'add' die we can garbage collect any bit of 'fibs' that we
--- have evaluated.
---
--- However, how do we know whether there are any references to 'fibs' still
--- around? Afterall, the only reference to it is buried in the code generated
--- for 'add'. The answer is that we record the CAFs referred to by a definition
--- in its info table, namely a part of it known as the Static Reference Table
--- (SRT).
---
--- Since SRTs are so common, we use a special compact encoding for them in: we
--- produce one table containing a list of CAFs in a module and then include a
--- bitmap in each info table describing which entries of this table the closure
--- references.
---
--- See also: commentary/rts/storage/gc/CAFs on the GHC Wiki.
+          - should be stubbed to avoid space leaks, and
+          - may be reused for something else.
 
--- Note [What is a non-escaping let]
--- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
---
--- NB: Nowadays this is recognized by the occurrence analyser by turning a
--- "non-escaping let" into a join point. The following is then an operational
--- account of join points.
---
--- Consider:
---
---     let x = fvs \ args -> e
---     in
---         if ... then x else
---            if ... then x else ...
---
--- `x' is used twice (so we probably can't unfold it), but when it is
--- entered, the stack is deeper than it was when the definition of `x'
--- happened.  Specifically, if instead of allocating a closure for `x',
--- we saved all `x's fvs on the stack, and remembered the stack depth at
--- that moment, then whenever we enter `x' we can simply set the stack
--- pointer(s) to these remembered (compile-time-fixed) values, and jump
--- to the code for `x'.
---
--- All of this is provided x is:
---   1. non-updatable;
---   2. guaranteed to be entered before the stack retreats -- ie x is not
---      buried in a heap-allocated closure, or passed as an argument to
---      something;
---   3. all the enters have exactly the right number of arguments,
---      no more no less;
---   4. all the enters are tail calls; that is, they return to the
---      caller enclosing the definition of `x'.
---
--- Under these circumstances we say that `x' is non-escaping.
---
--- An example of when (4) does not hold:
---
---     let x = ...
---     in case x of ...alts...
---
--- Here, `x' is certainly entered only when the stack is deeper than when
--- `x' is defined, but here it must return to ...alts... So we can't just
--- adjust the stack down to `x''s recalled points, because that would lost
--- alts' context.
---
--- Things can get a little more complicated.  Consider:
---
---     let y = ...
---     in let x = fvs \ args -> ...y...
---     in ...x...
---
--- Now, if `x' is used in a non-escaping way in ...x..., and `y' is used in a
--- non-escaping way in ...y..., then `y' is non-escaping.
---
--- `x' can even be recursive!  Eg:
---
---     letrec x = [y] \ [v] -> if v then x True else ...
---     in
---         ...(x b)...
+There ought to be a better way to say this. Here are some examples:
 
--- Note [Cost-centre initialization plan]
--- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
---
--- Previously `coreToStg` was initializing cost-centre stack fields as `noCCS`,
--- and the fields were then fixed by a separate pass `stgMassageForProfiling`.
--- We now initialize these correctly. The initialization works like this:
---
---   - For non-top level bindings always use `currentCCS`.
---
---   - For top-level bindings, check if the binding is a CAF
---
---     - CAF:      If -fcaf-all is enabled, create a new CAF just for this CAF
---                 and use it. Note that these new cost centres need to be
---                 collected to be able to generate cost centre initialization
---                 code, so `coreToTopStgRhs` now returns `CollectedCCs`.
---
---                 If -fcaf-all is not enabled, use "all CAFs" cost centre.
---
---     - Non-CAF:  Top-level (static) data is not counted in heap profiles; nor
---                 do we set CCCS from it; so we just slam in
---                 dontCareCostCentre.
-
--- Note [Coercion tokens]
--- ~~~~~~~~~~~~~~~~~~~~~~
--- In coreToStgArgs, we drop type arguments completely, but we replace
--- coercions with a special coercionToken# placeholder. Why? Consider:
---
---   f :: forall a. Int ~# Bool -> a
---   f = /\a. \(co :: Int ~# Bool) -> error "impossible"
---
--- If we erased the coercion argument completely, we’d end up with just
--- f = error "impossible", but then f `seq` () would be ⊥!
---
--- This is an artificial example, but back in the day we *did* treat
--- coercion lambdas like type lambdas, and we had bug reports as a
--- result. So now we treat coercion lambdas like value lambdas, but we
--- treat coercions themselves as zero-width arguments — coercionToken#
--- has representation VoidRep — which gets the best of both worlds.
---
--- (For the gory details, see also the (unpublished) paper, “Practical
--- aspects of evidence-based compilation in System FC.”)
+        let v = [q] \[x] -> e
+        in
+        ...v...  (but no q's)
 
--- Note [Saturation of data constructors in STG]
--- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--- We guarantee that `StgConApp` is an exactly-saturated application of a data
--- constructor worker.
---
--- * If the data constructor is /under/-saturated we just fall through to build
---   a `StgApp`.  Remember, data constructor workers have a regular top-level definition
---   (injected by GHC.CoreToStg.Prep.mkDataConWorkers) so we can partially apply
---   that function.
---
--- * If the data constructor is /over/-saturated, which can happen (see #23865) we again
---   fall through to `StgApp`.  That will fail horribly at runtime (by applying data
---   constructor to an argument) but it should be in dead code, and at least the compiler
---   itself won't crash.  (We could inject an error-thunk instead.)
+Just after the `in', v is live, but q is dead. If the whole of that
+let expression was enclosed in a case expression, thus:
+
+        case (let v = [q] \[x] -> e in ...v...) of
+                alts[...q...]
+
+(ie `alts' mention `q'), then `q' is live even after the `in'; because
+we'll return later to the `alts' and need it.
+
+Let-no-escapes make this a bit more interesting:
+
+        let-no-escape v = [q] \ [x] -> e
+        in
+        ...v...
+
+Here, `q' is still live at the `in', because `v' is represented not by
+a closure but by the current stack state.  In other words, if `v' is
+live then so is `q'. Furthermore, if `e' mentions an enclosing
+let-no-escaped variable, then its free variables are also live if `v' is.
+
+Note [What are these SRTs all about?]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider the Core program,
+
+    fibs = go 1 1
+      where go a b = let c = a + c
+                     in c : go b c
+    add x = map (\y -> x*y) fibs
+
+In this case we have a CAF, 'fibs', which is quite large after evaluation and
+has only one possible user, 'add'. Consequently, we want to ensure that when
+all references to 'add' die we can garbage collect any bit of 'fibs' that we
+have evaluated.
+
+However, how do we know whether there are any references to 'fibs' still
+around? Afterall, the only reference to it is buried in the code generated
+for 'add'. The answer is that we record the CAFs referred to by a definition
+in its info table, namely a part of it known as the Static Reference Table
+(SRT).
 
+Since SRTs are so common, we use a special compact encoding for them in: we
+produce one table containing a list of CAFs in a module and then include a
+bitmap in each info table describing which entries of this table the closure
+references.
+
+See also: commentary/rts/storage/gc/CAFs on the GHC Wiki.
+
+Note [What is a non-escaping let]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+NB: Nowadays this is recognized by the occurrence analyser by turning a
+"non-escaping let" into a join point. The following is then an operational
+account of join points.
+
+Consider:
+
+    let x = fvs \ args -> e
+    in
+        if ... then x else
+           if ... then x else ...
+
+`x' is used twice (so we probably can't unfold it), but when it is
+entered, the stack is deeper than it was when the definition of `x'
+happened.  Specifically, if instead of allocating a closure for `x',
+we saved all `x's fvs on the stack, and remembered the stack depth at
+that moment, then whenever we enter `x' we can simply set the stack
+pointer(s) to these remembered (compile-time-fixed) values, and jump
+to the code for `x'.
+
+All of this is provided x is:
+  1. non-updatable;
+  2. guaranteed to be entered before the stack retreats -- ie x is not
+     buried in a heap-allocated closure, or passed as an argument to
+     something;
+  3. all the enters have exactly the right number of arguments,
+     no more no less;
+  4. all the enters are tail calls; that is, they return to the
+     caller enclosing the definition of `x'.
+
+Under these circumstances we say that `x' is non-escaping.
+
+An example of when (4) does not hold:
+
+    let x = ...
+    in case x of ...alts...
+
+Here, `x' is certainly entered only when the stack is deeper than when
+`x' is defined, but here it must return to ...alts... So we can't just
+adjust the stack down to `x''s recalled points, because that would lost
+alts' context.
+
+Things can get a little more complicated.  Consider:
+
+    let y = ...
+    in let x = fvs \ args -> ...y...
+    in ...x...
+
+Now, if `x' is used in a non-escaping way in ...x..., and `y' is used in a
+non-escaping way in ...y..., then `y' is non-escaping.
+
+`x' can even be recursive!  Eg:
+
+    letrec x = [y] \ [v] -> if v then x True else ...
+    in
+        ...(x b)...
+
+Note [Cost-centre initialization plan]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Previously `coreToStg` was initializing cost-centre stack fields as `noCCS`,
+and the fields were then fixed by a separate pass `stgMassageForProfiling`.
+We now initialize these correctly. The initialization works like this:
+
+  - For non-top level bindings always use `currentCCS`.
+
+  - For top-level bindings, check if the binding is a CAF
+
+    - CAF:      If -fcaf-all is enabled, create a new CAF just for this CAF
+                and use it. Note that these new cost centres need to be
+                collected to be able to generate cost centre initialization
+                code, so `coreToTopStgRhs` now returns `CollectedCCs`.
+
+                If -fcaf-all is not enabled, use "all CAFs" cost centre.
+
+    - Non-CAF:  Top-level (static) data is not counted in heap profiles; nor
+                do we set CCCS from it; so we just slam in
+                dontCareCostCentre.
+
+Note [Coercion tokens]
+~~~~~~~~~~~~~~~~~~~~~~
+In coreToStgArgs, we drop type arguments completely, but we replace
+coercions with a special coercionToken# placeholder. Why? Consider:
+
+  f :: forall a. Int ~# Bool -> a
+  f = /\a. \(co :: Int ~# Bool) -> error "impossible"
+
+If we erased the coercion argument completely, we’d end up with just
+f = error "impossible", but then f `seq` () would be ⊥!
+
+This is an artificial example, but back in the day we *did* treat
+coercion lambdas like type lambdas, and we had bug reports as a
+result. So now we treat coercion lambdas like value lambdas, but we
+treat coercions themselves as zero-width arguments — coercionToken#
+has representation VoidRep — which gets the best of both worlds.
+
+(For the gory details, see also the (unpublished) paper, “Practical
+aspects of evidence-based compilation in System FC.”)
+
+Note [Saturation of data constructors in STG]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We guarantee that `StgConApp` is an exactly-saturated application of a data
+constructor worker.
+
+* If the data constructor is /under/-saturated we just fall through to build
+  a `StgApp`.  Remember, data constructor workers have a regular top-level definition
+  (injected by GHC.CoreToStg.Prep.mkDataConWorkers) so we can partially apply
+  that function.
+
+* If the data constructor is /over/-saturated, which can happen (see #23865) we again
+  fall through to `StgApp`.  That will fail horribly at runtime (by applying data
+  constructor to an argument) but it should be in dead code, and at least the compiler
+  itself won't crash.  (We could inject an error-thunk instead.)
+
+Note [Naked lambdas in coreToStgExpr]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+  f x = case x of
+           True  -> \y. y+x
+           False -> blah
+If `f` is not eta expanded (which would have happened in Prep if it was
+going to happen at all, the code for f must allocate a closure for the
+(\y. y+x).  So the STG code we want has
+
+     True -> let pap = \y. y+x
+             in pap
+
+The Lam case of `coreToStgExpr` deals with adding this `StgLet`. It's the
+main reason we need a unique supply in the monad.
+
+Historical note: in the past, Prep guaranteed there would be no such naked
+lambdas, so we didn't need a unique supply at all. But that proved too hard
+in the end (see Note [Eta expansion and the CorePrep invariants]) so we
+just deal with it here; it's very easy.
+-}
 
 -- --------------------------------------------------------------
 -- Setting variable info: top-level, binds, RHSs
 -- --------------------------------------------------------------
 
 
-coreToStg :: CoreToStgOpts -> Module -> ModLocation -> CoreProgram
-          -> ([StgTopBinding], InfoTableProvMap, CollectedCCs)
-coreToStg opts@CoreToStgOpts
-  { coreToStg_ways = ways
-  , coreToStg_AutoSccsOnIndividualCafs = opt_AutoSccsOnIndividualCafs
-  , coreToStg_InfoTableMap = opt_InfoTableMap
-  , coreToStg_stgDebugOpts = stgDebugOpts
-  } this_mod ml pgm
-  = (pgm'', denv, final_ccs)
+coreToStg :: CoreToStgOpts -> Module -> ModLocation
+          -> CoreProgram
+          -> IO ([StgTopBinding], InfoTableProvMap, CollectedCCs)
+coreToStg opts this_mod ml pgm
+  = do { us <- mkSplitUniqSupply StgTag
+       ; let (_, (local_ccs, local_cc_stacks), pgm')
+                = initCts opts us $
+                  coreTopBindsToStg opts this_mod emptyCollectedCCs pgm
+
+             -- See Note [Mapping Info Tables to Source Positions]
+             (!pgm'', !denv)
+               | opt_InfoTableMap
+               = collectDebugInformation stgDebugOpts ml pgm'
+               | otherwise = (pgm', emptyInfoTableProvMap)
+
+             final_ccs
+               | prof && opt_AutoSccsOnIndividualCafs
+               = (local_ccs,local_cc_stacks)  -- don't need "all CAFs" CC
+               | prof
+               = (all_cafs_cc:local_ccs, all_cafs_ccs:local_cc_stacks)
+               | otherwise
+               = emptyCollectedCCs
+
+      ; return (pgm'', denv, final_ccs) }
   where
-    (_, (local_ccs, local_cc_stacks), pgm')
-      = coreTopBindsToStg opts this_mod emptyVarEnv emptyCollectedCCs pgm
-
-    -- See Note [Mapping Info Tables to Source Positions]
-    (!pgm'', !denv)
-      | opt_InfoTableMap
-      = collectDebugInformation stgDebugOpts ml pgm'
-      | otherwise = (pgm', emptyInfoTableProvMap)
+    CoreToStgOpts { coreToStg_ways = ways
+                  , coreToStg_AutoSccsOnIndividualCafs = opt_AutoSccsOnIndividualCafs
+                  , coreToStg_InfoTableMap = opt_InfoTableMap
+                  , coreToStg_stgDebugOpts = stgDebugOpts }
+       = opts
 
     prof = hasWay ways WayProf
-
-    final_ccs
-      | prof && opt_AutoSccsOnIndividualCafs
-      = (local_ccs,local_cc_stacks)  -- don't need "all CAFs" CC
-      | prof
-      = (all_cafs_cc:local_ccs, all_cafs_ccs:local_cc_stacks)
-      | otherwise
-      = emptyCollectedCCs
-
     (all_cafs_cc, all_cafs_ccs) = getAllCAFsCC this_mod
 
 coreTopBindsToStg
     :: CoreToStgOpts
     -> Module
-    -> IdEnv HowBound           -- environment for the bindings
     -> CollectedCCs
     -> CoreProgram
-    -> (IdEnv HowBound, CollectedCCs, [StgTopBinding])
+    -> CtsM (IdEnv HowBound, CollectedCCs, [StgTopBinding])
+
+coreTopBindsToStg _ _ ccs []
+  = do { env <- getCtsEnv
+       ; return (env, ccs, []) }
 
-coreTopBindsToStg _      _        env ccs []
-  = (env, ccs, [])
-coreTopBindsToStg opts this_mod env ccs (b:bs)
+coreTopBindsToStg opts this_mod ccs (b:bs)
   | NonRec _ rhs <- b, isTyCoArg rhs
-  = coreTopBindsToStg opts this_mod env1 ccs1 bs
+  = coreTopBindsToStg opts this_mod ccs bs
   | otherwise
-  = (env2, ccs2, b':bs')
-  where
-    (env1, ccs1, b' ) = coreTopBindToStg opts this_mod env ccs b
-    (env2, ccs2, bs') = coreTopBindsToStg opts this_mod env1 ccs1 bs
+  = do { (env1, ccs1, b' ) <- coreTopBindToStg opts this_mod ccs b
+       ; (env2, ccs2, bs') <- setCtsEnv env1 $
+                              coreTopBindsToStg opts this_mod ccs1 bs
+      ; return (env2, ccs2, b':bs') }
 
 coreTopBindToStg
         :: CoreToStgOpts
         -> Module
-        -> IdEnv HowBound
         -> CollectedCCs
         -> CoreBind
-        -> (IdEnv HowBound, CollectedCCs, StgTopBinding)
+        -> CtsM (IdEnv HowBound, CollectedCCs, StgTopBinding)
 
-coreTopBindToStg _ _ env ccs (NonRec id e)
+coreTopBindToStg _ _ ccs (NonRec id e)
   | Just str <- exprIsTickedString_maybe e
   -- top-level string literal
   -- See Note [Core top-level string literals] in GHC.Core
-  = let
-        env' = extendVarEnv env id how_bound
-        how_bound = LetBound TopLet 0
-    in (env', ccs, StgTopStringLit id str)
-
-coreTopBindToStg opts@CoreToStgOpts
-  { coreToStg_platform = platform
-  } this_mod env ccs (NonRec id rhs)
-  = let
-        env'      = extendVarEnv env id how_bound
-        how_bound = LetBound TopLet $! manifestArity rhs
-
-        (ccs', (id', stg_rhs)) =
-            initCts platform env $
-              coreToTopStgRhs opts this_mod ccs (id,rhs)
-
-        bind = StgTopLifted $ StgNonRec id' stg_rhs
-    in
-      -- NB: previously the assertion printed 'rhs' and 'bind'
-      --     as well as 'id', but that led to a black hole
-      --     where printing the assertion error tripped the
-      --     assertion again!
-    (env', ccs', bind)
-
-coreTopBindToStg opts@CoreToStgOpts
-  { coreToStg_platform = platform
-  } this_mod env ccs (Rec pairs)
+  = do { env <- getCtsEnv
+       ; let env' = extendVarEnv env id how_bound
+             how_bound = LetBound TopLet 0
+       ; return (env', ccs, StgTopStringLit id str) }
+
+coreTopBindToStg opts this_mod ccs (NonRec id rhs)
+  = do { (ccs', (id', stg_rhs)) <- coreToTopStgRhs opts this_mod ccs (id,rhs)
+
+       ; env <- getCtsEnv
+       ; let env'      = extendVarEnv env id how_bound
+             how_bound = LetBound TopLet $! manifestArity rhs
+             bind      = StgTopLifted $ StgNonRec id' stg_rhs
+       ; return (env', ccs', bind) }
+
+coreTopBindToStg opts this_mod ccs (Rec pairs)
   = assert (not (null pairs)) $
-    let
-        extra_env' = [ (b, LetBound TopLet $! manifestArity rhs)
-                     | (b, rhs) <- pairs ]
-        env' = extendVarEnvList env extra_env'
-
-        -- generate StgTopBindings and CAF cost centres created for CAFs
-        (ccs', stg_rhss)
-          = initCts platform env' $ mapAccumLM (coreToTopStgRhs opts this_mod) ccs pairs
-        bind = StgTopLifted $ StgRec stg_rhss
-    in
-    (env', ccs', bind)
+    do { env <- getCtsEnv
+       ; let extra_env' = [ (b, LetBound TopLet $! manifestArity rhs)
+                          | (b, rhs) <- pairs ]
+             env' = extendVarEnvList env extra_env'
+
+       -- Generate StgTopBindings and CAF cost centres created for CAFs
+       ; (ccs', stg_rhss) <- setCtsEnv env' $
+                             mapAccumLM (coreToTopStgRhs opts this_mod) ccs pairs
+       ; let bind = StgTopLifted $ StgRec stg_rhss
+
+       ; return (env', ccs', bind) }
 
 coreToTopStgRhs
         :: CoreToStgOpts
@@ -420,16 +434,24 @@ coreToStgExpr expr@(App _ _)
       res_ty                  = exprType expr
       (app_head, args, ticks) = myCollectArgs expr res_ty
 
-coreToStgExpr expr@(Lam _ _)
-  = let
-        (args, body) = myCollectBinders expr
-    in
-    case filterStgBinders args of
-
-      [] -> coreToStgExpr body
-
-      _ -> pprPanic "coretoStgExpr" $
-        text "Unexpected value lambda:" $$ ppr expr
+coreToStgExpr expr@(Lam {})
+  | null val_bndrs
+  = coreToStgExpr body
+  | otherwise
+  = -- See Note [Naked lambdas in coreToStgExpr]
+    do { body' <- extendVarEnvCts [ (a, LambdaBound) | a <- val_bndrs ] $
+                  coreToStgExpr body
+       ; uniq <- getCtsUnique
+       ; let body_ty = exprType body
+             fun_ty  = mkLamTypes val_bndrs body_ty
+                       -- This type is a bit ill-formed but it doesn't matter
+             rhs = StgRhsClosure noExtFieldSilent currentCCS
+                                 ReEntrant val_bndrs body' body_ty
+             tmp_fun = mkSysLocal (fsLit "pap") uniq ManyTy fun_ty
+       ; return (StgLet noExtFieldSilent (StgNonRec tmp_fun rhs) $
+                 StgApp tmp_fun []) }
+  where
+    (val_bndrs, body) = myCollectBinders NotJoinPoint expr
 
 coreToStgExpr (Tick tick expr)
   = do
@@ -634,8 +656,13 @@ coreToStgArgs (arg : args) = do         -- Non-type argument
         stg_arg_rep = stgArgRep arg'
         bad_args = not (primRepsCompatible platform arg_rep stg_arg_rep)
 
-    massertPpr (length ticks' <= 1) (text "More than one Tick in trivial arg:" <+> ppr arg)
-    warnPprTraceM bad_args "Dangerous-looking argument. Probable cause: bad unsafeCoerce#" (ppr arg)
+    -- Yikes!  This assert FAILS in tests T13658, T14779b
+    -- It has been so for ages, but without the "() <-" it was lazily dropped
+    -- Hence commenting it out: see #27132
+    --    massertPpr (length ticks' <= 1) (text "More than one Tick in trivial arg:" <+> ppr arg)
+
+    () <- warnPprTraceM bad_args
+            "Dangerous-looking argument. Probable cause: bad unsafeCoerce#" (ppr arg)
 
     return (arg' : stg_args, ticks' ++ ticks)
 
@@ -710,12 +737,11 @@ coreToStgRhs (bndr, rhs) = do
 -- coreToStgExpr that can handle value lambdas.
 coreToMkStgRhs :: HasDebugCallStack => Id -> CoreExpr -> CtsM MkStgRhs
 coreToMkStgRhs bndr expr = do
-  let (args, body) = myCollectBinders expr
-  let args'        = filterStgBinders args
-  extendVarEnvCts [ (a, LambdaBound) | a <- args' ] $ do
+  let (bndrs, body) = myCollectBinders (idJoinPointHood bndr) expr
+  extendVarEnvCts [ (a, LambdaBound) | a <- bndrs ] $ do
     body' <- coreToStgExpr body
     let mk_rhs = MkStgRhs
-          { rhs_args = args'
+          { rhs_args = bndrs
           , rhs_expr = body'
           , rhs_type = exprType body
           , rhs_is_join = isJoinId bndr
@@ -733,7 +759,7 @@ coreToMkStgRhs bndr expr = do
 newtype CtsM a = CtsM
     { unCtsM :: Platform -- Needed for checking for bad coercions in coreToStgArgs
              -> IdEnv HowBound
-             -> a
+             -> UniqSM a
     }
     deriving (Functor)
 
@@ -769,20 +795,22 @@ data LetInfo
 
 -- The std monad functions:
 
-initCts :: Platform -> IdEnv HowBound -> CtsM a -> a
-initCts platform env m = unCtsM m platform env
-
+initCts :: CoreToStgOpts -> UniqSupply -> CtsM a -> a
+initCts opts us cts_m
+  = initUs_ us $
+    unCtsM cts_m (coreToStg_platform opts) emptyVarEnv
 
 
 {-# INLINE thenCts #-}
 {-# INLINE returnCts #-}
 
 returnCts :: a -> CtsM a
-returnCts e = CtsM $ \_ _ -> e
+returnCts e = CtsM $ \_ _ -> return e
 
 thenCts :: CtsM a -> (a -> CtsM b) -> CtsM b
-thenCts m k = CtsM $ \platform env
-  -> unCtsM (k (unCtsM m platform env)) platform env
+thenCts m k = CtsM $ \platform env ->
+              do { v <- unCtsM m platform env
+                 ; unCtsM (k v) platform env }
 
 instance Applicative CtsM where
     pure = returnCts
@@ -792,17 +820,26 @@ instance Monad CtsM where
     (>>=)  = thenCts
 
 getPlatform :: CtsM Platform
-getPlatform = CtsM const
+getPlatform = CtsM $ \platform _ -> return platform
 
 -- Functions specific to this monad:
 
+setCtsEnv :: IdEnv HowBound -> CtsM a -> CtsM a
+setCtsEnv env thing = CtsM $ \platform _ -> unCtsM thing platform env
+
+getCtsEnv :: CtsM (IdEnv HowBound)
+getCtsEnv = CtsM $ \_ env -> return env
+
+getCtsUnique :: CtsM Unique
+getCtsUnique = CtsM $ \_ _ -> getUniqueM
+
 extendVarEnvCts :: [(Id, HowBound)] -> CtsM a -> CtsM a
 extendVarEnvCts ids_w_howbound expr
    =    CtsM $   \platform env
    -> unCtsM expr platform (extendVarEnvList env ids_w_howbound)
 
 lookupVarCts :: Id -> CtsM HowBound
-lookupVarCts v = CtsM $ \_ env -> lookupBinding env v
+lookupVarCts v = CtsM $ \_ env -> return (lookupBinding env v)
 
 lookupBinding :: IdEnv HowBound -> Id -> HowBound
 lookupBinding env v = case lookupVarEnv env v of
@@ -814,13 +851,26 @@ lookupBinding env v = case lookupVarEnv env v of
 filterStgBinders :: [Var] -> [Var]
 filterStgBinders bndrs = filter isId bndrs
 
-myCollectBinders :: Expr Var -> ([Var], Expr Var)
-myCollectBinders expr
+myCollectBinders :: JoinPointHood -> Expr Var -> ([Var], Expr Var)
+-- Collect the binders from a lambda:
+--   * Dropping type lambdas
+--   * Stopping at join-point arity
+myCollectBinders NotJoinPoint expr
   = go [] expr
   where
-    go bs (Lam b e)          = go (b:bs) e
-    go bs (Cast e _)         = go bs e
-    go bs e                  = (reverse bs, e)
+    go bs (Lam b e) | isRuntimeVar b = go (b:bs) e
+                    | otherwise      = go bs     e
+    go bs (Cast e _)                 = go bs e
+    go bs e                          = (reverse bs, e)
+
+myCollectBinders (JoinPoint n) expr
+  = go n [] expr
+  where
+    go n bs e | n==0                   = (reverse bs, e)
+    go n bs (Lam b e) | isRuntimeVar b = go (n-1) (b:bs) e
+                      | otherwise      = go (n-1) bs     e
+    go n bs (Cast e _)                 = go n bs e
+    go _ bs e                          = (reverse bs, e)
 
 -- | If the argument expression is (potential chain of) 'App', return the head
 -- of the app chain, and collect ticks/args along the chain.


=====================================
compiler/GHC/CoreToStg/Prep.hs
=====================================
@@ -144,16 +144,13 @@ Here is the syntax of the Core produced by CorePrep:
 
     Expressions
        body ::= app
-             |  let(rec) x = rhs in body     -- Boxed only
+             |  let(rec) x = body in body     -- Boxed only
              |  case body of pat -> body
-             |  /\a. body | /\c. body
+             |  /\a. body | /\c. body | \x. body
              |  body |> co
 
-    Right hand sides (only place where value lambdas can occur)
-       rhs ::= /\a.rhs  |  \x.rhs  |  body
-
-We define a synonym for each of these non-terminals.  Functions
-with the corresponding name produce a result in that syntax.
+We define a synonym for each of these non-terminals, CpeArg, CpeApp, and
+CpeBody.  Functions with the corresponding name produce a result in that syntax.
 
 Note [Cloning in CorePrep]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -218,7 +215,6 @@ So our plan is:
 type CpeArg  = CoreExpr    -- Non-terminal 'arg'
 type CpeApp  = CoreExpr    -- Non-terminal 'app'
 type CpeBody = CoreExpr    -- Non-terminal 'body'
-type CpeRhs  = CoreExpr    -- Non-terminal 'rhs'
 
 {-
 ************************************************************************
@@ -261,7 +257,7 @@ corePrepExpr logger config expr = do
     withTiming logger (text "CorePrep [expr]") (\e -> e `seq` ()) $ do
       us <- mkSplitUniqSupply StgTag
       let initialCorePrepEnv = mkInitialCorePrepEnv config
-      let new_expr = initUs_ us (cpeBodyNF initialCorePrepEnv expr)
+      let new_expr = initUs_ us (cpeBody initialCorePrepEnv expr)
       putDumpFileMaybe logger Opt_D_dump_prep "CorePrep" FormatCore (ppr new_expr)
       return new_expr
 
@@ -665,16 +661,16 @@ cpeBind top_lvl env (Rec pairs)
 ---------------
 cpePair :: TopLevelFlag -> RecFlag -> Demand -> Levity
         -> CorePrepEnv -> OutId -> CoreExpr
-        -> UniqSM (Floats, CpeRhs)
+        -> UniqSM (Floats, CpeBody)
 -- Used for all bindings
 -- The binder is already cloned, hence an OutId
 cpePair top_lvl is_rec dmd lev env0 bndr rhs
   = assert (isNothing $ joinPointBinding_maybe bndr rhs) $ -- those should use cpeJoinPair
-    do { (floats1, rhs1) <- cpeRhsE env rhs
+    do { (floats1, rhs1) <- cpeBodyF env rhs
 
        -- See if we are allowed to float this stuff out of the RHS
        ; let dec = want_float_from_rhs floats1 rhs1
-       ; (floats2, rhs2) <- executeFloatDecision env dec floats1 rhs1
+             (floats2, rhs2) = executeFloatDecision dec floats1 rhs1
 
        -- Make the arity match up
        ; (floats3, rhs3)
@@ -717,7 +713,7 @@ it seems good for CorePrep to be robust.
 
 ---------------
 cpeJoinPair :: CorePrepEnv -> JoinId -> CoreExpr
-            -> UniqSM (JoinId, CpeRhs)
+            -> UniqSM (JoinId, CpeBody)
 -- Used for all join bindings
 -- No eta-expansion: see Note [Do not eta-expand join points] in GHC.Core.Opt.Simplify.Utils
 cpeJoinPair env bndr rhs
@@ -729,7 +725,7 @@ cpeJoinPair env bndr rhs
 
        ; (env', bndrs') <- cpCloneBndrs env bndrs
 
-       ; body' <- cpeBodyNF env' body -- Will let-bind the body if it starts
+       ; body' <- cpeBody env' body -- Will let-bind the body if it starts
                                       -- with a lambda
 
        ; let rhs'  = mkCoreLams bndrs' body'
@@ -757,10 +753,20 @@ for us to mess with the arity because a join point is never exported.
 -}
 
 -- ---------------------------------------------------------------------------
---              CpeRhs: produces a result satisfying CpeRhs
+--              cpeBodyF: produces a result satisfying CpeBody
 -- ---------------------------------------------------------------------------
 
-cpeRhsE :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeRhs)
+cpeBodyF :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeBody)
+-- | Convert a 'CoreExpr' so it satisfies 'CpeBody'; also produce
+-- a list of 'Floats' which are being propagated upwards.  In
+-- fact, this function is used in only two cases: to
+-- implement 'cpeBody' (which is what you usually want),
+-- and in the case when a let-binding is in a case scrutinee--here,
+-- we can always float out:
+--
+--      case (let x = y in z) of ...
+--      ==> let x = y in case z of ...
+--
 -- If
 --      e  ===>  (bs, e')
 -- then
@@ -769,32 +775,32 @@ cpeRhsE :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeRhs)
 -- For example
 --      f (g x)   ===>   ([v = g x], f v)
 
-cpeRhsE env (Type ty)
+cpeBodyF env (Type ty)
   = return (emptyFloats, Type (cpSubstTy env ty))
-cpeRhsE env (Coercion co)
+cpeBodyF env (Coercion co)
   = return (emptyFloats, Coercion (cpSubstCo env co))
-cpeRhsE env expr@(Lit lit)
+cpeBodyF env expr@(Lit lit)
   | LitNumber LitNumBigNat i <- lit
     = cpeBigNatLit env i
   | otherwise = return (emptyFloats, expr)
-cpeRhsE env expr@(Var {})  = cpeApp env expr
-cpeRhsE env expr@(App {})  = cpeApp env expr
+cpeBodyF env expr@(Var {})  = cpeApp env expr
+cpeBodyF env expr@(App {})  = cpeApp env expr
 
-cpeRhsE env (Let bind body)
+cpeBodyF env (Let bind body)
   = do { (env', bind_floats, maybe_bind') <- cpeBind NotTopLevel env bind
-       ; (body_floats, body') <- cpeRhsE env' body
+       ; (body_floats, body') <- cpeBodyF env' body
        ; let expr' = case maybe_bind' of Just bind' -> Let bind' body'
                                          Nothing    -> body'
        ; return (bind_floats `appFloats` body_floats, expr') }
 
-cpeRhsE env (Tick tickish expr)
+cpeBodyF env (Tick tickish expr)
   -- Pull out ticks if they are allowed to be floated.
   | tickishFloatable tickish
-  = do { (floats, body) <- cpeRhsE env expr
+  = do { (floats, body) <- cpeBodyF env expr
          -- See [Floating Ticks in CorePrep]
        ; return (FloatTick tickish `consFloat` floats, body) }
   | otherwise
-  = do { body <- cpeBodyNF env expr
+  = do { body <- cpeBody env expr
        ; return (emptyFloats, mkTick tickish' body) }
   where
     tickish' | Breakpoint ext bid fvs <- tickish
@@ -803,17 +809,17 @@ cpeRhsE env (Tick tickish expr)
              | otherwise
              = tickish
 
-cpeRhsE env (Cast expr co)
-   = do { (floats, expr') <- cpeRhsE env expr
+cpeBodyF env (Cast expr co)
+   = do { (floats, expr') <- cpeBodyF env expr
         ; return (floats, Cast expr' (cpSubstCo env co)) }
 
-cpeRhsE env expr@(Lam {})
+cpeBodyF env expr@(Lam {})
    = do { let (bndrs,body) = collectBinders expr
         ; (env', bndrs') <- cpCloneBndrs env bndrs
-        ; body' <- cpeBodyNF env' body
+        ; body' <- cpeBody env' body
         ; return (emptyFloats, mkLams bndrs' body') }
 
-cpeRhsE env (Case scrut bndr _ alts@[Alt con [covar] _])
+cpeBodyF env (Case scrut bndr _ alts@[Alt con [covar] _])
   -- See (U3) in Note [Implementing unsafeCoerce]
   -- We need make the Case float, otherwise we get
   --   let x = case ... of UnsafeRefl co ->
@@ -828,7 +834,7 @@ cpeRhsE env (Case scrut bndr _ alts@[Alt con [covar] _])
   -- Note that `x` is a value here. This is visible in the GHCi debugger tests
   -- (such as `print003`).
   | Just rhs <- isUnsafeEqualityCase scrut bndr alts
-  = do { (floats_scrut, scrut) <- cpeBody env scrut
+  = do { (floats_scrut, scrut) <- cpeBodyF env scrut
 
        ; (env, bndr')  <- cpCloneBndr env bndr
        ; (env, covar') <- cpCloneCoVarBndr env covar
@@ -836,19 +842,19 @@ cpeRhsE env (Case scrut bndr _ alts@[Alt con [covar] _])
                           -- See Note [Cloning CoVars and TyVars]
 
          -- Up until here this should do exactly the same as the regular code
-         -- path of `cpeRhsE Case{}`.
-       ; (floats_rhs, rhs) <- cpeBody env rhs
+         -- path of `cpeBodyF Case{}`.
+       ; (floats_rhs, rhs) <- cpeBodyF env rhs
          -- ... but we want to float `floats_rhs` as in (U3) so that rhs' might
          -- become a value
        ; let case_float = UnsafeEqualityCase scrut bndr' con [covar']
          -- NB: It is OK to "evaluate" the proof eagerly.
          --     Usually there's the danger that we float the unsafeCoerce out of
          --     a branching Case alt. Not so here, because the regular code path
-         --     for `cpeRhsE Case{}` will not float out of alts.
+         --     for `cpeBodyF Case{}` will not float out of alts.
              floats = snocFloat floats_scrut case_float `appFloats` floats_rhs
        ; return (floats, rhs) }
 
-cpeRhsE env (Case scrut bndr _ [Alt (DataAlt dc) [token_out, res] rhs])
+cpeBodyF env (Case scrut bndr _ [Alt (DataAlt dc) [token_out, res] rhs])
   -- See item (SEQ4) of Note [seq# magic]. We want to match
   --   case seq# @a @RealWorld <ok-to-discard> s of (# s', _ #) -> rhs[s']
   -- and simplify to rhs[s]. Triggers in T15226.
@@ -869,10 +875,10 @@ cpeRhsE env (Case scrut bndr _ [Alt (DataAlt dc) [token_out, res] rhs])
       -- often zaps the OccInfo on case-alternative binders (see Note [DataAlt occ info]
       -- in GHC.Core.Opt.Simplify.Iteration) because the scrutinee is not a
       -- variable, and in that case the zapping doesn't happen; see that Note.
-  = cpeRhsE (extendCorePrepEnv env token_out token_in') rhs
+  = cpeBodyF (extendCorePrepEnv env token_out token_in') rhs
 
-cpeRhsE env (Case scrut bndr ty alts)
-  = do { (floats, scrut') <- cpeBody env scrut
+cpeBodyF env (Case scrut bndr ty alts)
+  = do { (floats, scrut') <- cpeBodyF env scrut
        ; (env', bndr2) <- cpCloneBndr env bndr
        ; let bndr3 = bndr2 `setIdUnfolding` evaldUnfolding
        ; let alts'
@@ -885,7 +891,7 @@ cpeRhsE env (Case scrut bndr ty alts)
                , not (altsAreExhaustive alts)
                = addDefault alts (Just err)
                | otherwise = alts
-               where err = mkImpossibleExpr ty "cpeRhsE: missing case alternative"
+               where err = mkImpossibleExpr ty "cpeBodyF: missing case alternative"
        ; alts'' <- mapM (sat_alt env') alts'
 
        ; case alts'' of
@@ -896,7 +902,7 @@ cpeRhsE env (Case scrut bndr ty alts)
   where
     sat_alt env (Alt con bs rhs)
        = do { (env2, bs') <- cpCloneBndrs env bs
-            ; rhs' <- cpeBodyNF env2 rhs
+            ; rhs' <- cpeBody env2 rhs
             ; return (Alt con bs' rhs') }
 
 -- ---------------------------------------------------------------------------
@@ -908,74 +914,10 @@ cpeRhsE env (Case scrut bndr ty alts)
 -- let-bound using 'wrapBinds').  Generally you want this, esp.
 -- when you've reached a binding form (e.g., a lambda) and
 -- floating any further would be incorrect.
-cpeBodyNF :: CorePrepEnv -> CoreExpr -> UniqSM CpeBody
-cpeBodyNF env expr
-  = do { (floats, body) <- cpeBody env expr
-       ; return (wrapBinds floats body) }
-
--- | Convert a 'CoreExpr' so it satisfies 'CpeBody'; also produce
--- a list of 'Floats' which are being propagated upwards.  In
--- fact, this function is used in only two cases: to
--- implement 'cpeBodyNF' (which is what you usually want),
--- and in the case when a let-binding is in a case scrutinee--here,
--- we can always float out:
---
---      case (let x = y in z) of ...
---      ==> let x = y in case z of ...
---
-cpeBody :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeBody)
+cpeBody :: CorePrepEnv -> CoreExpr -> UniqSM CpeBody
 cpeBody env expr
-  = do { (floats1, rhs) <- cpeRhsE env expr
-       ; (floats2, body) <- rhsToBody env rhs
-       ; return (floats1 `appFloats` floats2, body) }
-
---------
-rhsToBody :: CorePrepEnv -> CpeRhs -> UniqSM (Floats, CpeBody)
--- Remove top level lambdas by let-binding
-
-rhsToBody env (Tick t expr)
-  | tickishHasNoScope t -- only float out of non-scoped annotations
-  = do { (floats, expr') <- rhsToBody env expr
-       ; return (floats, mkTick t expr') }
-
-rhsToBody env (Cast e co)
-        -- You can get things like
-        --      case e of { p -> coerce t (\s -> ...) }
-  = do { (floats, e') <- rhsToBody env e
-       ; return (floats, Cast e' co) }
-
-rhsToBody env expr@(Lam {})   -- See Note [No eta reduction needed in rhsToBody]
-  | all isTyVar bndrs           -- Type lambdas are ok
-  = return (emptyFloats, expr)
-  | otherwise                   -- Some value lambdas
-  = do { let rhs = cpeEtaExpand (exprArity expr) expr
-       ; fn <- newVar env (exprType rhs)
-       ; let float = Float (NonRec fn rhs) LetBound TopLvlFloatable
-       ; return (unitFloat float, Var fn) }
-  where
-    (bndrs,_) = collectBinders expr
-
-rhsToBody _env expr = return (emptyFloats, expr)
-
-
-{- Note [No eta reduction needed in rhsToBody]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Historical note.  In the olden days we used to have a Prep-specific
-eta-reduction step in rhsToBody:
-  rhsToBody expr@(Lam {})
-    | Just no_lam_result <- tryEtaReducePrep bndrs body
-    = return (emptyFloats, no_lam_result)
-
-The goal was to reduce
-        case x of { p -> \xs. map f xs }
-    ==> case x of { p -> map f }
-
-to avoid allocating a lambda.  Of course, we'd allocate a PAP
-instead, which is hardly better, but that's the way it was.
-
-Now we simply don't bother with this. It doesn't seem to be a win,
-and it's extra work.
--}
+  = do { (floats, body) <- cpeBodyF env expr
+       ; return (wrapBinds floats body) }
 
 -- ---------------------------------------------------------------------------
 --              CpeApp: produces a result satisfying CpeApp
@@ -1060,8 +1002,8 @@ body of the eta-expansion lambda, resulting in
 which is unproblematic.
 -}
 
-cpeApp :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeRhs)
--- May return a CpeRhs (instead of CpeApp) because of saturating primops
+cpeApp :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CpeBody)
+-- May return a CpeBody (instead of CpeApp) because of saturating primops
 cpeApp top_env expr
   = do { let (terminal, args) = collect_args expr
       --  ; pprTraceM "cpeApp" $ (ppr expr)
@@ -1103,7 +1045,7 @@ cpeApp top_env expr
     cpe_app :: CorePrepEnv
             -> CoreExpr -- The thing we are calling
             -> [ArgInfo]
-            -> UniqSM (Floats, CpeRhs)
+            -> UniqSM (Floats, CpeBody)
     cpe_app env (Var f) (AIApp Type{} : AIApp arg : args)
         | f `hasKey` lazyIdKey          -- Replace (lazy a) with a, and
             -- See Note [lazyId magic] in GHC.Types.Id.Make
@@ -1156,7 +1098,7 @@ cpeApp top_env expr
         --          case thing of res { __DEFAULT -> (# token, res#) } },
         -- allocating CaseBound Floats for token and thing as needed
         = do { (floats1, token) <- cpeArg env topDmd token
-             ; (floats2, thing) <- cpeBody env thing
+             ; (floats2, thing) <- cpeBodyF env thing
              ; case_bndr <- (`setIdUnfolding` evaldUnfolding) <$> newVar env ty
              ; let tup = mkCoreUnboxedTuple [token, Var case_bndr]
              ; let float = mkCaseFloat case_bndr thing
@@ -1173,9 +1115,10 @@ cpeApp top_env expr
                      then Just $! idArity v_hd
                      else Nothing
                    Nothing -> Nothing
-          --  ; pprTraceM "cpe_app:stricts:" (ppr v <+> ppr args $$ ppr stricts $$ ppr (idCbvMarks_maybe v))
            ; (app, floats, unsat_ticks) <- rebuild_app env args e2 emptyFloats stricts min_arity
-           ; mb_saturate hd app floats unsat_ticks depth }
+           ; case hd of
+               Nothing    -> do { massert (null unsat_ticks); return (floats, app) }
+               Just fn_id -> return (floats, maybeSaturate fn_id app depth unsat_ticks) }
         where
           depth = val_args args
           stricts = case idDmdSig v of
@@ -1190,8 +1133,8 @@ cpeApp top_env expr
                 -- partial application might be seq'd
 
         -- We inlined into something that's not a var and has no args.
-        -- Bounce it back up to cpeRhsE.
-    cpe_app env fun [] = cpeRhsE env fun
+        -- Bounce it back up to cpeBodyF.
+    cpe_app env fun [] = cpeBodyF env fun
 
     -- Here we get:
     -- N-variable fun, better let-bind it
@@ -1202,7 +1145,8 @@ cpeApp top_env expr
                           -- If evalDmd says that it's sure to be evaluated,
                           -- we'll end up case-binding it
            ; (app, floats,unsat_ticks) <- rebuild_app env args fun' fun_floats [] Nothing
-           ; mb_saturate Nothing app floats unsat_ticks (val_args args) }
+           ; massert (null unsat_ticks)
+           ; return (floats, app) }
 
     -- Count the number of value arguments *and* coercions (since we don't eliminate the later in STG)
     val_args :: [ArgInfo] -> Int
@@ -1223,13 +1167,6 @@ cpeApp top_env expr
                   | isTypeArg e = n
                   | otherwise   = n+1
 
-    -- Saturate if necessary
-    mb_saturate head app floats unsat_ticks depth =
-       case head of
-         Just fn_id -> do { sat_app <- maybeSaturate fn_id app depth unsat_ticks
-                          ; return (floats, sat_app) }
-         _other     -> do { massert (null unsat_ticks)
-                          ; return (floats, app) }
 
     -- Deconstruct and rebuild the application, floating any non-atomic
     -- arguments to the outside.  We collect the type of the expression,
@@ -1561,11 +1498,11 @@ Wrinkles:
 cpeArg :: CorePrepEnv -> Demand
        -> CoreArg -> UniqSM (Floats, CpeArg)
 cpeArg env dmd arg
-  = do { (floats1, arg1) <- cpeRhsE env arg     -- arg1 can be a lambda
+  = do { (floats1, arg1) <- cpeBodyF env arg     -- arg1 can be a lambda
        ; let arg_ty = exprType arg1
              lev    = typeLevity arg_ty
              dec    = wantFloatLocal NonRecursive dmd lev floats1 arg1
-       ; (floats2, arg2) <- executeFloatDecision env dec floats1 arg1
+             (floats2, arg2) = executeFloatDecision dec floats1 arg1
                 -- Else case: arg1 might have lambdas, and we can't
                 --            put them inside a wrapBinds
 
@@ -1580,7 +1517,12 @@ cpeArg env dmd arg
                        arg3  = cpeEtaExpand arity arg2
                        -- See Note [Eta expansion of arguments in CorePrep]
                  ; let (arg_float, v') = mkNonRecFloat env lev v arg3
-                 ---; pprTraceM "cpeArg" (ppr arg1 $$ ppr dec $$ ppr arg2)
+--                 ; pprTraceM "cpeArg" (vcat [ text "arg1" <+> ppr arg1
+--                                            , text "decision" <+>  ppr dec
+--                                            , text "arg2" <+> ppr arg2
+--                                            , text "arity" <+> ppr arity
+--                                            , text "arg3" <+> ppr arg3
+--                                            ])
                  ; return (snocFloat floats2 arg_float, varToCoreExpr v') }
        }
 
@@ -1617,59 +1559,56 @@ eta_would_wreck_join (Tick _ e)        = eta_would_wreck_join e
 eta_would_wreck_join (Case _ _ _ alts) = any eta_would_wreck_join (rhssOfAlts alts)
 eta_would_wreck_join _                 = False
 
-maybeSaturate :: Id -> CpeApp -> Int -> [CoreTickish] -> UniqSM CpeRhs
+maybeSaturate :: Id -> CpeApp
+              -> Int  -- Number of value arguments in the application
+              -> [CoreTickish]
+              -> CpeBody
 maybeSaturate fn expr n_args unsat_ticks
-  | hasNoBinding fn        -- There's no binding
-    -- See Note [Eta expansion of hasNoBinding things in CorePrep]
-  = return $ wrapLamBody (\body -> foldr mkTick body unsat_ticks) sat_expr
-
-  | mark_arity > 0 -- A call-by-value function.
-                   -- See Note [CBV Function Ids: overview]
-  , not applied_marks
-  = assertPpr
-      ( not (isJoinId fn)) -- See Note [Do not eta-expand join points]
-      ( ppr fn $$ text "expr:" <+> ppr expr $$ text "n_args:" <+> ppr n_args $$
-          text "marks:" <+> ppr (idCbvMarks_maybe fn) $$
-          text "join_arity" <+> ppr (idJoinPointHood fn) $$
-          text "fn_arity" <+> ppr fn_arity
-       ) $
-    -- pprTrace "maybeSat"
-    --   ( ppr fn $$ text "expr:" <+> ppr expr $$ text "n_args:" <+> ppr n_args $$
-    --       text "marks:" <+> ppr (idCbvMarks_maybe fn) $$
-    --       text "join_arity" <+> ppr (isJoinId_maybe fn) $$
-    --       text "fn_arity" <+> ppr fn_arity $$
-    --       text "excess_arity" <+> ppr excess_arity $$
-    --       text "mark_arity" <+> ppr mark_arity
-    --    ) $
-    return sat_expr
+  | isJoinId fn  -- Never eta-expand a call to a join point
+                 -- See Note [Do not eta-expand join points]
+  = assertPpr (not must_eta_expand) (ppr expr) $
+    -- assertPpr: check that all arguments that need to be passed cbv
+    -- are visible, so the backend can evalaute them if required
+    expr
+
+  | must_eta_expand || desirable_to_eta_expand
+    -- n_args > 0: do not eta-expand a naked variable!
+  = wrapLamBody (mkTicks unsat_ticks) $
+    cpeEtaExpand excess_arity expr
 
   | otherwise
-  = assert (null unsat_ticks) $
-    return expr
+  = expr
+
   where
-    mark_arity    = idCbvMarkArity fn
-    fn_arity      = idArity fn
-    excess_arity  = (max fn_arity mark_arity) - n_args
-    sat_expr      = cpeEtaExpand excess_arity expr
-    applied_marks = n_args >= (length . dropWhile (not . isMarkedCbv) .
-                               reverse . expectJust $ (idCbvMarks_maybe fn))
-    -- For join points we never eta-expand (See Note [Do not eta-expand join points])
-    -- so we assert all arguments that need to be passed cbv are visible so that the
-    -- backend can evalaute them if required..
+    must_eta_expand
+      =  (hasNoBinding fn && fn_arity > n_args)
+            -- hasNoBinding functions must be saturated
+      || (mark_arity > n_args)
+            -- CBV functions must be CBV-saturated
+
+    desirable_to_eta_expand = fn_arity > n_args && n_args > 0
+       -- n_args > 0: do not eta-expand a naked variable unless we have to
+
+    mark_arity   = idCbvMarkArity fn
+    fn_arity     = idArity fn
+    excess_arity = (max fn_arity mark_arity) - n_args
 
 {- Note [Eta expansion]
 ~~~~~~~~~~~~~~~~~~~~~~~
-Eta expand to match the arity claimed by the binder Remember,
-CorePrep must not change arity
+Eta expand to match the arity claimed by the binder.
+Remember, CorePrep must not change arity
 
 Eta expansion might not have happened already, because it is done by
 the simplifier only when there at least one lambda already.
 
-NB1:we could refrain when the RHS is trivial (which can happen
-    for exported things).  This would reduce the amount of code
-    generated (a little) and make things a little worse for
-    code compiled without -O.  The case in point is data constructor
-    wrappers.
+We do eta-expansion (via `cpeEtaExpand`) in three places:
+
+* At let-bindings; in `cpePair`
+
+* On function arguments: in `cpeArg`
+  See Note [Eta expansion of arguments in CorePrep]
+
+* At un-saturated function calls: in `maybeSaturate`
 
 NB2: we have to be careful that the result of etaExpand doesn't
    invalidate any of the assumptions that CorePrep is attempting
@@ -1677,12 +1616,37 @@ NB2: we have to be careful that the result of etaExpand doesn't
    an SCC note - we're now careful in etaExpand to make sure the
    SCC is pushed inside any new lambdas that are generated.
 
-Note [Eta expansion of hasNoBinding things in CorePrep]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-maybeSaturate deals with eta expanding to saturate things that can't deal
-with unsaturated applications (identified by 'hasNoBinding', currently
-foreign calls, unboxed tuple/sum constructors, and representation-polymorphic
-primitives such as 'coerce' and 'unsafeCoerce#').
+Note [Eta expansion for let-bindings]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Given f = rhs, we eta-expand `rhs` to match f's arity.
+
+We could refrain when the RHS is trivial (which can happen for exported things).
+This would reduce the amount of code generated (a little) and make things a
+little worse for code compiled without -O.  The case in point is data
+constructor wrappers.
+
+Note [Eta expansion of unsaturated calls]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Give a call (f a1..an), where `f` is a known function with arity greater than `n`,
+there are three reasons we might want to eta-expand:
+
+* Must eta-expand: if `f` is a `hasNoBinding` function, we must saturate
+  it, because the function has no (curried) binding to call. Currently
+  this includes:
+     - foreign calls,
+     - unboxed tuple/sum constructors
+     - representation-polymorphic primitives such as 'coerce' and 'unsafeCoerce#'
+     - primops (for now anyway; see comments in `hasNoBinding`)
+
+* Must eta-expand: if `f` has a call-by-value calling convention, we /must/
+  call it with evaluated arguments.  The back end deals with adding the
+  necessary evaluation at the call site, but we must first ensure that it is
+  saturated.
+
+* May eta-expand: consider
+     \x -> f x True
+  where `f` has arity 3.   Then it's much better to eta-expand f so we have
+     \xy -> f x True y
 
 Historical Note: Note that eta expansion in CorePrep used to be very fragile
 due to the "prediction" of CAFfyness that we used to make during tidying.  We
@@ -1694,7 +1658,7 @@ Note [Eta expansion and the CorePrep invariants]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 It turns out to be much much easier to do eta expansion
 *after* the main CorePrep stuff.  But that places constraints
-on the eta expander: given a CpeRhs, it must return a CpeRhs.
+on the eta expander: given a CpeBody, it must return a CpeBody.
 
 For example here is what we do not want:
                 f = /\a -> g (h 3)      -- h has arity 2
@@ -1706,6 +1670,26 @@ and now we do NOT want eta expansion to give
 Instead GHC.Core.Opt.Arity.etaExpand gives
                 f = /\a -> \y -> let s = h 3 in g s y
 
+Another example:
+  f x = case x of
+           A -> \y. e
+           B -> hnb 3  -- where `hnb` has no binding
+           C -> z
+Then we may eta-expand `hnb` to get
+  f x = case x of
+           A -> \y. e
+           B -> \y. hnb 3 y
+           C -> z
+Now we come to the binding of `f` itself, and eta-expand that, to give
+  f x y = case x of
+            A -> e
+            B -> hnb 3 y
+            C -> z y
+Notice how important it is that the eta-expansion for `f` doesn't
+generate any crap like
+            B -> (\y. hnb 3 y) y
+Fortunately, the eta-expander is careful not to do so.
+
 Note [Eta expansion of arguments in CorePrep]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Suppose `g = \x y. blah` and consider the expression `f (g x)`; we ANFise to
@@ -1798,7 +1782,7 @@ There is a nasty Wrinkle:
       #24471 is a good example, where Prep took 25% of compile time!
 -}
 
-cpeEtaExpand :: Arity -> CpeRhs -> CpeRhs
+cpeEtaExpand :: Arity -> CpeBody -> CpeBody
 cpeEtaExpand arity expr
   | arity == 0 = expr
   | otherwise  = etaExpand arity expr
@@ -2165,9 +2149,6 @@ isEmptyFloats (Floats _ b) = isNilOL b
 getFloats :: Floats -> OrdList FloatingBind
 getFloats = fs_binds
 
-unitFloat :: FloatingBind -> Floats
-unitFloat = snocFloat emptyFloats
-
 floatInfo :: FloatingBind -> FloatInfo
 floatInfo (Float _ _ info)     = info
 floatInfo UnsafeEqualityCase{} = LazyContextFloatable -- See Note [Floating in CorePrep]
@@ -2255,7 +2236,7 @@ decideFloatInfo FIA{fia_levity=lev, fia_demand=dmd, fia_is_hnf=is_hnf,
   | Lifted   <- lev       = (LetBound, TopLvlFloatable)
       -- And these float freely but can't be speculated, hence LetBound
 
-mkCaseFloat :: Id -> CpeRhs -> FloatingBind
+mkCaseFloat :: Id -> CpeBody -> FloatingBind
 mkCaseFloat bndr scrut
   = -- pprTrace "mkCaseFloat" (ppr bndr <+> ppr (bound,info)
     --                             -- <+> ppr is_lifted <+> ppr is_strict
@@ -2273,7 +2254,7 @@ mkCaseFloat bndr scrut
           -- (ok-for-spec case bindings are unlikely anyway.)
       }
 
-mkNonRecFloat :: CorePrepEnv -> Levity -> Id -> CpeRhs -> (FloatingBind, Id)
+mkNonRecFloat :: CorePrepEnv -> Levity -> Id -> CpeBody -> (FloatingBind, Id)
 mkNonRecFloat env lev bndr rhs
   = -- pprTrace "mkNonRecFloat" (ppr bndr <+> ppr (bound,info)
     --                             <+> if is_strict then text "strict" else if is_lifted then text "lazy" else text "unlifted"
@@ -2413,24 +2394,18 @@ instance Outputable FloatDecision where
   ppr FloatNone = text "none"
   ppr FloatAll  = text "all"
 
-executeFloatDecision :: CorePrepEnv -> FloatDecision -> Floats -> CpeRhs -> UniqSM (Floats, CpeRhs)
-executeFloatDecision env dec floats rhs
+executeFloatDecision :: FloatDecision -> Floats -> CpeBody -> (Floats, CpeBody)
+executeFloatDecision dec floats rhs
   = case dec of
-      FloatAll                 -> return (floats, rhs)
-      FloatNone
-        | isEmptyFloats floats -> return (emptyFloats, rhs)
-        | otherwise            -> do { (floats', body) <- rhsToBody env rhs
-                                     ; return (emptyFloats, wrapBinds floats $
-                                                            wrapBinds floats' body) }
-            -- FloatNone case: `rhs` might have lambdas, and we can't
-            -- put them inside a wrapBinds, which expects a `CpeBody`.
+      FloatAll  -> (floats,      rhs)
+      FloatNone -> (emptyFloats, wrapBinds floats rhs)
 
 wantFloatTop :: Floats -> FloatDecision
 wantFloatTop fs
   | fs_info fs `floatsAtLeastAsFarAs` TopLvlFloatable = FloatAll
   | otherwise                                         = FloatNone
 
-wantFloatLocal :: RecFlag -> Demand -> Levity -> Floats -> CpeRhs -> FloatDecision
+wantFloatLocal :: RecFlag -> Demand -> Levity -> Floats -> CpeBody -> FloatDecision
 -- See Note [wantFloatLocal]
 wantFloatLocal is_rec rhs_dmd rhs_lev floats rhs
   |  isEmptyFloats floats -- Well yeah...
@@ -2479,7 +2454,7 @@ zero free variables.)
 In general, the inliner is good at eliminating these let-bindings.  However,
 there is one case where these trivial updatable thunks can arise: when
 we are optimizing away 'lazy' (see Note [lazyId magic], and also
-'cpeRhsE'.)  Then, we could have started with:
+'cpeBodyF'.)  Then, we could have started with:
 
      let x :: ()
          x = lazy @() y
@@ -2783,8 +2758,7 @@ wrapTicks floats expr
 -- ---------------------------------------------------------------------------
 
 -- | Converts Bignum literals into their final CoreExpr
-cpeBigNatLit
-   :: CorePrepEnv -> Integer -> UniqSM (Floats, CpeRhs)
+cpeBigNatLit :: CorePrepEnv -> Integer -> UniqSM (Floats, CpeBody)
 cpeBigNatLit env i = assert (i >= 0) $ do
   let
     platform = cp_platform (cpe_config env)


=====================================
compiler/GHC/Driver/Main.hs
=====================================
@@ -2434,8 +2434,8 @@ myCoreToStg :: Logger -> DynFlags -> [Var]
                   , CollectedCCs -- CAF cost centre info (declared and used)
                   , StgCgInfos )
 myCoreToStg logger dflags ic_inscope for_bytecode this_mod ml prepd_binds = do
-    let (stg_binds, denv, cost_centre_info)
-         = {-# SCC "Core2Stg" #-}
+    (stg_binds, denv, cost_centre_info)
+       <- {-# SCC "Core2Stg" #-}
            coreToStg (initCoreToStgOpts dflags) this_mod ml prepd_binds
 
     (stg_binds_with_fvs,stg_cg_info)


=====================================
compiler/GHC/Stg/Lint.hs
=====================================
@@ -105,7 +105,7 @@ import GHC.Core             ( AltCon(..) )
 import GHC.Core.Type
 import GHC.Core.Lint        ( lintMessage )
 
-import GHC.Types.Basic      ( TopLevelFlag(..), isTopLevel, isMarkedCbv )
+import GHC.Types.Basic      ( TopLevelFlag(..), isTopLevel )
 import GHC.Types.CostCentre ( isCurrentCCS )
 import GHC.Types.Id
 import GHC.Types.Var.Set
@@ -123,12 +123,9 @@ import GHC.Unit.Module            ( Module )
 import GHC.Data.Bag         ( Bag, emptyBag, isEmptyBag, snocBag, bagToList )
 
 import Control.Monad
-import Data.Maybe
-import GHC.Utils.Misc
 import GHC.Core.Multiplicity (scaledThing)
 import GHC.Settings (Platform)
 import GHC.Core.TyCon (primRepCompatible, primRepsCompatible)
-import GHC.Utils.Panic.Plain (panic)
 
 lintStgTopBindings :: forall a . (OutputablePass a, BinderP a ~ Id)
                    => Platform
@@ -174,36 +171,37 @@ lintStgTopBindings platform logger diag_opts opts extra_vars this_mod unarised w
     lint_bind (StgTopStringLit v _) = return [v]
 
 lintStgConArg :: StgArg -> LintM ()
-lintStgConArg arg = do
-  unarised <- lf_unarised <$> getLintFlags
-  when unarised $ case stgArgRep_maybe arg of
-    -- Note [Post-unarisation invariants], invariant 4
-    Just [_] -> pure ()
-    badRep   -> addErrL $
-      text "Non-unary constructor arg: " <> ppr arg $$
-      text "Its PrimReps are: " <> ppr badRep
-
-  case arg of
-    StgLitArg _ -> pure ()
-    StgVarArg v -> lintStgVar v
+lintStgConArg arg
+  = do { lintStgArg arg
+
+       ; unarised <- lf_unarised <$> getLintFlags
+       ; when unarised $ case stgArgRep_maybe arg of
+           -- Note [Post-unarisation invariants], invariant 4
+           Just [_] -> pure ()
+           badRep   -> addErrL $
+             text "Non-unary constructor arg: " <> ppr arg $$
+             text "Its PrimReps are: " <> ppr badRep }
 
 lintStgFunArg :: StgArg -> LintM ()
-lintStgFunArg arg = do
-  unarised <- lf_unarised <$> getLintFlags
-  when unarised $ case stgArgRep_maybe arg of
-    -- Note [Post-unarisation invariants], invariant 3
-    Just []  -> pure ()
-    Just [_] -> pure ()
-    badRep   -> addErrL $
-      text "Function arg is not unary or void: " <> ppr arg $$
-      text "Its PrimReps are: " <> ppr badRep
-
-  case arg of
-    StgLitArg _ -> pure ()
-    StgVarArg v -> lintStgVar v
-
-lintStgVar :: Id -> LintM ()
-lintStgVar id = checkInScope id
+lintStgFunArg arg
+  = do { lintStgArg arg
+
+       ; unarised <- lf_unarised <$> getLintFlags
+       ; when unarised $ case stgArgRep_maybe arg of
+           -- Note [Post-unarisation invariants], invariant 3
+           Just []  -> pure ()
+           Just [_] -> pure ()
+           badRep   -> addErrL $
+             text "Function arg is not unary or void: " <> ppr arg $$
+             text "Its PrimReps are: " <> ppr badRep }
+
+lintStgArg :: StgArg -> LintM ()
+lintStgArg (StgLitArg _) = pure ()
+lintStgArg (StgVarArg v) = do { lintStgVarOcc v
+                              ; lintAppCbvMarks v [] }
+
+lintStgVarOcc :: Id -> LintM ()
+lintStgVarOcc id = checkInScope id
 
 lintStgBinds
     :: (OutputablePass a, BinderP a ~ Id)
@@ -275,13 +273,11 @@ lintStgExpr :: (OutputablePass a, BinderP a ~ Id) => GenStgExpr a -> LintM ()
 
 lintStgExpr (StgLit _) = return ()
 
-lintStgExpr e@(StgApp fun args) = do
-  lintStgVar fun
-  mapM_ lintStgFunArg args
-  lintAppCbvMarks e
-  lintStgAppReps fun args
-
-
+lintStgExpr (StgApp fun args)
+  = do { lintStgVarOcc fun
+       ; mapM_ lintStgFunArg args
+       ; lintAppCbvMarks fun args
+       ; lintStgAppReps fun args }
 
 lintStgExpr app@(StgConApp con _n args _arg_tys) = do
     -- unboxed sums should vanish during unarise
@@ -413,22 +409,20 @@ lintStgAppReps fun args = do
 
   match_args actual_arg_reps fun_arg_tys_reps
 
-lintAppCbvMarks :: OutputablePass pass
-                => GenStgExpr pass -> LintM ()
-lintAppCbvMarks e@(StgApp fun args) = do
-  lf <- getLintFlags
-  when (lf_unarised lf) $ do
+lintAppCbvMarks :: Id -> [StgArg] -> LintM ()
+lintAppCbvMarks fun args
+  | idCbvMarkArity fun > length args
     -- A function which expects a unlifted argument as n'th argument
     -- always needs to be applied to n arguments.
     -- See Note [CBV Function Ids: overview].
-    let marks = fromMaybe [] $ idCbvMarks_maybe fun
-    when (length (dropWhileEndLE (not . isMarkedCbv) marks) > length args) $ do
-      addErrL $ hang (text "Undersatured cbv marked ID in App" <+> ppr e ) 2 $
-        (text "marks" <> ppr marks $$
-        text "args" <> ppr args $$
-        text "arity" <> ppr (idArity fun) $$
-        text "join_arity" <> ppr (idJoinPointHood fun))
-lintAppCbvMarks _ = panic "impossible - lintAppCbvMarks"
+  = addErrL $ hang (text "Undersatured cbv marked ID in App" <+> ppr fun)
+                 2 (vcat [ text "marks" <> ppr (idCbvMarks_maybe fun)
+                         , text "args" <> ppr args
+                         , text "arity" <> ppr (idArity fun)
+                         , text "join_arity" <> ppr (idJoinPointHood fun) ])
+
+  | otherwise
+  = return ()
 
 {-
 ************************************************************************


=====================================
compiler/GHC/Types/Id.hs
=====================================
@@ -852,7 +852,7 @@ idCbvMarks_maybe id = case idDetails id of
   _                    -> Nothing
 
 -- Id must be called with at least this arity in order to allow arguments to
--- be passed unlifted.
+-- be passed unlifted.  Return 0 if there are no CBV marks.
 idCbvMarkArity :: Id -> Arity
 idCbvMarkArity fn = maybe 0 length (idCbvMarks_maybe fn)
 


=====================================
compiler/GHC/Types/Id/Info.hs
=====================================
@@ -210,6 +210,7 @@ data IdDetails
         -- Can also work as a WorkerLikeId if given `CbvMark`s.
         -- See Note [CBV Function Ids: overview]
         -- The [CbvMark] is always empty (and ignored) until after Tidy.
+
   | WorkerLikeId [CbvMark]
         -- ^ An 'Id' for a worker like function, which might expect some arguments to be
         -- passed both evaluated and tagged.
@@ -217,8 +218,10 @@ data IdDetails
         -- aren't used unapplied.
         -- See Note [CBV Function Ids: overview]
         -- See Note [EPT enforcement]
-        -- The [CbvMark] is always empty (and ignored) until after Tidy for ids from the current
-        -- module.
+        -- Invariants:
+        --   - the [CbvMark] is always empty (and ignored) until after Tidy
+        --     for ids from the current module
+        --   - If non-empty, at least is isMarkedCbbv; see (CBV2)
 
 data RecSelInfo
   = RSI { rsi_def   :: [ConLike]   -- Record selector defined for these
@@ -297,9 +300,7 @@ Here's how it all works:
   to identify strict arguments.  See Note [Call-by-value for worker args] for
   how a worker guarantees to be strict in strict datacon fields.
 
-  TODO: We currently don't do this for arguments that are unboxed sums or tuples,
-  because then we'd have to predict the result of unarisation. But it would be nice to
-  do so. See `computeCbvInfo`.
+  See (CBV1) and (CBV2).
 
 * During CorePrep calls to CBV Ids are eta expanded.
   See `GHC.CoreToStg.Prep.maybeSaturate`.
@@ -319,6 +320,16 @@ Here's how it all works:
 * Imported functions may be CBV, and then there is no point in eta-reducing
   them; we'll just have to eta-expand later; see GHC.Core.Opt.Arity.cantEtaReduceFun.
 
+Wrinkles
+
+(CBV1) We do not set the CBV-marks for a function that takes an unboxed sum or tuple,
+  as an argument, because then we'd have to predict the result of unarisation.
+  It would be nice to do so in future. See `computeCbvInfo`.
+
+(CBV2) We do not set CBV-marks if none of them are `isMarkedCbv`.  Why not?
+  Because if none are CBV then there is nothing special to do for this function;
+  in particular, we don't need to saturate its calls.  See `computeCbvInfo`.
+
 *** SPJ really? Andreas? ****
 We only use this for workers and specialized versions of SpecConstr
 But we also check other functions during tidy and potentially turn some of them into


=====================================
testsuite/tests/arityanal/should_compile/Arity01.stderr
=====================================
@@ -5,19 +5,19 @@ Result size of Tidy Core = {terms: 71, types: 43, coercions: 0, joins: 0/0}
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F1.f2 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F1.f2 = GHC.Num.Integer.IS 1#
+F1.f2 = GHC.Internal.Bignum.Integer.IS 1#
 
 Rec {
 -- RHS size: {terms: 24, types: 6, coercions: 0, joins: 0/0}
 F1.f1_h1 [Occ=LoopBreaker] :: Integer -> Integer -> Integer -> Integer
 [GblId, Arity=3, Str=<1L><1L><SL>, Unf=OtherCon []]
 F1.f1_h1
-  = \ (n :: Integer) (x :: Integer) (eta [OS=OneShot] :: Integer) ->
+  = \ (n :: Integer) (x [OS=OneShot] :: Integer) (eta [OS=OneShot] :: Integer) ->
       case x of x1 { __DEFAULT ->
       case n of y1 { __DEFAULT ->
-      case GHC.Num.Integer.integerLt# x1 y1 of {
+      case GHC.Internal.Bignum.Integer.integerLt# x1 y1 of {
         __DEFAULT -> eta;
-        1# -> F1.f1_h1 y1 (GHC.Num.Integer.integerAdd x1 F1.f2) (GHC.Num.Integer.integerAdd x1 eta)
+        1# -> F1.f1_h1 y1 (GHC.Internal.Bignum.Integer.integerAdd x1 F1.f2) (GHC.Internal.Bignum.Integer.integerAdd x1 eta)
       }
       }
       }
@@ -26,7 +26,7 @@ end Rec }
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F1.f3 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F1.f3 = GHC.Num.Integer.IS 5#
+F1.f3 = GHC.Internal.Bignum.Integer.IS 5#
 
 -- RHS size: {terms: 4, types: 0, coercions: 0, joins: 0/0}
 f1 :: Integer
@@ -36,27 +36,27 @@ f1 = F1.f1_h1 F1.f3 F1.f2 F1.f3
 -- RHS size: {terms: 14, types: 5, coercions: 0, joins: 0/0}
 g :: Integer -> Integer -> Integer -> Integer -> Integer -> Integer
 [GblId, Arity=5, Str=<1L><SL><SL><SL><SL>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [0 0 0 0 0] 120 0}]
-g = \ (x1 :: Integer) (x2 :: Integer) (x3 :: Integer) (x4 :: Integer) (x5 :: Integer) -> GHC.Num.Integer.integerAdd (GHC.Num.Integer.integerAdd (GHC.Num.Integer.integerAdd (GHC.Num.Integer.integerAdd x1 x2) x3) x4) x5
+g = \ (x1 :: Integer) (x2 :: Integer) (x3 :: Integer) (x4 :: Integer) (x5 :: Integer) -> GHC.Internal.Bignum.Integer.integerAdd (GHC.Internal.Bignum.Integer.integerAdd (GHC.Internal.Bignum.Integer.integerAdd (GHC.Internal.Bignum.Integer.integerAdd x1 x2) x3) x4) x5
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F1.s1 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F1.s1 = GHC.Num.Integer.IS 3#
+F1.s1 = GHC.Internal.Bignum.Integer.IS 3#
 
 -- RHS size: {terms: 8, types: 7, coercions: 0, joins: 0/0}
 s :: forall {t1} {t2}. Num t1 => (t1 -> t2) -> t2
-[GblId, Arity=2, Str=<1C(1,L)>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [30 60] 50 0}]
+[GblId, Arity=2, Str=<1C(1,L)>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [90 60] 50 0}]
 s = \ (@t) (@t1) ($dNum :: Num t) (f :: t -> t1) -> f (fromInteger @t $dNum F1.s1)
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F1.h1 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F1.h1 = GHC.Num.Integer.IS 24#
+F1.h1 = GHC.Internal.Bignum.Integer.IS 24#
 
 -- RHS size: {terms: 4, types: 1, coercions: 0, joins: 0/0}
 h :: Integer -> Integer
 [GblId, Arity=1, Str=<SL>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 30 0}]
-h = \ (x5 :: Integer) -> GHC.Num.Integer.integerAdd F1.h1 x5
+h = \ (x5 :: Integer) -> GHC.Internal.Bignum.Integer.integerAdd F1.h1 x5
 
 
 


=====================================
testsuite/tests/arityanal/should_compile/Arity05.stderr
=====================================
@@ -5,27 +5,27 @@ Result size of Tidy Core = {terms: 42, types: 44, coercions: 0, joins: 0/0}
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F5.f5g1 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F5.f5g1 = GHC.Num.Integer.IS 1#
+F5.f5g1 = GHC.Internal.Bignum.Integer.IS 1#
 
 -- RHS size: {terms: 12, types: 9, coercions: 0, joins: 0/0}
 f5g :: forall {a} {t}. Num a => (t -> a) -> t -> a
-[GblId, Arity=3, Str=<L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [60 60 0] 90 0}]
+[GblId, Arity=3, Str=<L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [180 60 0] 90 0}]
 f5g = \ (@a) (@t) ($dNum :: Num a) (h :: t -> a) (z :: t) -> + @a $dNum (h z) (fromInteger @a $dNum F5.f5g1)
 
 -- RHS size: {terms: 17, types: 12, coercions: 0, joins: 0/0}
 f5h :: forall {a} {t}. Num a => (t -> a) -> t -> (t -> a) -> a
-[GblId, Arity=4, Str=<L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [90 60 0 60] 150 0}]
+[GblId, Arity=4, Str=<L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [270 60 0 60] 150 0}]
 f5h = \ (@a) (@t) ($dNum :: Num a) (f :: t -> a) (x :: t) (g :: t -> a) -> + @a $dNum (f x) (+ @a $dNum (g x) (fromInteger @a $dNum F5.f5g1))
 
 -- RHS size: {terms: 4, types: 1, coercions: 0, joins: 0/0}
 f5y :: Integer -> Integer
 [GblId, Arity=1, Str=<1L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 30 0}]
-f5y = \ (y :: Integer) -> GHC.Num.Integer.integerAdd y F5.f5g1
+f5y = \ (y :: Integer) -> GHC.Internal.Bignum.Integer.integerAdd y F5.f5g1
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 f5 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-f5 = GHC.Num.Integer.IS 3#
+f5 = GHC.Internal.Bignum.Integer.IS 3#
 
 
 


=====================================
testsuite/tests/arityanal/should_compile/Arity08.stderr
=====================================
@@ -4,7 +4,7 @@ Result size of Tidy Core = {terms: 24, types: 18, coercions: 0, joins: 0/0}
 
 -- RHS size: {terms: 20, types: 10, coercions: 0, joins: 0/0}
 f8f :: forall {p}. Num p => Bool -> p -> p -> p
-[GblId, Arity=4, Str=<1L><L><L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [90 30 0 0] 140 0}]
+[GblId, Arity=4, Str=<1L><L><L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [270 30 0 0] 140 0}]
 f8f
   = \ (@p) ($dNum :: Num p) (b :: Bool) (x :: p) (y :: p) ->
       case b of {
@@ -15,7 +15,7 @@ f8f
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 f8 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-f8 = GHC.Num.Integer.IS 2#
+f8 = GHC.Internal.Bignum.Integer.IS 2#
 
 
 


=====================================
testsuite/tests/arityanal/should_compile/Arity11.stderr
=====================================
@@ -5,57 +5,23 @@ Result size of Tidy Core = {terms: 136, types: 75, coercions: 0, joins: 2/7}
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.fib3 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F11.fib3 = GHC.Num.Integer.IS 1#
+F11.fib3 = GHC.Internal.Bignum.Integer.IS 1#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.fib2 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F11.fib2 = GHC.Num.Integer.IS 2#
-
-Rec {
--- RHS size: {terms: 38, types: 13, coercions: 0, joins: 2/2}
-F11.f11_fib [Occ=LoopBreaker] :: Integer -> Integer
-[GblId, Arity=1, Str=<SL>, Unf=OtherCon []]
-F11.f11_fib
-  = \ (ds :: Integer) ->
-      join {
-        $j [Dmd=ML] :: Integer
-        [LclId[JoinId(0)(Nothing)]]
-        $j
-          = join {
-              $j1 [Dmd=ML] :: Integer
-              [LclId[JoinId(0)(Nothing)]]
-              $j1 = GHC.Num.Integer.integerAdd (F11.f11_fib (GHC.Num.Integer.integerSub ds F11.fib3)) (F11.f11_fib (GHC.Num.Integer.integerSub ds F11.fib2)) } in
-            case ds of {
-              GHC.Num.Integer.IS x1 ->
-                case x1 of {
-                  __DEFAULT -> jump $j1;
-                  1# -> F11.fib3
-                };
-              GHC.Num.Integer.IP x1 -> jump $j1;
-              GHC.Num.Integer.IN x1 -> jump $j1
-            } } in
-      case ds of {
-        GHC.Num.Integer.IS x1 ->
-          case x1 of {
-            __DEFAULT -> jump $j;
-            0# -> F11.fib3
-          };
-        GHC.Num.Integer.IP x1 -> jump $j;
-        GHC.Num.Integer.IN x1 -> jump $j
-      }
-end Rec }
+F11.fib2 = GHC.Internal.Bignum.Integer.IS 2#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.fib1 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F11.fib1 = GHC.Num.Integer.IS 0#
+F11.fib1 = GHC.Internal.Bignum.Integer.IS 0#
 
 -- RHS size: {terms: 54, types: 27, coercions: 0, joins: 0/5}
-fib :: forall {t} {a}. (Eq t, Num t, Num a) => t -> a
-[GblId, Arity=4, Str=<L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [60 150 60 0] 480 0}]
+fib :: forall {t1} {t2}. (Eq t1, Num t1, Num t2) => t1 -> t2
+[GblId, Arity=4, Str=<L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [180 450 180 0] 480 0}]
 fib
-  = \ (@t) (@a) ($dEq :: Eq t) ($dNum :: Num t) ($dNum1 :: Num a) (eta :: t) ->
+  = \ (@t) (@t1) ($dEq :: Eq t) ($dNum :: Num t) ($dNum1 :: Num t1) (eta :: t) ->
       let {
         lvl :: t
         [LclId]
@@ -65,32 +31,66 @@ fib
         [LclId]
         lvl1 = fromInteger @t $dNum F11.fib2 } in
       let {
-        lvl2 :: a
+        lvl2 :: t1
         [LclId]
-        lvl2 = fromInteger @a $dNum1 F11.fib3 } in
+        lvl2 = fromInteger @t1 $dNum1 F11.fib3 } in
       let {
         lvl3 :: t
         [LclId]
         lvl3 = fromInteger @t $dNum F11.fib1 } in
       letrec {
-        fib4 [Occ=LoopBreaker, Dmd=SC(S,L)] :: t -> a
+        fib4 [Occ=LoopBreaker, Dmd=SC(S,L)] :: t -> t1
         [LclId, Arity=1, Str=<L>, Unf=OtherCon []]
         fib4
           = \ (ds :: t) ->
               case == @t $dEq ds lvl3 of {
                 False ->
                   case == @t $dEq ds lvl of {
-                    False -> + @a $dNum1 (fib4 (- @t $dNum ds lvl)) (fib4 (- @t $dNum ds lvl1));
+                    False -> + @t1 $dNum1 (fib4 (- @t $dNum ds lvl)) (fib4 (- @t $dNum ds lvl1));
                     True -> lvl2
                   };
                 True -> lvl2
               }; } in
       fib4 eta
 
+Rec {
+-- RHS size: {terms: 38, types: 13, coercions: 0, joins: 2/2}
+F11.f11_fib [Occ=LoopBreaker] :: Integer -> Integer
+[GblId, Arity=1, Str=<SL>, Unf=OtherCon []]
+F11.f11_fib
+  = \ (ds :: Integer) ->
+      join {
+        $j [Dmd=ML] :: Integer
+        [LclId[JoinId(0)(Nothing)]]
+        $j
+          = join {
+              $j1 [Dmd=ML] :: Integer
+              [LclId[JoinId(0)(Nothing)]]
+              $j1 = GHC.Internal.Bignum.Integer.integerAdd (F11.f11_fib (GHC.Internal.Bignum.Integer.integerSub ds F11.fib3)) (F11.f11_fib (GHC.Internal.Bignum.Integer.integerSub ds F11.fib2)) } in
+            case ds of {
+              GHC.Internal.Bignum.Integer.IS x ->
+                case x of {
+                  __DEFAULT -> jump $j1;
+                  1# -> F11.fib3
+                };
+              GHC.Internal.Bignum.Integer.IP x -> jump $j1;
+              GHC.Internal.Bignum.Integer.IN x -> jump $j1
+            } } in
+      case ds of {
+        GHC.Internal.Bignum.Integer.IS x ->
+          case x of {
+            __DEFAULT -> jump $j;
+            0# -> F11.fib3
+          };
+        GHC.Internal.Bignum.Integer.IP x -> jump $j;
+        GHC.Internal.Bignum.Integer.IN x -> jump $j
+      }
+end Rec }
+
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.f3 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F11.f3 = GHC.Num.Integer.IS 1000#
+F11.f3 = GHC.Internal.Bignum.Integer.IS 1000#
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.f11_x :: Integer
@@ -100,7 +100,7 @@ F11.f11_x = F11.f11_fib F11.f3
 -- RHS size: {terms: 4, types: 1, coercions: 0, joins: 0/0}
 F11.f11f1 :: Integer -> Integer
 [GblId, Arity=1, Str=<SL>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 30 0}]
-F11.f11f1 = \ (y :: Integer) -> GHC.Num.Integer.integerAdd F11.f11_x y
+F11.f11f1 = \ (y :: Integer) -> GHC.Internal.Bignum.Integer.integerAdd F11.f11_x y
 
 -- RHS size: {terms: 3, types: 2, coercions: 0, joins: 0/0}
 f11f :: forall {p}. p -> Integer -> Integer
@@ -110,22 +110,22 @@ f11f = \ (@p) _ [Occ=Dead] -> F11.f11f1
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.f5 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F11.f5 = GHC.Num.Integer.IS 6#
+F11.f5 = GHC.Internal.Bignum.Integer.IS 6#
 
 -- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
 F11.f4 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False, WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 30 0}]
-F11.f4 = GHC.Num.Integer.integerAdd F11.f11_x F11.f5
+F11.f4 = GHC.Internal.Bignum.Integer.integerAdd F11.f11_x F11.f5
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F11.f2 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F11.f2 = GHC.Num.Integer.IS 8#
+F11.f2 = GHC.Internal.Bignum.Integer.IS 8#
 
 -- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
 F11.f1 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False, WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 30 0}]
-F11.f1 = GHC.Num.Integer.integerAdd F11.f11_x F11.f2
+F11.f1 = GHC.Internal.Bignum.Integer.integerAdd F11.f11_x F11.f2
 
 -- RHS size: {terms: 3, types: 2, coercions: 0, joins: 0/0}
 f11 :: (Integer, Integer)
@@ -133,7 +133,4 @@ f11 :: (Integer, Integer)
 f11 = (F11.f4, F11.f1)
 
 
------- Local rules for imported ids --------
-"SPEC fib @Integer @Integer" forall ($dEq :: Eq Integer) ($dNum :: Num Integer) ($dNum1 :: Num Integer). fib @Integer @Integer $dEq $dNum $dNum1 = F11.f11_fib
-
 


=====================================
testsuite/tests/arityanal/should_compile/Arity14.stderr
=====================================
@@ -3,18 +3,18 @@
 Result size of Tidy Core = {terms: 44, types: 38, coercions: 0, joins: 0/3}
 
 -- RHS size: {terms: 3, types: 2, coercions: 0, joins: 0/0}
-F14.f1 :: forall {t}. t -> t
+F14.f1 :: forall t. t -> t
 [GblId, Arity=1, Str=<1L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=True)}]
 F14.f1 = \ (@t) (y :: t) -> y
 
 -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
 F14.f2 :: Integer
 [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 10}]
-F14.f2 = GHC.Num.Integer.IS 1#
+F14.f2 = GHC.Internal.Bignum.Integer.IS 1#
 
 -- RHS size: {terms: 36, types: 23, coercions: 0, joins: 0/3}
 f14 :: forall {t}. (Ord t, Num t) => t -> t -> t -> t
-[GblId, Arity=4, Str=<L><L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [30 90 0 0] 310 0}]
+[GblId, Arity=4, Str=<L><L>, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=IF_ARGS [90 270 0 0] 310 0}]
 f14
   = \ (@t) ($dOrd :: Ord t) ($dNum :: Num t) (eta :: t) (eta1 :: t) ->
       let {
@@ -25,7 +25,7 @@ f14
         f3 [Occ=LoopBreaker, Dmd=SC(S,C(1,L))] :: t -> t -> t -> t
         [LclId, Arity=2, Str=<L><L>, Unf=OtherCon []]
         f3
-          = \ (n :: t) (x :: t) ->
+          = \ (n :: t) (x [OS=OneShot] :: t) ->
               case < @t $dOrd x n of {
                 False -> F14.f1 @t;
                 True ->


=====================================
testsuite/tests/perf/compiler/T13960.hs
=====================================
@@ -0,0 +1,72 @@
+{-# LANGUAGE OverloadedStrings #-}
+
+-- GHC used to run out of simplifier ticks due to inlining the internals of
+-- `toStrict . toLazyByteString`.
+module T13960 (breaks) where
+
+import Data.ByteString (ByteString)
+import Data.ByteString.Builder (Builder, stringUtf8, toLazyByteString)
+import Data.ByteString.Lazy (toStrict)
+import Data.String (IsString(..))
+
+newtype Query = Query ByteString
+
+toByteString :: Builder -> ByteString
+toByteString x = toStrict (toLazyByteString x)
+
+instance IsString Query where
+  fromString = Query . toByteString . stringUtf8
+
+breaks :: [(Query, Query)]
+breaks =
+  [ ("query001a", "query001b")
+  , ("query002a", "query002b")
+  , ("query003a", "query003b")
+  , ("query004a", "query004b")
+  , ("query005a", "query005b")
+  , ("query006a", "query006b")
+  , ("query007a", "query007b")
+  , ("query008a", "query008b")
+  , ("query009a", "query009b")
+  , ("query010a", "query010b")
+  , ("query011a", "query011b")
+  , ("query012a", "query012b")
+  , ("query013a", "query013b")
+  , ("query014a", "query014b")
+  , ("query015a", "query015b")
+  , ("query016a", "query016b")
+  , ("query017a", "query017b")
+  , ("query018a", "query018b")
+  , ("query019a", "query019b")
+  , ("query020a", "query020b")
+  , ("query021a", "query021b")
+  , ("query022a", "query022b")
+  , ("query023a", "query023b")
+  , ("query024a", "query024b")
+  , ("query025a", "query025b")
+  , ("query026a", "query026b")
+  , ("query027a", "query027b")
+  , ("query028a", "query028b")
+  , ("query029a", "query029b")
+  , ("query030a", "query030b")
+  , ("query031a", "query031b")
+  , ("query032a", "query032b")
+  , ("query033a", "query033b")
+  , ("query034a", "query034b")
+  , ("query035a", "query035b")
+  , ("query036a", "query036b")
+  , ("query037a", "query037b")
+  , ("query038a", "query038b")
+  , ("query039a", "query039b")
+  , ("query040a", "query040b")
+  , ("query041a", "query041b")
+  , ("query042a", "query042b")
+  , ("query043a", "query043b")
+  , ("query044a", "query044b")
+  , ("query045a", "query045b")
+  , ("query046a", "query046b")
+  , ("query047a", "query047b")
+  , ("query048a", "query048b")
+  , ("query049a", "query049b")
+  , ("query050a", "query050b")
+  ]


=====================================
testsuite/tests/perf/compiler/all.T
=====================================
@@ -686,6 +686,12 @@ test ('T13820',
       ],
       compile,
       ['-v0'])
+test ('T13960',
+      [ collect_compiler_stats('peak_megabytes_allocated', 20),
+        collect_compiler_stats('bytes allocated', 2),
+      ],
+      compile,
+      ['-O'])
 test ('T14766',
       [ collect_compiler_stats('bytes allocated',2),
         pre_cmd('python3 genT14766.py > T14766.hs'),


=====================================
testsuite/tests/simplCore/should_compile/T15205.stderr
=====================================
@@ -10,7 +10,7 @@ f :: forall a b. C a b => a -> b
  Str=<1P(A,1C(1,C(1,L)))><L>,
  Unf=Unf{Src=<vanilla>, TopLvl=True,
          Value=True, ConLike=True, WorkFree=True, Expandable=True,
-         Guidance=IF_ARGS [30 0] 40 0}]
+         Guidance=IF_ARGS [90 0] 40 0}]
 f = \ (@a) (@b) ($dC :: C a b) (x :: a) -> op @a @b $dC x x
 
 


=====================================
testsuite/tests/wasm/should_run/control-flow/LoadCmmGroup.hs
=====================================
@@ -91,12 +91,17 @@ stgify :: ModSummary -> ModGuts -> Ghc [StgTopBinding]
 stgify summary guts = do
     hsc_env <- getSession
     let dflags = hsc_dflags hsc_env
-    prepd_binds <- liftIO $ do
+    liftIO $ do
       cp_cfg <- initCorePrepConfig hsc_env
-      corePrepPgm (hsc_logger hsc_env) cp_cfg (initCorePrepPgmConfig dflags (interactiveInScope $ hsc_IC hsc_env)) this_mod core_binds
-    return $ fstOf3 $ coreToStg (initCoreToStgOpts dflags) (ms_mod summary) (ms_location summary) prepd_binds
-  where this_mod = mg_module guts
-        core_binds = mg_binds guts
+      prepd_binds <- corePrepPgm (hsc_logger hsc_env) cp_cfg
+                       (initCorePrepPgmConfig dflags (interactiveInScope $ hsc_IC hsc_env))
+                       this_mod core_binds
+      (binds, _, _) <- coreToStg (initCoreToStgOpts dflags) (ms_mod summary)
+                                 (ms_location summary) prepd_binds
+      return binds
+  where
+    this_mod = mg_module guts
+    core_binds = mg_binds guts
 
 slurpCmm :: HscEnv -> FilePath -> IO (CmmGroup)
 slurpCmm hsc_env filename = runHsc hsc_env $ do



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d4a4055f8dc1b5b4a2acc4260cdd0e6...

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/d4a4055f8dc1b5b4a2acc4260cdd0e6...
You're receiving this email because of your account on gitlab.haskell.org.

    

Marge Bot (＠marge-bot)

tags

participants (1)