[Git][ghc/ghc][wip/T23109a] 3 commits: Comments only

18 Apr 2025

Simon Peyton Jones pushed to branch wip/T23109a at Glasgow Haskell Compiler / GHC


Commits:
b10529e9 by Simon Peyton Jones at 2025-04-18T15:27:53+01:00
Comments only

- - - - -
e805aa2e by Simon Peyton Jones at 2025-04-18T15:35:54+01:00
Refator GHC.Core.Opt.SetLevels.notWorthFloating

I refactored `notWorthFloating` while I was doing something else.

I don't think there's a change in behaviour, but if so it's very much
a corner case.

- - - - -
d6c13d5d by Simon Peyton Jones at 2025-04-18T17:16:13+01:00
Always float bottoming expressions to the top

...regardless of floatConsts

- - - - -


2 changed files:

- compiler/GHC/Core/Opt/SetLevels.hs
- compiler/GHC/Core/Opt/Simplify/Iteration.hs


Changes:

=====================================
compiler/GHC/Core/Opt/SetLevels.hs
=====================================
@@ -482,14 +482,14 @@ Consider this:
   f :: T Int -> blah
   f x vs = case x of { MkT y ->
              let f vs = ...(case y of I# w -> e)...f..
-             in f vs
+             in f vs }
 
 Here we can float the (case y ...) out, because y is sure
 to be evaluated, to give
   f x vs = case x of { MkT y ->
-           case y of I# w ->
+           case y of { I# w ->
              let f vs = ...(e)...f..
-             in f vs
+             in f vs }}
 
 That saves unboxing it every time round the loop.  It's important in
 some DPH stuff where we really want to avoid that repeated unboxing in
@@ -614,7 +614,8 @@ lvlMFE env strict_ctxt e@(_, AnnCase {})
   = lvlExpr env e     -- See Note [Case MFEs]
 
 lvlMFE env strict_ctxt ann_expr
-  | not float_me
+  |  notWorthFloating expr abs_vars
+  || not float_me
   || floatTopLvlOnly env && not (isTopLvl dest_lvl)
          -- Only floating to the top level is allowed.
   || hasFreeJoin env fvs   -- If there is a free join, don't float
@@ -623,9 +624,6 @@ lvlMFE env strict_ctxt ann_expr
          -- We can't let-bind an expression if we don't know
          -- how it will be represented at runtime.
          -- See Note [Representation polymorphism invariants] in GHC.Core
-  || notWorthFloating expr abs_vars
-         -- Test notWorhtFloating last;
-         -- See Note [Large free-variable sets]
   = -- Don't float it out
     lvlExpr env ann_expr
 
@@ -676,12 +674,11 @@ lvlMFE env strict_ctxt ann_expr
     is_function  = isFunction ann_expr
     mb_bot_str   = exprBotStrictness_maybe expr
                            -- See Note [Bottoming floats]
-                           -- esp Bottoming floats (2)
+                           -- esp Bottoming floats (BF2)
     expr_ok_for_spec = exprOkForSpeculation expr
     abs_vars = abstractVars dest_lvl env fvs
     dest_lvl = destLevel env fvs fvs_ty is_function is_bot_lam
-               -- NB: is_bot_lam not is_bot; see (3) in
-               --     Note [Bottoming floats]
+               -- NB: is_bot_lam not is_bot; see (BF2) in Note [Bottoming floats]
 
     -- float_is_new_lam: the floated thing will be a new value lambda
     -- replacing, say (g (x+4)) by (lvl x).  No work is saved, nor is
@@ -698,20 +695,22 @@ lvlMFE env strict_ctxt ann_expr
 
         -- A decision to float entails let-binding this thing, and we only do
         -- that if we'll escape a value lambda, or will go to the top level.
+        -- Never float trivial expressions;
+        --   notably, save_work might be true of a lone evaluated variable.
     float_me = saves_work || saves_alloc || is_mk_static
 
     -- See Note [Saving work]
+    is_hnf = exprIsHNF expr
     saves_work = escapes_value_lam        -- (a)
-                 && not (exprIsHNF expr)  -- (b)
+                 && not is_hnf            -- (b)
                  && not float_is_new_lam  -- (c)
     escapes_value_lam = dest_lvl `ltMajLvl` (le_ctxt_lvl env)
 
-    -- See Note [Saving allocation] and Note [Floating to the top]
-    saves_alloc =  isTopLvl dest_lvl
-                && floatConsts env
-                && (   not strict_ctxt                     -- (a)
-                    || exprIsHNF expr                      -- (b)
-                    || (is_bot_lam && escapes_value_lam))  -- (c)
+    -- See Note [Floating to the top]
+    saves_alloc = isTopLvl dest_lvl
+               && (  (floatConsts env &&
+                       (not strict_ctxt || is_hnf))     -- (FT1) and (FT2)
+                  || (is_bot_lam && escapes_value_lam)) -- (FT3)
 
 hasFreeJoin :: LevelEnv -> DVarSet -> Bool
 -- Has a free join point which is not being floated to top level.
@@ -726,7 +725,7 @@ hasFreeJoin env fvs
 The key idea in let-floating is to
   * float a redex out of a (value) lambda
 Doing so can save an unbounded amount of work.
-But see also Note [Saving allocation].
+But see also Note [Floating to the top].
 
 So we definitely float an expression out if
 (a) It will escape a value lambda (escapes_value_lam)
@@ -771,10 +770,12 @@ Wrinkles:
       we have saved nothing: one pair will still be allocated for each
       call of `f`.  Hence the (not float_is_new_lam) in saves_work.
 
-Note [Saving allocation]
-~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Floating to the top]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 Even if `saves_work` is false, we we may want to float even cheap/HNF
-expressions out of value lambdas, for several reasons:
+expressions out of value lambdas. Data suggests, however, that it is better
+/only/ to do so, /if/ they can go to top level. If the expression goes to top
+level we don't pay the cost of allocating cold-path thunks described in (SW2).
 
 * Doing so may save allocation. Consider
         f = \x.  .. (\y.e) ...
@@ -782,6 +783,11 @@ expressions out of value lambdas, for several reasons:
   (assuming e does not mention x). An example where this really makes a
   difference is simplrun009.
 
+* In principle this would be true even if the (\y.e) didn't go to top level; but
+  in practice we only float a HNF if it goes all way to the top.  We don't pay
+  /any/ allocation cost for a top-level floated expression; it just becomes
+  static data.
+
 * It may allow SpecContr to fire on functions. Consider
         f = \x. ....(f (\y.e))....
   After floating we get
@@ -793,21 +799,7 @@ expressions out of value lambdas, for several reasons:
   a big difference for string literals and bottoming expressions: see Note
   [Floating to the top]
 
-Data suggests, however, that it is better /only/ to float HNFS, /if/ they can go
-to top level. See (SW2) of Note [Saving work].  If the expression goes to top
-level we don't pay the cost of allocating cold-path thunks described in (SW2).
-
-Hence `isTopLvl dest_lvl` in `saves_alloc`.
-
-Note [Floating to the top]
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-Even though Note [Saving allocation] suggests that we should not, in
-general, float HNFs, the balance change if it goes to the top:
-
-* We don't pay an allocation cost for the floated expression; it
-  just becomes static data.
-
-* Floating string literal is valuable -- no point in duplicating the
+* Floating string literals is valuable -- no point in duplicating the
   at each call site!
 
 * Floating bottoming expressions is valuable: they are always cold
@@ -815,32 +807,32 @@ general, float HNFs, the balance change if it goes to the top:
   can be quite big, inhibiting inlining. See Note [Bottoming floats]
 
 So we float an expression to the top if:
-  (a) the context is lazy (so we get allocation), or
-  (b) the expression is a HNF (so we get allocation), or
-  (c) the expression is bottoming and floating would escape a
-      value lambda (NB: if the expression itself is a lambda, (b)
-      will apply; so this case only catches bottoming thunks)
+  (FT1) the context is lazy (so we get allocation), or
+  (FT2) the expression is a HNF (so we get allocation), or
+  (FT3) the expression is bottoming and floating would escape a
+         value lambda (NB: if the expression itself is a lambda, (b)
+         will apply; so this case only catches bottoming thunks)
 
 Examples:
 
-* (a) Strict.  Case scrutinee
+* (FT1) Strict.  Case scrutinee
       f = case g True of ....
   Don't float (g True) to top level; then we have the admin of a
   top-level thunk to worry about, with zero gain.
 
-* (a) Strict.  Case alternative
+* (FT1) Strict.  Case alternative
       h = case y of
              True  -> g True
              False -> False
   Don't float (g True) to the top level
 
-* (b) HNF
+* (FT2) HNF
       f = case y of
             True  -> p:q
             False -> blah
   We may as well float the (p:q) so it becomes a static data structure.
 
-* (c) Bottoming expressions; see also Note [Bottoming floats]
+* (FT3) Bottoming expressions; see also Note [Bottoming floats]
       f x = case x of
               0 -> error <big thing>
               _ -> x+1
@@ -853,7 +845,7 @@ Examples:
   'foo' anyway.  So float bottoming things only if they escape
   a lambda.
 
-* Arguments
+* (FT4) Arguments
      t = f (g True)
   Prior to Apr 22 we didn't float (g True) to the top if f was strict.
   But (a) this only affected CAFs, because if it escapes a value lambda
@@ -868,28 +860,6 @@ early loses opportunities for RULES which (needless to say) are
 important in some nofib programs (gcd is an example).  [SPJ note:
 I think this is obsolete; the flag seems always on.]
 
-Note [Large free-variable sets]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In #24471 we had something like
-     x1 = I# 1
-     ...
-     x1000 = I# 1000
-     foo = f x1 (f x2 (f x3 ....))
-So every sub-expression in `foo` has lots and lots of free variables.  But
-none of these sub-expressions float anywhere; the entire float-out pass is a
-no-op.
-
-In lvlMFE, we want to find out quickly if the MFE is not-floatable; that is
-the common case.  In #24471 it turned out that we were testing `abs_vars` (a
-relatively complicated calculation that takes at least O(n-free-vars) time to
-compute) for every sub-expression.
-
-Better instead to test `float_me` early. That still involves looking at
-dest_lvl, which means looking at every free variable, but the constant factor
-is a lot better.
-
-ToDo: find a way to fix the bad asymptotic complexity.
-
 Note [Floating join point bindings]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Mostly we don't float join points at all -- we want them to /stay/ join points.
@@ -1053,30 +1023,36 @@ we'd like to float the call to error, to get
 
 But, as ever, we need to be careful:
 
-(1) We want to float a bottoming
+(BF1) We want to float a bottoming
     expression even if it has free variables:
         f = \x. g (let v = h x in error ("urk" ++ v))
     Then we'd like to abstract over 'x', and float the whole arg of g:
         lvl = \x. let v = h x in error ("urk" ++ v)
         f = \x. g (lvl x)
-    To achieve this we pass is_bot to destLevel
-
-(2) We do not do this for lambdas that return
-    bottom.  Instead we treat the /body/ of such a function specially,
-    via point (1).  For example:
+    To achieve this we pass `is_bot` to destLevel
+
+(BF2) We do the same for /lambdas/ that return bottom.
+    Suppose the original lambda had /no/ free vars:
+        f = \x. ....(\y z. error (y++z))...
+    then we'd like to float that whole lambda
+        lvl = \y z. error (y++z)
+        f = \x. ....lvl....
+    If we just floated its bottom-valued body, we might abstract the arguments in
+    the "wrong" order and end up with this bad result
+        lvl = \z y. error (y++z)
+        f = \x. ....(\y z. lvl z y)....
+
+    If the lambda does have free vars, this will happen:
         f = \x. ....(\y z. if x then error y else error z)....
-    If we float the whole lambda thus
+    We float the whole lambda thus
         lvl = \x. \y z. if x then error y else error z
         f = \x. ...(lvl x)...
-    we may well end up eta-expanding that PAP to
+    And we may well end up eta-expanding that PAP to
+        lvl = \x. \y z. if b then error y else error z
         f = \x. ...(\y z. lvl x y z)...
+    so we get a (small) closure.  So be it.
 
-    ===>
-        lvl = \x z y. if b then error y else error z
-        f = \x. ...(\y z. lvl x z y)...
-    (There is no guarantee that we'll choose the perfect argument order.)
-
-(3) If we have a /binding/ that returns bottom, we want to float it to top
+(BF3) If we have a /binding/ that returns bottom, we want to float it to top
     level, even if it has free vars (point (1)), and even it has lambdas.
     Example:
        ... let { v = \y. error (show x ++ show y) } in ...
@@ -1092,7 +1068,6 @@ But, as ever, we need to be careful:
     join points (#24768), and floating to the top would abstract over those join
     points, which we should never do.
 
-
 See Maessen's paper 1999 "Bottom extraction: factoring error handling out
 of functional programs" (unpublished I think).
 
@@ -1135,7 +1110,6 @@ float the case (as advocated here) we won't float the (build ...y..)
 either, so fusion will happen.  It can be a big effect, esp in some
 artificial benchmarks (e.g. integer, queens), but there is no perfect
 answer.
-
 -}
 
 annotateBotStr :: Id -> Arity -> Maybe (Arity, DmdSig, CprSig) -> Id
@@ -1152,69 +1126,124 @@ annotateBotStr id n_extra mb_bot_str
   = id
 
 notWorthFloating :: CoreExpr -> [Var] -> Bool

--- Returns True if the expression would be replaced by
--- something bigger than it is now.  For example:
---   abs_vars = tvars only:  return True if e is trivial,
---                           but False for anything bigger
---   abs_vars = [x] (an Id): return True for trivial, or an application (f x)
---                           but False for (f x x)
---
--- One big goal is that floating should be idempotent.  Eg if
--- we replace e with (lvl79 x y) and then run FloatOut again, don't want
--- to replace (lvl79 x y) with (lvl83 x y)!
-
+--  See Note [notWorthFloating]
 notWorthFloating e abs_vars
-  = go e (count isId abs_vars)
+  = go e 0
   where
-    go (Var {}) n               = n >= 0
-    go (Lit lit) n              = assert (n==0) $
-                                  litIsTrivial lit   -- Note [Floating literals]
-    go (Type {}) _              = True
-    go (Coercion {}) _          = True
+    n_abs_vars = count isId abs_vars  -- See (NWF5)
+
+    go :: CoreExpr -> Int -> Bool
+    -- (go e n) return True if (e x1 .. xn) is not worth floating
+    -- where `e` has n trivial value arguments x1..xn
+    -- See (NWF4)
+    go (Lit lit) n         = assert (n==0) $
+                             litIsTrivial lit   -- See (NWF1)
+    go (Type {}) _         = True
+    go (Tick t e) n        = not (tickishIsCode t) && go e n
+    go (Cast e _) n        = n==0 || go e n     -- See (NWF3)
+    go (Coercion {}) _     = True
     go (App e arg) n
-       -- See Note [Floating applications to coercions]
-       | not (isRuntimeArg arg) = go e n
-       | n==0                   = False
-       | exprIsTrivial arg      = go e (n-1) -- NB: exprIsTrivial arg = go arg 0
-       | otherwise              = False
-    go (Tick t e) n             = not (tickishIsCode t) && go e n
-    go (Cast e _)  n            = go e n
-    go (Case e b _ as) n
+       | Type {} <- arg    = go e n    -- Just types, not coercions (NWF2)
+       | exprIsTrivial arg = go e (n+1)
+       | otherwise         = False  -- (f non-triv) is worth floating
+
+    go (Case e b _ as) _
+      -- Do not float the `case` part of trivial cases (NWF3)
+      -- We'll have a look at the RHS when we get there
       | null as
-      = go e n     -- See Note [Empty case is trivial]
-      | Just rhs <- isUnsafeEqualityCase e b as
-      = go rhs n   -- See (U2) of Note [Implementing unsafeCoerce] in base:Unsafe.Coerce
-    go _ _                      = False
+      = True   -- See Note [Empty case is trivial]
+      | Just {} <- isUnsafeEqualityCase e b as
+      = True   -- See (U2) of Note [Implementing unsafeCoerce] in base:Unsafe.Coerce
+      | otherwise
+      = False
 
-{-
-Note [Floating literals]
-~~~~~~~~~~~~~~~~~~~~~~~~
-It's important to float Integer literals, so that they get shared,
-rather than being allocated every time round the loop.
-Hence the litIsTrivial.
+    go (Var _) n
+      | n==0             = True   -- Naked variable
+      | n <= n_abs_vars  = True   -- (f a b c) is not worth floating if
+      | otherwise        = False  -- a,b,c are all abstracted; see (NWF5)
 
-Ditto literal strings (LitString), which we'd like to float to top
-level, which is now possible.
+    go _ _ = False  -- Let etc is worth floating
 
-Note [Floating applications to coercions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We don’t float out variables applied only to type arguments, since the
-extra binding would be pointless: type arguments are completely erased.
-But *coercion* arguments aren’t (see Note [Coercion tokens] in
-"GHC.CoreToStg" and Note [inlineBoringOk] in"GHC.Core.Unfold"),
-so we still want to float out variables applied only to
-coercion arguments.
+{- Note [notWorthFloating]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+`notWorthFloating` returns True if the expression would be replaced by something
+bigger than it is now.  One big goal is that floating should be idempotent.  Eg
+if we replace e with (lvl79 x y) and then run FloatOut again, don't want to
+replace (lvl79 x y) with (lvl83 x y)!
 
+For example:
+  abs_vars = tvars only:  return True if e is trivial,
+                          but False for anything bigger
+  abs_vars = [x] (an Id): return True for trivial, or an application (f x)
+                          but False for (f x x)
+
+(NWF1) It's important to float Integer literals, so that they get shared, rather
+  than being allocated every time round the loop.  Hence the litIsTrivial.
+
+  Ditto literal strings (LitString), which we'd like to float to top
+  level, which is now possible.
+
+(NWF2) We don’t float out variables applied only to type arguments, since the
+  extra binding would be pointless: type arguments are completely erased.
+  But *coercion* arguments aren’t (see Note [Coercion tokens] in
+  "GHC.CoreToStg" and Note [inlineBoringOk] in"GHC.Core.Unfold"),
+  so we still want to float out variables applied only to
+  coercion arguments.
+
+(NWF3) Some expressions have trivial wrappers:
+     - Casts (e |> co)
+     - Unary-class applications:
+          - Dictionary applications (MkC meth)
+          - Class-op applictions    (op dict)
+     - Case of empty alts
+     - Unsafe-equality case
+  In all these cases we say "not worth floating", and we do so /regardless/
+  of the wrapped expression.  The SetLevels stuff may subsequently float the
+  components of the expression.
+
+  Example:  is it worth floating (f x |> co)?  No!  If we did we'd get
+     lvl = f x |> co
+     ...lvl....
+  Then we'd do cast worker/wrapper and end up with.
+     lvl' = f x
+     ...(lvl' |> co)...
+  Silly!  Better not to float it in the first place.  If we say "no" here,
+  we'll subsequently say "yes" for (f x) and get
+     lvl = f x
+     ....(lvl |> co)...
+  which is what we want.  In short: don't float trivial wrappers.
+
+(NWF4) The only non-trivial expression that we say "not worth floating" for
+  is an application
+             f x y z
+  where the number of value arguments is <= the number of abstracted Ids.
+  This is what makes floating idempotent.  Hence counting the number of
+  value arguments in `go`
+
+(NWF5) In #24471 we had something like
+     x1 = I# 1
+     ...
+     x1000 = I# 1000
+     foo = f x1 (f x2 (f x3 ....))
+  So every sub-expression in `foo` has lots and lots of free variables.  But
+  none of these sub-expressions float anywhere; the entire float-out pass is a
+  no-op.
 
-************************************************************************
-*                                                                      *
-\subsection{Bindings}
-*                                                                      *
-************************************************************************
+  So `notWorthFloating` tries to avoid evaluating `n_abs_vars`, in cases where
+  it obviously /is/ worth floating.  (In #24471 it turned out that we were
+  testing `abs_vars` (a relatively complicated calculation that takes at least
+  O(n-free-vars) time to compute) for every sub-expression.)
 
-The binding stuff works for top level too.
+  Hence testing `n_abs_vars only` at the very end.
 -}
 
+{- *********************************************************************
+*                                                                      *
+                       Bindings
+        This binding stuff works for top level too.
+*                                                                      *
+********************************************************************* -}
+
 lvlBind :: LevelEnv
         -> CoreBindWithFVs
         -> LvlM (LevelledBind, LevelEnv)
@@ -1261,7 +1290,7 @@ lvlBind env (AnnNonRec bndr rhs)
         -- is_bot_lam: looks like (\xy. bot), maybe zero lams
         -- NB: not isBottomThunk!
         -- NB: not is_join: don't send bottoming join points to the top.
-        -- See Note [Bottoming floats] point (3)
+        -- See Note [Bottoming floats] (BF3)
 
     is_top_bindable = exprIsTopLevelBindable deann_rhs bndr_ty
     n_extra       = count isId abs_vars
@@ -1552,9 +1581,8 @@ destLevel env fvs fvs_ty is_function is_bot
                               -- See Note [Floating join point bindings]
   = tOP_LEVEL
 
-  | is_bot              -- Send bottoming bindings to the top
-  = as_far_as_poss      -- regardless; see Note [Bottoming floats]
-                        -- Esp Bottoming floats (1) and (3)
+  | is_bot              -- Send bottoming bindings to the top regardless;
+  = as_far_as_poss      -- see (BF1) and (BF2) in Note [Bottoming floats]
 
   | Just n_args <- floatLams env
   , n_args > 0  -- n=0 case handled uniformly by the 'otherwise' case
@@ -1568,8 +1596,13 @@ destLevel env fvs fvs_ty is_function is_bot
     max_fv_id_level = maxFvLevel isId env fvs -- Max over Ids only; the
                                               -- tyvars will be abstracted
 
+    -- as_far_as_poss: destination level depends only on the free Ids (more
+    -- precisely, free CoVars) of the /type/, not the free Ids of the /term/.
+    -- Why worry about the free CoVars?  See Note [Floating and kind casts]
+    --
+    -- There may be free Ids in the term, but then we'll just
+    -- lambda-abstract over them
     as_far_as_poss = maxFvLevel' isId env fvs_ty
-                     -- See Note [Floating and kind casts]
 
 {- Note [Floating and kind casts]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1732,10 +1765,9 @@ maxFvLevel max_me env var_set
     -- It's OK to use a non-deterministic fold here because maxIn commutes.
 
 maxFvLevel' :: (Var -> Bool) -> LevelEnv -> TyCoVarSet -> Level
--- Same but for TyCoVarSet
+-- Precisely the same as `maxFvLevel` but for TyCoVarSet rather than DVarSet
 maxFvLevel' max_me env var_set
   = nonDetStrictFoldUniqSet (maxIn max_me env) tOP_LEVEL var_set
-    -- It's OK to use a non-deterministic fold here because maxIn commutes.
 
 maxIn :: (Var -> Bool) -> LevelEnv -> InVar -> Level -> Level
 maxIn max_me (LE { le_lvl_env = lvl_env, le_env = id_env }) in_var lvl


=====================================
compiler/GHC/Core/Opt/Simplify/Iteration.hs
=====================================
@@ -801,9 +801,9 @@ makeTrivial env top_lvl dmd occ_fs expr
   = return (emptyLetFloats, expr)
 
   | not (bindingOk top_lvl expr expr_ty)  -- Cannot trivialise
-  = return (emptyLetFloats, expr)         --   See Note [Cannot trivialise]
+  = return (emptyLetFloats, expr)         -- See Note [Cannot trivialise]
 
-  | otherwise -- 'expr' is not of form (Cast e co)
+  | otherwise
   = do  { (floats, expr1) <- prepareRhs env top_lvl occ_fs expr
         ; uniq <- getUniqueM
         ; let name = mkSystemVarName uniq occ_fs



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bc27b6c9b536a8200cd2b8750e4744f...

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/bc27b6c9b536a8200cd2b8750e4744f...
You're receiving this email because of your account on gitlab.haskell.org.

    

[Git][ghc/ghc][wip/T23109a] 3 commits: Comments only

Simon Peyton Jones (＠simonpj)