GitLab

Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC

Commits:

48a3ed57

by Simon Peyton Jones at 2025-11-25T15:33:54+00:00

Add a fast-path for args=[] to occAnalApp

In the common case of having not arguments, occAnalApp
was doing redundant work.

951e5ed9

by Simon Peyton Jones at 2025-11-25T15:33:54+00:00

Fix a performance hole in the occurrence analyser

As #26425 showed, the clever stuff in

   Note [Occurrence analysis for join points]

does a lot of duplication of usage details.  This patch
improved matters with a little fancy footwork.  It is
described in the new (W4) of the same Note.

Compile-time allocations go down slightly. Here are the changes
of +/- 0.5% or more:

    T13253(normal)    329,369,244    326,395,544  -0.9%
T13253-spj(normal)     66,410,496     66,095,864  -0.5%
    T15630(normal)    129,797,200    128,663,136  -0.9%
   T15630a(normal)    129,212,408    128,027,560  -0.9%
    T16577(normal)  6,756,706,896  6,723,028,512  -0.5%
    T18282(normal)    128,462,070    125,808,584  -2.1% GOOD
   T18698a(normal)    208,418,305    202,037,336  -3.1% GOOD
    T18730(optasm)    136,981,756    136,208,136  -0.6%
    T18923(normal)     58,103,088     57,745,840  -0.6%
    T19695(normal)  1,386,306,272  1,365,609,416  -1.5%
    T26425(normal)  3,344,402,957  2,457,811,664 -26.5% GOOD
     T6048(optasm)     79,763,816     79,212,760  -0.7%
     T9020(optasm)    225,278,408    223,682,440  -0.7%
     T9961(normal)    303,810,717    300,729,168  -1.0% GOOD

         geo. mean                                -0.5%
         minimum                                 -26.5%
         maximum                                  +0.4%

Metric Decrease:
    T18282
    T18698a
    T26425
    T9961

f1959dfc

by Simon Peyton Jones at 2025-11-26T11:58:07+00:00

Remove a quadratic-cost assertion check in mkCoreApp

See the new Note [Assertion checking in mkCoreApp]

10 changed files:

compiler/GHC/Core/Make.hs
compiler/GHC/Core/Opt/OccurAnal.hs
compiler/GHC/HsToCore/Binds.hs
compiler/GHC/HsToCore/Expr.hs
compiler/GHC/HsToCore/Match.hs
compiler/GHC/Types/Basic.hs
compiler/GHC/Types/Unique/FM.hs
compiler/GHC/Types/Unique/Set.hs
compiler/GHC/Types/Var/Env.hs
testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs

Changes:

compiler/GHC/Core/Make.hs

@@ -151,37 +151,28 @@ mkCoreConWrapApps con args = mkCoreApps (Var (dataConWrapId con)) args
  -- | Construct an expression which represents the application of a number of
  -- expressions to another. The leftmost expression in the list is applied first
 -mkCoreApps :: CoreExpr -- ^ function
 +-- See Note [Assertion checking in mkCoreApp]
 +mkCoreApps :: CoreExpr   -- ^ function
             -> [CoreExpr] -- ^ arguments
             -> CoreExpr
 -mkCoreApps fun args
 -  = fst $
 -    foldl' (mkCoreAppTyped doc_string) (fun, fun_ty) args
 -  where
 -    doc_string = ppr fun_ty $$ ppr fun $$ ppr args
 -    fun_ty = exprType fun
 +mkCoreApps fun args = foldl' mkCoreApp fun args
  -- | Construct an expression which represents the application of one expression
  -- to the other
 -mkCoreApp :: SDoc
 -          -> CoreExpr -- ^ function
 +-- See Note [Assertion checking in mkCoreApp]
 +mkCoreApp :: CoreExpr -- ^ function
            -> CoreExpr -- ^ argument
            -> CoreExpr
 -mkCoreApp s fun arg
 -  = fst $ mkCoreAppTyped s (fun, exprType fun) arg
+-
 --- | Construct an expression which represents the application of one expression
 --- paired with its type to an argument. The result is paired with its type. This
 --- function is not exported and used in the definition of 'mkCoreApp' and
 --- 'mkCoreApps'.
 -mkCoreAppTyped :: SDoc -> (CoreExpr, Type) -> CoreExpr -> (CoreExpr, Type)
 -mkCoreAppTyped _ (fun, fun_ty) (Type ty)
 -  = (App fun (Type ty), piResultTy fun_ty ty)
 -mkCoreAppTyped _ (fun, fun_ty) (Coercion co)
 -  = (App fun (Coercion co), funResultTy fun_ty)
 -mkCoreAppTyped d (fun, fun_ty) arg
 -  = assertPpr (isFunTy fun_ty) (ppr fun $$ ppr arg $$ d)
 -    (App fun arg, funResultTy fun_ty)
 +mkCoreApp fun arg = App fun arg
++
 +{- Note [Assertion checking in mkCoreApp]
 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 +At one time we had an assertion to check that the function and argument type match up,
 +but that turned out to take 90% of all compile time (!) when compiling test
 +`unboxedsums/UbxSumUnpackedSize.hs`. The reason was an unboxed sum constructor with
 +hundreds of foralls.   It's most straightforward just to remove the assert, and
 +rely on Lint to discover any mis-constructed terms.
 +-}
  {- *********************************************************************
  *                                                                      *

compiler/GHC/Core/Opt/OccurAnal.hs

@@ -66,7 +66,6 @@ import GHC.Builtin.Names( runRWKey )
  import GHC.Unit.Module( Module )
  import Data.List (mapAccumL)
 -import Data.List.NonEmpty (NonEmpty (..))
  {-
  ************************************************************************
@@ -660,18 +659,35 @@ through A, so it should have ManyOcc.  Bear this case in mind!
  * In occ_env, the new (occ_join_points :: IdEnv OccInfoEnv) maps
    each in-scope non-recursive join point, such as `j` above, to
    a "zeroed form" of its RHS's usage details. The "zeroed form"
 +    * has only occ_nested_lets in its domain  (see (W4) below)
      * deletes ManyOccs
      * maps a OneOcc to OneOcc{ occ_n_br = 0 }
 -  In our example, occ_join_points will be extended with
 +  In our example, assuming `v` is locally-let-bound, occ_join_points will
 +  be extended with
        [j :-> [v :-> OneOcc{occ_n_br=0}]]
 -  See addJoinPoint.
 +  See `addJoinPoint` and (W4) below.
  * At an occurrence of a join point, we do everything as normal, but add in the
    UsageDetails from the occ_join_points.  See mkOneOcc.
 -* Crucially, at the NonRec binding of the join point, in `occAnalBind`, we use
 -  `orUDs`, not `andUDs` to combine the usage from the RHS with the usage from
 -  the body.
 +* Crucially, at the NonRec binding of a join point `j`, in `occAnalBind`,
 +  we use `combineJoinPointUDs`, not `andUDs` to combine the usage from the
 +  RHS with the usage from the body.  `combineJoinPointUDs` behaves like this:
++
 +   * For all variables than `occ_nested_lets`, use `andUDs`, just like for
 +     any normal let-binding.
++
 +   * But for a variable `v` in `occ_nested_lets`, use `orUDs`:
 +     - If `v` occurs `ManyOcc` in the join-point RHS, the variable won't be in
 +       `occ_join_points`; but we'll get `ManyOcc` anyway.
 +     - If `v` occurs `OneOcc` in the join-point RHS, the variable will be in
 +       `occ_join_points` and we'll thereby get a `OneOcc{occ_n_br=0}` from
 +       each of j's tail calls.  We can `or` that with the `OncOcc{occ_n_br=n}`
 +       from j's RHS.
++
 +  The only reason for `occ_nested_lets` is to reduce the size of the info
 +  duplicate at each tail call; see (W4). It would sound to put *all* variables
 +  into `occ_nested_lets`.
  Here are the consequences
@@ -682,13 +698,14 @@ Here are the consequences
    There are two lexical occurrences of `v`!
    (NB: `orUDs` adds occ_n_br together, so occ_n_br=1 is impossible, too.)
 -* In the tricky (P3) we'll get an `andUDs` of
 -    * OneOcc{occ_n_br=0} from the occurrences of `j`)
 +* In the tricky (P3), when analysing `case (f v) of ...`, we'll get
 +  an `andUDs` of
 +    * OneOcc{occ_n_br=0} from the occurrences of `j`
      * OneOcc{occ_n_br=1} from the (f v)
    These are `andUDs` together in `addOccInfo`, and hence
    `v` gets ManyOccs, just as it should.  Clever!
 -There are a couple of tricky wrinkles
 +There are, of course, some tricky wrinkles
  (W1) Consider this example which shadows `j`:
            join j = rhs in
@@ -718,6 +735,8 @@ There are a couple of tricky wrinkles
       * In `postprcess_uds`, we add the chucked-out join points to the
         returned UsageDetails, with `andUDs`.
 +Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3).
++
  (W3) Consider this example, which shadows `j`, but this time in an argument
                join j = rhs
                in f (case x of { K j -> ...; ... })
@@ -732,12 +751,36 @@ There are a couple of tricky wrinkles
       NB: this is just about efficiency: it is always safe /not/ to zap the
       occ_join_points.
 -(W4) What if the join point binding has a stable unfolding, or RULES?
 -     They are just alternative right-hand sides, and at each call site we
 -     will use only one of them. So again, we can use `orUDs` to combine
 -     usage info from all these alternatives RHSs.
+-
 -Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3).
 +(W4) Other things being equal, we want keep the OccInfoEnv stored in
 +  `occ_join_points` as small as possible, because it is /duplicated/ at
 +  /every occurrence/ of the join point.  We really only want to include
 +  OccInfo for
 +       * Local, non-recursive let-bound Ids
 +       * that occur just once in the RHS of the join point
 +  particularly including
 +       * thunks (that's the original point) and
 +       * join points (so that the trick works recursively).
 +  We call these the "tracked Ids of j".
++
 +  Including lambda binders is pointless, and slows down the occurrence analyser.
++
 +  e.g.    \x. let y = x+1 in
 +              join j v = ..x..y..(f z z)..
 +              in ...
 +  In the `occ_join_points` binding for `j`, we want to track `y`, but
 +  not `x` (lambda bound) nor `z` (occurs many times).
++
 +  To exploit this:
 +     * `occ_nested_lets` tracks which Ids are
 +              nested (not-top-level), non-recursive lets
 +     * `addJoinPoint` only populates j's entry with occ-info for the "tracked Ids"
 +       of `j`; that is, that are (a) in occ_nested_lets and (b) have OneOcc.
 +     * `combineJoinPointUDs` uses
 +          orLocalOcc  for local-let Ids
 +          andLocalOcc for non-local-let Ids
++
 +  This fancy footwork can matter in extreme cases: it gave a 25% reduction in
 +  total compiler allocation in #26425..
  Note [Finding join points]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -759,45 +802,45 @@ rest of 'OccInfo' until it goes on the binder.
  Note [Join arity prediction based on joinRhsArity]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 -In general, the join arity from tail occurrences of a join point (O) may be
 -higher or lower than the manifest join arity of the join body (M). E.g.,
 +In general, the join arity from tail occurrences of a join point (OAr) may be
 +higher or lower than the manifest join arity of the join body (MAr). E.g.,
 -  -- M > O:
 -  let f x y = x + y              -- M = 2
 -  in if b then f 1 else f 2      -- O = 1
 +  -- MAr > Oar:
 +  let f x y = x + y              -- MAr = 2
 +  in if b then f 1 else f 2      -- OAr = 1
    ==> { Contify for join arity 1 }
    join f x = \y -> x + y
    in if b then jump f 1 else jump f 2
 -  -- M < O
 -  let f = id                     -- M = 0
 -  in if ... then f 12 else f 13  -- O = 1
 +  -- MAr < Oar
 +  let f = id                     -- MAr = 0
 +  in if ... then f 12 else f 13  -- OAr = 1
    ==> { Contify for join arity 1, eta-expand f }
    join f x = id x
    in if b then jump f 12 else jump f 13
 -But for *recursive* let, it is crucial that both arities match up, consider
 +But for *recursive* let, it is crucial MAr=OAr.  Consider:
    letrec f x y = if ... then f x else True
    in f 42
 -Here, M=2 but O=1. If we settled for a joinrec arity of 1, the recursive jump
 +Here, MAr=2 but OAr=1. If we settled for a joinrec arity of 1, the recursive jump
  would not happen in a tail context! Contification is invalid here.
 -So indeed it is crucial to demand that M=O.
 +So indeed it is crucial to demand that MAr=OAr.
 -(Side note: Actually, we could be more specific: Let O1 be the join arity of
 -occurrences from the letrec RHS and O2 the join arity from the let body. Then
 -we need M=O1 and M<=O2 and could simply eta-expand the RHS to match O2 later.
 -M=O is the specific case where we don't want to eta-expand. Neither the join
 +(Side note: Actually, we could be more specific: Let OAr1 be the join arity of
 +occurrences from the letrec RHS and OAr2 the join arity from the let body. Then
 +we need MAr=OAr1 and MAr<=OAr2 and could simply eta-expand the RHS to match OAr2 later.
 +MAr=OAr is the specific case where we don't want to eta-expand. Neither the join
  points paper nor GHC does this at the moment.)
  We can capitalise on this observation and conclude that *if* f could become a
 -joinrec (without eta-expansion), it will have join arity M.
 -Now, M is just the result of 'joinRhsArity', a rather simple, local analysis.
 +joinrec (without eta-expansion), it will have join arity MAr.
 +Now, MAr is just the result of 'joinRhsArity', a rather simple, local analysis.
  It is also the join arity inside the 'TailUsageDetails' returned by
  'occAnalLamTail', so we can predict join arity without doing any fixed-point
  iteration or really doing any deep traversal of let body or RHS at all.
 -We check for M in the 'adjustTailUsage' call inside 'tagRecBinders'.
 +We check for MAr in the 'adjustTailUsage' call inside 'tagRecBinders'.
  All this is quite apparent if you look at the contification transformation in
  Fig. 5 of "Compiling without Continuations" (which does not account for
@@ -807,14 +850,14 @@ eta-expansion at all, mind you). The letrec case looks like this
      ... and a bunch of conditions establishing that f only occurs
          in app heads of join arity (len as + len xs) inside us and es ...
 -The syntactic form `/\as.\xs. L[us]` forces M=O iff `f` occurs in `us`. However,
 +The syntactic form `/\as.\xs. L[us]` forces MAr=OAr iff `f` occurs in `us`. However,
  for non-recursive functions, this is the definition of contification from the
  paper:
    let f = /\as.\xs.u in L[es]     ... conditions ...
 -Note that u could be a lambda itself, as we have seen. No relationship between M
 -and O to exploit here.
 +Note that u could be a lambda itself, as we have seen. No relationship between MAr
 +and OAr to exploit here.
  Note [Join points and unfoldings/rules]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -954,6 +997,22 @@ of both functions, serving as a specification:
       Cyclic Recursive case:   'tagRecBinders'
       Acyclic Recursive case:  'adjustNonRecRhs'
       Non-recursive case:      'adjustNonRecRhs'
++
 +Note [Unfoldings and RULES]
 +~~~~~~~~~~~~~~~~~~~~~~~~~~~
 +For let-bindings we treat (stable) unfoldings and RULES as "alternative right hand
 +sides".  That is, it's as if we had
 +  f = case <hiatus> of
 +         1 -> <the-rhs>
 +         2 -> <the-stable-unfolding>
 +         3 -> <rhs of rule1>
 +         4 -> <rhs of rule2>
 +So we combine all these with `orUDs` (#26567).  But actually it makes
 +very little difference whether we use `andUDs` or `orUDs` because of
 +Note [Occurrences in stable unfoldings and RULES]: occurrences in an unfolding
 +or RULE are treated as ManyOcc anyway.
++
 +But NB that tail-call info is preserved so that we don't thereby lose join points.
  -}
  ------------------------------------------------------------------
@@ -991,24 +1050,24 @@ occAnalBind !env lvl ire (NonRec bndr rhs) thing_inside combine
    | mb_join@(JoinPoint {}) <- idJoinPointHood bndr
    = -- Analyse the RHS and /then/ the body
      let -- Analyse the rhs first, generating rhs_uds
 -        !(rhs_uds_s, bndr', rhs') = occAnalNonRecRhs env lvl ire mb_join bndr rhs
 -        rhs_uds = foldl1' orUDs rhs_uds_s   -- NB: orUDs.  See (W4) of
 -                                           -- Note [Occurrence analysis for join points]
 +        !(rhs_uds, bndr', rhs') = occAnalNonRecRhs env lvl ire mb_join bndr rhs
          -- Now analyse the body, adding the join point
          -- into the environment with addJoinPoint
 -        !(WUD body_uds (occ, body)) = occAnalNonRecBody env bndr' $ \env ->
 +        env_body = addLocalLet env lvl bndr
 +        !(WUD body_uds (occ, body)) = occAnalNonRecBody env_body bndr' $ \env ->
                                        thing_inside (addJoinPoint env bndr' rhs_uds)
      in
      if isDeadOcc occ     -- Drop dead code; see Note [Dead code]
      then WUD body_uds body
 -    else WUD (rhs_uds `orUDs` body_uds)    -- Note `orUDs`
 +    else WUD (combineJoinPointUDs env rhs_uds body_uds)    -- Note `orUDs`
               (combine [NonRec (fst (tagNonRecBinder lvl occ bndr')) rhs']
                        body)
    -- The normal case, including newly-discovered join points
    -- Analyse the body and /then/ the RHS
 -  | WUD body_uds (occ,body) <- occAnalNonRecBody env bndr thing_inside
 +  | let env_body = addLocalLet env lvl bndr
 +  , WUD body_uds (occ,body) <- occAnalNonRecBody env_body bndr thing_inside
    = if isDeadOcc occ   -- Drop dead code; see Note [Dead code]
      then WUD body_uds body
      else let
@@ -1017,8 +1076,8 @@ occAnalBind !env lvl ire (NonRec bndr rhs) thing_inside combine
          -- => join arity O of Note [Join arity prediction based on joinRhsArity]
          (tagged_bndr, mb_join) = tagNonRecBinder lvl occ bndr
 -        !(rhs_uds_s, final_bndr, rhs') = occAnalNonRecRhs env lvl ire mb_join tagged_bndr rhs
 -    in WUD (foldr andUDs body_uds rhs_uds_s)      -- Note `andUDs`
 +        !(rhs_uds, final_bndr, rhs') = occAnalNonRecRhs env lvl ire mb_join tagged_bndr rhs
 +    in WUD (rhs_uds `andUDs` body_uds)      -- Note `andUDs`
             (combine [NonRec final_bndr rhs'] body)
  -----------------
@@ -1033,15 +1092,21 @@ occAnalNonRecBody env bndr thing_inside
  -----------------
  occAnalNonRecRhs :: OccEnv -> TopLevelFlag -> ImpRuleEdges
 -                -> JoinPointHood -> Id -> CoreExpr
 -                 -> (NonEmpty UsageDetails, Id, CoreExpr)
 +                 -> JoinPointHood -> Id -> CoreExpr
 +                 -> (UsageDetails, Id, CoreExpr)
  occAnalNonRecRhs !env lvl imp_rule_edges mb_join bndr rhs
    | null rules, null imp_rule_infos
    =  -- Fast path for common case of no rules. This is only worth
       -- 0.1% perf on average, but it's also only a line or two of code
 -    ( adj_rhs_uds :| adj_unf_uds : [], final_bndr_no_rules, final_rhs )
 +    ( adj_rhs_uds `orUDs` adj_unf_uds
 +    , final_bndr_no_rules, final_rhs )
++
    | otherwise
 -  = ( adj_rhs_uds :| adj_unf_uds : adj_rule_uds, final_bndr_with_rules, final_rhs )
 +  = ( foldl' orUDs (adj_rhs_uds `orUDs` adj_unf_uds) adj_rule_uds
 +    , final_bndr_with_rules, final_rhs )
++
 +    -- orUDs: Combine the RHS, (stable) unfolding, and RULES with orUDs
 +    --        See Note [Unfoldings and RULES]
    where
      --------- Right hand side ---------
      -- For join points, set occ_encl to OccVanilla, via setTailCtxt.  If we have
@@ -1054,7 +1119,7 @@ occAnalNonRecRhs !env lvl imp_rule_edges mb_join bndr rhs
      rhs_ctxt = mkNonRecRhsCtxt lvl bndr unf
      -- See Note [Join arity prediction based on joinRhsArity]
 -    -- Match join arity O from mb_join_arity with manifest join arity M as
 +    -- Match join arity OAr from mb_join_arity with manifest join arity MAr as
      -- returned by of occAnalLamTail. It's totally OK for them to mismatch;
      -- hence adjust the UDs from the RHS
@@ -1764,7 +1829,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs)
                                 -- here because that is what we are setting!
      WTUD unf_tuds unf' = occAnalUnfolding rhs_env unf
      adj_unf_uds = adjustTailArity (JoinPoint rhs_ja) unf_tuds
 -      -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source M
 +      -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source MAr
        -- of Note [Join arity prediction based on joinRhsArity]
      --------- IMP-RULES --------
@@ -1775,7 +1840,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs)
      --------- All rules --------
      -- See Note [Join points and unfoldings/rules]
 -    -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source M
 +    -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source MAr
      -- of Note [Join arity prediction based on joinRhsArity]
      rules_w_uds :: [(CoreRule, UsageDetails, UsageDetails)]
      rules_w_uds = [ (r,l,adjustTailArity (JoinPoint rhs_ja) rhs_wuds)
@@ -2177,7 +2242,9 @@ occAnalLamTail :: OccEnv -> CoreExpr -> WithTailUsageDetails CoreExpr
  -- See Note [Adjusting right-hand sides]
  occAnalLamTail env expr
    = let !(WUD usage expr') = occ_anal_lam_tail env expr
 -    in WTUD (TUD (joinRhsArity expr) usage) expr'
 +    in WTUD (TUD (joinRhsArity expr') usage) expr'
 +       -- If expr looks like (\x. let dead = e in \y. blah), where `dead` is dead
 +       -- then joinRhsArity expr' might exceed joinRhsArity expr
  occ_anal_lam_tail :: OccEnv -> CoreExpr -> WithUsageDetails CoreExpr
  -- Does not markInsideLam etc for the outmost batch of lambdas
@@ -2281,7 +2348,7 @@ occAnalUnfolding !env unf
                WTUD (TUD rhs_ja uds) rhs' = occAnalLamTail env rhs
                unf' = unf { uf_tmpl = rhs' }
              in WTUD (TUD rhs_ja (markAllMany uds)) unf'
 -              -- markAllMany: see Note [Occurrences in stable unfoldings]
 +              -- markAllMany: see Note [Occurrences in stable unfoldings and RULES]
          | otherwise -> WTUD (TUD 0 emptyDetails) unf
                -- For non-Stable unfoldings we leave them undisturbed, but
@@ -2319,12 +2386,13 @@ occAnalRule env rule@(Rule { ru_bndrs = bndrs, ru_args = args, ru_rhs = rhs })
                            -- Note [Rules are extra RHSs]
                            -- Note [Rule dependency info]
      rhs_uds' = markAllMany rhs_uds
 +               -- markAllMany: Note [Occurrences in stable unfoldings and RULES]
      rhs_ja = length args -- See Note [Join points and unfoldings/rules]
  occAnalRule _ other_rule = (other_rule, emptyDetails, TUD 0 emptyDetails)
 -{- Note [Occurrences in stable unfoldings]
 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 +{- Note [Occurrences in stable unfoldings and RULES]
 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Consider
      f p = BIG
      {-# INLINE g #-}
@@ -2338,7 +2406,7 @@ preinlineUnconditionally here!
  The INLINE pragma says "inline exactly this RHS"; perhaps the
  programmer wants to expose that 'not', say. If we inline f that will make
 -the Stable unfoldign big, and that wasn't what the programmer wanted.
 +the Stable unfolding big, and that wasn't what the programmer wanted.
  Another way to think about it: if we inlined g as-is into multiple
  call sites, now there's be multiple calls to f.
@@ -2347,6 +2415,8 @@ Bottom line: treat all occurrences in a stable unfolding as "Many".
  We still leave tail call information intact, though, as to not spoil
  potential join points.
 +The same goes for RULES.
++
  Note [Unfoldings and rules]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Generally unfoldings and rules are already occurrence-analysed, so we
@@ -2598,7 +2668,7 @@ occAnalArgs :: OccEnv -> CoreExpr -> [CoreExpr]
              -> WithUsageDetails CoreExpr
  -- The `fun` argument is just an accumulating parameter,
  -- the base for building the application we return
 -occAnalArgs !env fun args !one_shots
 +occAnalArgs env fun args one_shots
    = go emptyDetails fun args one_shots
    where
      env_args = setNonTailCtxt encl env
@@ -2657,8 +2727,19 @@ Constructors are rather like lambdas in this way.
  occAnalApp :: OccEnv
             -> (Expr CoreBndr, [Arg CoreBndr], [CoreTickish])
             -> WithUsageDetails (Expr CoreBndr)
 --- Naked variables (not applied) end up here too
 -occAnalApp !env (Var fun, args, ticks)
 +occAnalApp !env (Var fun_id, [], ticks)
 +  = -- Naked variables (not applied) end up here too, and it's worth giving
 +    -- this common case special treatment, because there is so much less to do.
 +    -- This is just a specialised copy of the (Var fun_id) case below
 +    WUD fun_uds (mkTicks ticks fun')
 +  where
 +    !(fun', fun_id')  = lookupBndrSwap env fun_id
 +    !fun_uds = mkOneOcc env fun_id' int_cxt 0
 +    !int_cxt = case occ_encl env of
 +                   OccScrut -> IsInteresting
 +                   _other   -> NotInteresting
++
 +occAnalApp env (Var fun, args, ticks)
    -- Account for join arity of runRW# continuation
    -- See Note [Simplification of runRW#]
    --
@@ -2863,7 +2944,11 @@ data OccEnv
               -- Invariant: no Id maps to an empty OccInfoEnv
               -- See Note [Occurrence analysis for join points]
             , occ_join_points :: !JoinPointInfo
 -    }
++
 +           , occ_nested_lets :: IdSet    -- Non-top-level, non-rec-bound lets
 +                -- I tried making this field strict, but doing so increased
 +                -- compile-time allocation very slightly: 0.1% on average
 +           }
  type JoinPointInfo = IdEnv OccInfoEnv
@@ -2914,7 +2999,8 @@ initOccEnv
             , occ_join_points = emptyVarEnv
             , occ_bs_env = emptyVarEnv
 -           , occ_bs_rng = emptyVarSet }
 +           , occ_bs_rng = emptyVarSet
 +           , occ_nested_lets = emptyVarSet }
  noBinderSwaps :: OccEnv -> Bool
  noBinderSwaps (OccEnv { occ_bs_env = bs_env }) = isEmptyVarEnv bs_env
@@ -3154,23 +3240,26 @@ postprocess_uds bndrs bad_joins uds
        | uniq `elemVarEnvByKey` env = plusVarEnv_C andLocalOcc env join_env
        | otherwise                  = env
 +addLocalLet :: OccEnv -> TopLevelFlag -> Id -> OccEnv
 +addLocalLet env@(OccEnv { occ_nested_lets = ids }) top_lvl id
 +  | isTopLevel top_lvl = env
 +  | otherwise          = env { occ_nested_lets = ids `extendVarSet` id }
++
  addJoinPoint :: OccEnv -> Id -> UsageDetails -> OccEnv
 -addJoinPoint env bndr rhs_uds
 +addJoinPoint env@(OccEnv { occ_join_points = join_points, occ_nested_lets = nested_lets })
 +             join_bndr (UD { ud_env = rhs_occs })
    | isEmptyVarEnv zeroed_form
    = env
    | otherwise
 -  = env { occ_join_points = extendVarEnv (occ_join_points env) bndr zeroed_form }
 +  = env { occ_join_points = extendVarEnv join_points join_bndr zeroed_form }
    where
 -    zeroed_form = mkZeroedForm rhs_uds
 +    zeroed_form = mapMaybeUniqSetToUFM do_one nested_lets
 +     -- See Note [Occurrence analysis for join points] for "zeroed form"
 -mkZeroedForm :: UsageDetails -> OccInfoEnv
 --- See Note [Occurrence analysis for join points] for "zeroed form"
 -mkZeroedForm (UD { ud_env = rhs_occs })
 -  = mapMaybeUFM do_one rhs_occs
 -  where
 -    do_one :: LocalOcc -> Maybe LocalOcc
 -    do_one (ManyOccL {})    = Nothing
 -    do_one occ@(OneOccL {}) = Just (occ { lo_n_br = 0 })
 +    do_one :: Var -> Maybe LocalOcc
 +    do_one bndr = case lookupVarEnv rhs_occs bndr of
 +                    Just occ@(OneOccL {}) -> Just (occ { lo_n_br = 0 })
 +                    _                     -> Nothing
  --------------------
  transClosureFV :: VarEnv VarSet -> VarEnv VarSet
@@ -3628,7 +3717,14 @@ data LocalOcc  -- See Note [LocalOcc]
                     -- Combining (AlwaysTailCalled 2) and (AlwaysTailCalled 3)
                     -- gives NoTailCallInfo
                , lo_int_cxt :: !InterestingCxt }
++
      | ManyOccL !TailCallInfo
 +       -- Why do we need TailCallInfo on ManyOccL?
 +       -- Answer 1: recursive bindings are entered many times:
 +       --    rec { j x = ...j x'... } in j y
 +       -- See the uses of `andUDs` in `tagRecBinders`
 +       -- Answer 2: occurrences in stable unfoldings are many-ified
 +       --           See Note [Occurrences in stable unfoldings and RULES]
  instance Outputable LocalOcc where
    ppr (OneOccL { lo_n_br = n, lo_tail = tci })
@@ -3651,10 +3747,13 @@ data UsageDetails
  instance Outputable UsageDetails where
    ppr ud@(UD { ud_env = env, ud_z_tail = z_tail })
 -    = text "UD" <+> (braces $ fsep $ punctuate comma $
 -      [ ppr uq <+> text ":->" <+> ppr (lookupOccInfoByUnique ud uq)
 -      | (uq, _) <- nonDetStrictFoldVarEnv_Directly do_one [] env ])
 -      $$ nest 2 (text "ud_z_tail" <+> ppr z_tail)
 +    = text "UD" <> (braces (vcat
 +         [ -- `final` shows the result of a proper lookupOccInfo, returning OccInfo
 +           --         after accounting for `ud_z_tail` etc.
 +           text "final =" <+> (fsep $ punctuate comma $
 +                 [ ppr uq <+> text ":->" <+> ppr (lookupOccInfoByUnique ud uq)
 +                 | (uq, _) <- nonDetStrictFoldVarEnv_Directly do_one [] env ])
 +         , text "ud_z_tail" <+> ppr z_tail ] ))
      where
        do_one :: Unique -> LocalOcc -> [(Unique,LocalOcc)] -> [(Unique,LocalOcc)]
        do_one uniq occ occs = (uniq, occ) : occs
@@ -3663,7 +3762,7 @@ instance Outputable UsageDetails where
  -- | TailUsageDetails captures the result of applying 'occAnalLamTail'
  --   to a function `\xyz.body`. The TailUsageDetails pairs together
  --   * the number of lambdas (including type lambdas: a JoinArity)
 ---   * UsageDetails for the `body` of the lambda, unadjusted by `adjustTailUsage`.
 +--   * UsageDetails for the `body` of the lambda, /unadjusted/ by `adjustTailUsage`.
  -- If the binding turns out to be a join point with the indicated join
  -- arity, this unadjusted usage details is just what we need; otherwise we
  -- need to discard tail calls. That's what `adjustTailUsage` does.
@@ -3681,8 +3780,17 @@ data WithTailUsageDetails a = WTUD !TailUsageDetails !a
  andUDs:: UsageDetails -> UsageDetails -> UsageDetails
  orUDs :: UsageDetails -> UsageDetails -> UsageDetails
 -andUDs = combineUsageDetailsWith andLocalOcc
 -orUDs  = combineUsageDetailsWith orLocalOcc
 +andUDs = combineUsageDetailsWith (\_uniq -> andLocalOcc)
 +orUDs  = combineUsageDetailsWith (\_uniq -> orLocalOcc)
++
 +combineJoinPointUDs :: OccEnv -> UsageDetails -> UsageDetails -> UsageDetails
 +-- See (W4) in Note [Occurrence analysis for join points]
 +combineJoinPointUDs (OccEnv { occ_nested_lets = nested_lets }) uds1 uds2
 +  = combineUsageDetailsWith combine uds1 uds2
 +  where
 +    combine uniq occ1 occ2
 +      | uniq `elemVarSetByKey` nested_lets = orLocalOcc  occ1 occ2
 +      | otherwise                          = andLocalOcc occ1 occ2
  mkOneOcc :: OccEnv -> Id -> InterestingCxt -> JoinArity -> UsageDetails
  mkOneOcc !env id int_cxt arity
@@ -3699,7 +3807,8 @@ mkOneOcc !env id int_cxt arity
    = mkSimpleDetails (unitVarEnv id occ)
    where
 -    occ = OneOccL { lo_n_br = 1, lo_int_cxt = int_cxt
 +    occ = OneOccL { lo_n_br = 1
 +                  , lo_int_cxt = int_cxt
                    , lo_tail = AlwaysTailCalled arity }
  -- Add several occurrences, assumed not to be tail calls
@@ -3786,7 +3895,7 @@ restrictFreeVars bndrs fvs = restrictUniqSetToUFM bndrs fvs
  -------------------
  -- Auxiliary functions for UsageDetails implementation
 -combineUsageDetailsWith :: (LocalOcc -> LocalOcc -> LocalOcc)
 +combineUsageDetailsWith :: (Unique -> LocalOcc -> LocalOcc -> LocalOcc)
                          -> UsageDetails -> UsageDetails -> UsageDetails
  {-# INLINE combineUsageDetailsWith #-}
  combineUsageDetailsWith plus_occ_info
@@ -3796,9 +3905,9 @@ combineUsageDetailsWith plus_occ_info
    | isEmptyVarEnv env2 = uds1
    | otherwise
    -- See Note [Strictness in the occurrence analyser]
 -  -- Using strictPlusVarEnv here speeds up the test T26425 by about 10% by avoiding
 -  -- intermediate thunks.
 -  = UD { ud_env       = strictPlusVarEnv_C plus_occ_info env1 env2
 +  -- Using strictPlusVarEnv here speeds up the test T26425
 +  -- by about 10% by avoiding intermediate thunks.
 +  = UD { ud_env       = strictPlusVarEnv_C_Directly plus_occ_info env1 env2
         , ud_z_many    = strictPlusVarEnv z_many1   z_many2
         , ud_z_in_lam  = plusVarEnv z_in_lam1 z_in_lam2
         , ud_z_tail    = strictPlusVarEnv z_tail1   z_tail2 }
@@ -3842,8 +3951,6 @@ lookupOccInfoByUnique (UD { ud_env       = env
          | uniq `elemVarEnvByKey` z_tail = NoTailCallInfo
          | otherwise                     = ti
+-
+-
  -------------------
  -- See Note [Adjusting right-hand sides]
@@ -3853,21 +3960,22 @@ adjustNonRecRhs :: JoinPointHood
  -- ^ This function concentrates shared logic between occAnalNonRecBind and the
  -- AcyclicSCC case of occAnalRec.
  -- It returns the adjusted rhs UsageDetails combined with the body usage
 -adjustNonRecRhs mb_join_arity rhs_wuds@(WTUD _ rhs)
 -  = WUD (adjustTailUsage mb_join_arity rhs_wuds) rhs
+-
 +adjustNonRecRhs mb_join_arity (WTUD (TUD rhs_ja uds) rhs)
 +  = WUD (adjustTailUsage exact_join rhs uds) rhs
 +  where
 +    exact_join = mb_join_arity == JoinPoint rhs_ja
 -adjustTailUsage :: JoinPointHood
 -                -> WithTailUsageDetails CoreExpr    -- Rhs usage, AFTER occAnalLamTail
 +adjustTailUsage :: Bool        -- True <=> Exactly-matching join point; don't do markNonTail
 +                -> CoreExpr    -- Rhs usage, AFTER occAnalLamTail
 +                -> UsageDetails
                  -> UsageDetails
 -adjustTailUsage mb_join_arity (WTUD (TUD rhs_ja uds) rhs)
 +adjustTailUsage exact_join rhs uds
    = -- c.f. occAnal (Lam {})
      markAllInsideLamIf (not one_shot) $
      markAllNonTailIf (not exact_join) $
      uds
    where
      one_shot   = isOneShotFun rhs
 -    exact_join = mb_join_arity == JoinPoint rhs_ja
  adjustTailArity :: JoinPointHood -> TailUsageDetails -> UsageDetails
  adjustTailArity mb_rhs_ja (TUD ja usage)
@@ -3914,8 +4022,9 @@ tagNonRecBinder lvl occ bndr
  tagRecBinders :: TopLevelFlag           -- At top level?
                -> UsageDetails           -- Of body of let ONLY
                -> [NodeDetails]
 -              -> WithUsageDetails       -- Adjusted details for whole scope,
 -                                        -- with binders removed
 +              -> WithUsageDetails       -- Adjusted details for whole scope
 +                                        -- still including the binders;
 +                                        -- (they are removed by `addInScope`)
                    [IdWithOccInfo]       -- Tagged binders
  -- Substantially more complicated than non-recursive case. Need to adjust RHS
  -- details *before* tagging binders (because the tags depend on the RHSes).
@@ -3925,32 +4034,21 @@ tagRecBinders lvl body_uds details_s
       -- 1. See Note [Join arity prediction based on joinRhsArity]
       --    Determine possible join-point-hood of whole group, by testing for
 -     --    manifest join arity M.
 -     --    This (re-)asserts that makeNode had made tuds for that same arity M!
 +     --    manifest join arity MAr.
 +     --    This (re-)asserts that makeNode had made tuds for that same arity MAr!
       unadj_uds = foldr (andUDs . test_manifest_arity) body_uds details_s
 -     test_manifest_arity ND{nd_rhs = WTUD tuds rhs}
 -       = adjustTailArity (JoinPoint (joinRhsArity rhs)) tuds
 +     test_manifest_arity ND{nd_rhs = WTUD (TUD rhs_ja uds) rhs}
 +       = assertPpr (rhs_ja == joinRhsArity rhs) (ppr rhs_ja $$ ppr uds $$ ppr rhs) $
 +         uds
 +     will_be_joins :: Bool
       will_be_joins = decideRecJoinPointHood lvl unadj_uds bndrs
 -     mb_join_arity :: Id -> JoinPointHood
 -     -- mb_join_arity: See Note [Join arity prediction based on joinRhsArity]
 -     -- This is the source O
 -     mb_join_arity bndr
 -         -- Can't use willBeJoinId_maybe here because we haven't tagged
 -         -- the binder yet (the tag depends on these adjustments!)
 -       | will_be_joins
 -       , AlwaysTailCalled arity <- lookupTailCallInfo unadj_uds bndr
 -       = JoinPoint arity
 -       | otherwise
 -       = assert (not will_be_joins) -- Should be AlwaysTailCalled if
 -         NotJoinPoint               -- we are making join points!
+-
       -- 2. Adjust usage details of each RHS, taking into account the
       --    join-point-hood decision
 -     rhs_udss' = [ adjustTailUsage (mb_join_arity bndr) rhs_wuds
 +     rhs_udss' = [ adjustTailUsage will_be_joins rhs rhs_uds
                       -- Matching occAnalLamTail in makeNode
 -                 | ND { nd_bndr = bndr, nd_rhs = rhs_wuds } <- details_s ]
 +                 | ND { nd_rhs = WTUD (TUD _ rhs_uds) rhs } <- details_s ]
       -- 3. Compute final usage details from adjusted RHS details
       adj_uds = foldr andUDs body_uds rhs_udss'
@@ -3969,9 +4067,9 @@ setBinderOcc occ_info bndr
    | otherwise                  = setIdOccInfo bndr occ_info
  -- | Decide whether some bindings should be made into join points or not, based
 --- on its occurrences. This is
 +-- on its occurrences.
  -- Returns `False` if they can't be join points. Note that it's an
 +-- all-or-nothing decision: if multiple binders are given, they are
  -- assumed to be mutually recursive.
  --
  -- It must, however, be a final decision. If we say `True` for 'f',

compiler/GHC/HsToCore/Binds.hs

@@ -1620,7 +1620,7 @@ ds_hs_wrapper hs_wrap
                                do { x <- newSysLocalDs (mkScaled (subMultCoRKind w_co) t)
                                   ; go c1 $ \w1 ->
                                     go c2 $ \w2 ->
 -                                   let app f a = mkCoreApp (text "dsHsWrapper") f a
 +                                   let app f a = mkCoreApp f a
                                         arg     = w1 (Var x)
                                     in k (\e -> (Lam x (w2 (app e arg)))) }

compiler/GHC/HsToCore/Expr.hs

@@ -877,8 +877,7 @@ dsHsConLike (PatSynCon ps)
    | Just (builder_name, _, add_void) <- patSynBuilder ps
    = do { builder_id <- dsLookupGlobalId builder_name
         ; return (if add_void
 -                 then mkCoreApp (text "dsConLike" <+> ppr ps)
 -                                (Var builder_id) unboxedUnitExpr
 +                 then mkCoreApp (Var builder_id) unboxedUnitExpr
                   else Var builder_id) }
    | otherwise
    = pprPanic "dsConLike" (ppr ps)

compiler/GHC/HsToCore/Match.hs

@@ -301,7 +301,7 @@ matchView (var :| vars) ty eqns@(eqn1 :| _)
           -- compile the view expressions
          ; viewExpr' <- dsExpr viewExpr
          ; return (mkViewMatchResult var'
 -                    (mkCoreApp (text "matchView") viewExpr' (Var var))
 +                    (mkCoreApp viewExpr' (Var var))
                      match_result) }
  -- decompose the first pattern and leave the rest alone

compiler/GHC/Types/Basic.hs

@@ -1333,7 +1333,7 @@ zapFragileOcc occ         = zapOccTailCallInfo occ
  instance Outputable OccInfo where
    -- only used for debugging; never parsed.  KSW 1999-07
 -  ppr (ManyOccs tails)     = pprShortTailCallInfo tails
 +  ppr (ManyOccs tails)     = text "Many" <> parens (pprShortTailCallInfo tails)
    ppr IAmDead              = text "Dead"
    ppr (IAmALoopBreaker rule_only tails)
          = text "LoopBreaker" <> pp_ro <> pprShortTailCallInfo tails

compiler/GHC/Types/Unique/FM.hs

@@ -53,7 +53,7 @@ module GHC.Types.Unique.FM (
          plusUFM,
          strictPlusUFM,
          plusUFM_C,
 -        strictPlusUFM_C,
 +        strictPlusUFM_C, strictPlusUFM_C_Directly,
          plusUFM_CD,
          plusUFM_CD2,
          mergeUFM,
@@ -281,6 +281,9 @@ plusUFM_C f (UFM x) (UFM y) = UFM (M.unionWith f x y)
  strictPlusUFM_C :: (elt -> elt -> elt) -> UniqFM key elt -> UniqFM key elt -> UniqFM key elt
  strictPlusUFM_C f (UFM x) (UFM y) = UFM (MS.unionWith f x y)
 +strictPlusUFM_C_Directly :: (Unique -> elt -> elt -> elt) -> UniqFM key elt -> UniqFM key elt -> UniqFM key elt
 +strictPlusUFM_C_Directly f (UFM x) (UFM y) = UFM (MS.unionWithKey (f . mkUniqueGrimily) x y)
++
  -- | `plusUFM_CD f m1 d1 m2 d2` merges the maps using `f` as the
  -- combinding function and `d1` resp. `d2` as the default value if
  -- there is no entry in `m1` reps. `m2`. The domain is the union of

compiler/GHC/Types/Unique/Set.hs

@@ -40,6 +40,7 @@ module GHC.Types.Unique.Set (
          lookupUniqSet_Directly,
          partitionUniqSet,
          mapUniqSet,
 +        mapUniqSetToUFM, mapMaybeUniqSetToUFM,
          unsafeUFMToUniqSet,
          nonDetEltsUniqSet,
          nonDetKeysUniqSet,
@@ -211,6 +212,14 @@ mapUniqSet f = mkUniqSet . map f . nonDetEltsUniqSet
  mapMaybeUniqSet_sameUnique :: (a -> Maybe b) -> UniqSet a -> UniqSet b
  mapMaybeUniqSet_sameUnique f (UniqSet a) = UniqSet $ mapMaybeUFM_sameUnique f a
 +mapUniqSetToUFM :: (a -> b) -> UniqSet a -> UniqFM a b
 +-- Same keys, new values
 +mapUniqSetToUFM f (UniqSet ufm) = mapUFM f ufm
++
 +mapMaybeUniqSetToUFM :: (a -> Maybe b) -> UniqSet a -> UniqFM a b
 +-- Same keys, new values
 +mapMaybeUniqSetToUFM f (UniqSet ufm) = mapMaybeUFM f ufm
++
  -- Two 'UniqSet's are considered equal if they contain the same
  -- uniques.
  instance Eq (UniqSet a) where

compiler/GHC/Types/Var/Env.hs

@@ -12,7 +12,8 @@ module GHC.Types.Var.Env (
          elemVarEnv, disjointVarEnv, anyVarEnv,
          extendVarEnv, extendVarEnv_C, extendVarEnv_Acc,
          extendVarEnvList,
 -        strictPlusVarEnv, plusVarEnv, plusVarEnv_C, strictPlusVarEnv_C,
 +        strictPlusVarEnv, plusVarEnv, plusVarEnv_C,
 +        strictPlusVarEnv_C, strictPlusVarEnv_C_Directly,
          plusVarEnv_CD, plusMaybeVarEnv_C,
          plusVarEnvList, alterVarEnv,
          delVarEnvList, delVarEnv,
@@ -525,6 +526,7 @@ delVarEnv         :: VarEnv a -> Var -> VarEnv a
  minusVarEnv       :: VarEnv a -> VarEnv b -> VarEnv a
  plusVarEnv_C      :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a
  strictPlusVarEnv_C :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a
 +strictPlusVarEnv_C_Directly :: (Unique -> a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a
  plusVarEnv_CD     :: (a -> a -> a) -> VarEnv a -> a -> VarEnv a -> a -> VarEnv a
  plusMaybeVarEnv_C :: (a -> a -> Maybe a) -> VarEnv a -> VarEnv a -> VarEnv a
  mapVarEnv         :: (a -> b) -> VarEnv a -> VarEnv b
@@ -552,6 +554,7 @@ extendVarEnv_Acc = addToUFM_Acc
  extendVarEnvList = addListToUFM
  plusVarEnv_C     = plusUFM_C
  strictPlusVarEnv_C = strictPlusUFM_C
 +strictPlusVarEnv_C_Directly = strictPlusUFM_C_Directly
  plusVarEnv_CD    = plusUFM_CD
  plusMaybeVarEnv_C = plusMaybeUFM_C
  delVarEnvList    = delListFromUFM

testsuite/tests/unboxedsums/UbxSumUnpackedSize.hs

@@ -133,6 +133,124 @@ data U_E1 = U_E1 {-# UNPACK #-} !E1
                   {-# UNPACK #-} !Int8
      deriving (Show)
 +{- In `data U_E`, the {-# UNPACK #-} !E1 gives rise to a pretty clumsy expression
 +   for the wrapper for U_E1. Here is what it looks like when ther are only 16
 +   data constructors in E1, and we have just
 +       data U_E1 = U_E1 {-# UNPACK #-} !E1
 +   Blimey!
++
 +Main.$WU_E1
 +  = \ (conrep_t1N4 [Occ=Once1!] :: Main.E1) ->
 +      case case conrep_t1N4 of {
 +             Main.E1_1 ->
 +               GHC.Internal.Types.(# _| | | | | | | | | | | | | | | #)
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 GHC.Internal.Types.(##);
 +             Main.E1_2 ->
 +               GHC.Internal.Types.(# |_| | | | | | | | | | | | | | #)
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 GHC.Internal.Types.(##);
 +             Main.E1_3 ->
 +               GHC.Internal.Types.(# | |_| | | | | | | | | | | | | #)
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @GHC.Internal.Types.ZeroBitRep
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 @(# #)
 +                 GHC.Internal.Types.(##);
++
 +       ... etc ....
 +-}
++
  data U_E2 = U_E2 {-# UNPACK #-} !E2
                   {-# UNPACK #-} !Int8
                   {-# UNPACK #-} !Int8