| ... |
... |
@@ -664,14 +664,14 @@ through A, so it should have ManyOcc. Bear this case in mind! |
|
664
|
664
|
* maps a OneOcc to OneOcc{ occ_n_br = 0 }
|
|
665
|
665
|
In our example, occ_join_points will be extended with
|
|
666
|
666
|
[j :-> [v :-> OneOcc{occ_n_br=0}]]
|
|
667
|
|
- See addJoinPoint.
|
|
|
667
|
+ See `addJoinPoint` and (W5) below.
|
|
668
|
668
|
|
|
669
|
669
|
* At an occurrence of a join point, we do everything as normal, but add in the
|
|
670
|
670
|
UsageDetails from the occ_join_points. See mkOneOcc.
|
|
671
|
671
|
|
|
672
|
672
|
* Crucially, at the NonRec binding of the join point, in `occAnalBind`, we use
|
|
673
|
|
- `orUDs`, not `andUDs` to combine the usage from the RHS with the usage from
|
|
674
|
|
- the body.
|
|
|
673
|
+ `combineJoinPointUDs`, not `andUDs` to combine the usage from the RHS with
|
|
|
674
|
+ the usage from the body.
|
|
675
|
675
|
|
|
676
|
676
|
Here are the consequences
|
|
677
|
677
|
|
| ... |
... |
@@ -688,7 +688,7 @@ Here are the consequences |
|
688
|
688
|
These are `andUDs` together in `addOccInfo`, and hence
|
|
689
|
689
|
`v` gets ManyOccs, just as it should. Clever!
|
|
690
|
690
|
|
|
691
|
|
-There are a couple of tricky wrinkles
|
|
|
691
|
+There are, of course, some tricky wrinkles
|
|
692
|
692
|
|
|
693
|
693
|
(W1) Consider this example which shadows `j`:
|
|
694
|
694
|
join j = rhs in
|
| ... |
... |
@@ -718,6 +718,8 @@ There are a couple of tricky wrinkles |
|
718
|
718
|
* In `postprcess_uds`, we add the chucked-out join points to the
|
|
719
|
719
|
returned UsageDetails, with `andUDs`.
|
|
720
|
720
|
|
|
|
721
|
+Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3).
|
|
|
722
|
+
|
|
721
|
723
|
(W3) Consider this example, which shadows `j`, but this time in an argument
|
|
722
|
724
|
join j = rhs
|
|
723
|
725
|
in f (case x of { K j -> ...; ... })
|
| ... |
... |
@@ -734,10 +736,38 @@ There are a couple of tricky wrinkles |
|
734
|
736
|
|
|
735
|
737
|
(W4) What if the join point binding has a stable unfolding, or RULES?
|
|
736
|
738
|
They are just alternative right-hand sides, and at each call site we
|
|
737
|
|
- will use only one of them. So again, we can use `orUDs` to combine
|
|
738
|
|
- usage info from all these alternatives RHSs.
|
|
739
|
|
-
|
|
740
|
|
-Wrinkles (W1) and (W2) are very similar to Note [Binder swap] (BS3).
|
|
|
739
|
+ will use only one of them. So again, we can use `combineJoinPointUDs`
|
|
|
740
|
+ to combine usage info from all these alternatives RHSs.
|
|
|
741
|
+
|
|
|
742
|
+(W5) Other things being equal, we want keep the OccInfoEnv in the range of
|
|
|
743
|
+ `occ_join_points` as small as possible, because it is /duplicated/ at
|
|
|
744
|
+ /every occurrence/ of the join point. We really only want to include
|
|
|
745
|
+ OccInfo for
|
|
|
746
|
+ * Local, non-recursive let-bound Ids
|
|
|
747
|
+ * that occur just once in the RHS of the join point
|
|
|
748
|
+ particularly including
|
|
|
749
|
+ * thunks (that's the original point) and
|
|
|
750
|
+ * join points (so that the trick works recursively).
|
|
|
751
|
+ We call these the "tracked Ids of j".
|
|
|
752
|
+
|
|
|
753
|
+ Including lambda binders is pointless, and slows down the occurrence analyser.
|
|
|
754
|
+
|
|
|
755
|
+ e.g. \x. let y = x+1 in
|
|
|
756
|
+ join j v = ..x..y..(f z z)..
|
|
|
757
|
+ in ...
|
|
|
758
|
+ In the `occ_join_points` binding for `j`, we want to track `y`, but
|
|
|
759
|
+ not `x` (lambda bound) nor `z` (occurs many times).
|
|
|
760
|
+
|
|
|
761
|
+ To exploit this:
|
|
|
762
|
+ * `occ_local_lets` tracks which Ids are local, non-recursive lets
|
|
|
763
|
+ * `addJoinPoint` only populates j's entry with occ-info for the "tracked Ids"
|
|
|
764
|
+ of `j`; that is, that are (a) in occ_local_lets and (b) have OneOcc.
|
|
|
765
|
+ * `combineJoinPointUDs` uses
|
|
|
766
|
+ orLocalOcc for local-let Ids
|
|
|
767
|
+ andLocalOcc for non-local-let Ids
|
|
|
768
|
+
|
|
|
769
|
+ This fancy footwork can matter in extreme cases: it gave a 25% reduction in
|
|
|
770
|
+ total compiler allocation in #26425..
|
|
741
|
771
|
|
|
742
|
772
|
Note [Finding join points]
|
|
743
|
773
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| ... |
... |
@@ -759,62 +789,62 @@ rest of 'OccInfo' until it goes on the binder. |
|
759
|
789
|
|
|
760
|
790
|
Note [Join arity prediction based on joinRhsArity]
|
|
761
|
791
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
762
|
|
-In general, the join arity from tail occurrences of a join point (O) may be
|
|
763
|
|
-higher or lower than the manifest join arity of the join body (M). E.g.,
|
|
|
792
|
+In general, the join arity from tail occurrences of a join point (OAr) may be
|
|
|
793
|
+higher or lower than the manifest join arity of the join body (MAr). E.g.,
|
|
764
|
794
|
|
|
765
|
|
- -- M > O:
|
|
766
|
|
- let f x y = x + y -- M = 2
|
|
767
|
|
- in if b then f 1 else f 2 -- O = 1
|
|
|
795
|
+ -- MAr > Oar:
|
|
|
796
|
+ let f x y = x + y -- MAr = 2
|
|
|
797
|
+ in if b then f 1 else f 2 -- OAr = 1
|
|
768
|
798
|
==> { Contify for join arity 1 }
|
|
769
|
799
|
join f x = \y -> x + y
|
|
770
|
800
|
in if b then jump f 1 else jump f 2
|
|
771
|
801
|
|
|
772
|
|
- -- M < O
|
|
773
|
|
- let f = id -- M = 0
|
|
774
|
|
- in if ... then f 12 else f 13 -- O = 1
|
|
|
802
|
+ -- MAr < Oar
|
|
|
803
|
+ let f = id -- MAr = 0
|
|
|
804
|
+ in if ... then f 12 else f 13 -- OAr = 1
|
|
775
|
805
|
==> { Contify for join arity 1, eta-expand f }
|
|
776
|
806
|
join f x = id x
|
|
777
|
807
|
in if b then jump f 12 else jump f 13
|
|
778
|
808
|
|
|
779
|
|
-But for *recursive* let, it is crucial that both arities match up, consider
|
|
|
809
|
+But for *recursive* let, it is crucial MAr=OAr. Consider:
|
|
780
|
810
|
|
|
781
|
811
|
letrec f x y = if ... then f x else True
|
|
782
|
812
|
in f 42
|
|
783
|
813
|
|
|
784
|
|
-Here, M=2 but O=1. If we settled for a joinrec arity of 1, the recursive jump
|
|
|
814
|
+Here, MAr=2 but OAr=1. If we settled for a joinrec arity of 1, the recursive jump
|
|
785
|
815
|
would not happen in a tail context! Contification is invalid here.
|
|
786
|
|
-So indeed it is crucial to demand that M=O.
|
|
|
816
|
+So indeed it is crucial to demand that MAr=OAr.
|
|
787
|
817
|
|
|
788
|
|
-(Side note: Actually, we could be more specific: Let O1 be the join arity of
|
|
789
|
|
-occurrences from the letrec RHS and O2 the join arity from the let body. Then
|
|
790
|
|
-we need M=O1 and M<=O2 and could simply eta-expand the RHS to match O2 later.
|
|
791
|
|
-M=O is the specific case where we don't want to eta-expand. Neither the join
|
|
|
818
|
+(Side note: Actually, we could be more specific: Let OAr1 be the join arity of
|
|
|
819
|
+occurrences from the letrec RHS and OAr2 the join arity from the let body. Then
|
|
|
820
|
+we need MAr=OAr1 and MAr<=OAr2 and could simply eta-expand the RHS to match OAr2 later.
|
|
|
821
|
+MAr=OAr is the specific case where we don't want to eta-expand. Neither the join
|
|
792
|
822
|
points paper nor GHC does this at the moment.)
|
|
793
|
823
|
|
|
794
|
824
|
We can capitalise on this observation and conclude that *if* f could become a
|
|
795
|
|
-joinrec (without eta-expansion), it will have join arity M.
|
|
796
|
|
-Now, M is just the result of 'joinRhsArity', a rather simple, local analysis.
|
|
|
825
|
+joinrec (without eta-expansion), it will have join arity MAr.
|
|
|
826
|
+Now, MAr is just the result of 'joinRhsArity', a rather simple, local analysis.
|
|
797
|
827
|
It is also the join arity inside the 'TailUsageDetails' returned by
|
|
798
|
828
|
'occAnalLamTail', so we can predict join arity without doing any fixed-point
|
|
799
|
829
|
iteration or really doing any deep traversal of let body or RHS at all.
|
|
800
|
|
-We check for M in the 'adjustTailUsage' call inside 'tagRecBinders'.
|
|
|
830
|
+We check for MAr in the 'adjustTailUsage' call inside 'tagRecBinders'.
|
|
801
|
831
|
|
|
802
|
832
|
All this is quite apparent if you look at the contification transformation in
|
|
803
|
833
|
Fig. 5 of "Compiling without Continuations" (which does not account for
|
|
804
|
834
|
eta-expansion at all, mind you). The letrec case looks like this
|
|
805
|
|
-
|
|
|
835
|
+n
|
|
806
|
836
|
letrec f = /\as.\xs. L[us] in L'[es]
|
|
807
|
837
|
... and a bunch of conditions establishing that f only occurs
|
|
808
|
838
|
in app heads of join arity (len as + len xs) inside us and es ...
|
|
809
|
839
|
|
|
810
|
|
-The syntactic form `/\as.\xs. L[us]` forces M=O iff `f` occurs in `us`. However,
|
|
|
840
|
+The syntactic form `/\as.\xs. L[us]` forces MAr=OAr iff `f` occurs in `us`. However,
|
|
811
|
841
|
for non-recursive functions, this is the definition of contification from the
|
|
812
|
842
|
paper:
|
|
813
|
843
|
|
|
814
|
844
|
let f = /\as.\xs.u in L[es] ... conditions ...
|
|
815
|
845
|
|
|
816
|
|
-Note that u could be a lambda itself, as we have seen. No relationship between M
|
|
817
|
|
-and O to exploit here.
|
|
|
846
|
+Note that u could be a lambda itself, as we have seen. No relationship between MAr
|
|
|
847
|
+and OAr to exploit here.
|
|
818
|
848
|
|
|
819
|
849
|
Note [Join points and unfoldings/rules]
|
|
820
|
850
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| ... |
... |
@@ -992,23 +1022,29 @@ occAnalBind !env lvl ire (NonRec bndr rhs) thing_inside combine |
|
992
|
1022
|
= -- Analyse the RHS and /then/ the body
|
|
993
|
1023
|
let -- Analyse the rhs first, generating rhs_uds
|
|
994
|
1024
|
!(rhs_uds_s, bndr', rhs') = occAnalNonRecRhs env lvl ire mb_join bndr rhs
|
|
995
|
|
- rhs_uds = foldl1' orUDs rhs_uds_s -- NB: orUDs. See (W4) of
|
|
996
|
|
- -- Note [Occurrence analysis for join points]
|
|
|
1025
|
+ rhs_uds = foldl1' (combineJoinPointUDs env)
|
|
|
1026
|
+ rhs_uds_s -- NB: combineJoinPointUDs. See (W4) of
|
|
|
1027
|
+ -- Note [Occurrence analysis for join points]
|
|
997
|
1028
|
|
|
998
|
1029
|
-- Now analyse the body, adding the join point
|
|
999
|
1030
|
-- into the environment with addJoinPoint
|
|
1000
|
|
- !(WUD body_uds (occ, body)) = occAnalNonRecBody env bndr' $ \env ->
|
|
|
1031
|
+ env_body = addLocalLet env lvl bndr
|
|
|
1032
|
+ !(WUD body_uds (occ, body)) = occAnalNonRecBody env_body bndr' $ \env ->
|
|
1001
|
1033
|
thing_inside (addJoinPoint env bndr' rhs_uds)
|
|
1002
|
1034
|
in
|
|
1003
|
1035
|
if isDeadOcc occ -- Drop dead code; see Note [Dead code]
|
|
1004
|
1036
|
then WUD body_uds body
|
|
1005
|
|
- else WUD (rhs_uds `orUDs` body_uds) -- Note `orUDs`
|
|
|
1037
|
+ else -- pprTrace "occAnal-nonrec" (vcat [ ppr bndr <+> ppr occ
|
|
|
1038
|
+ -- , text "rhs_uds" <+> ppr rhs_uds
|
|
|
1039
|
+ -- , text "body_uds" <+> ppr body_uds ]) $
|
|
|
1040
|
+ WUD (combineJoinPointUDs env rhs_uds body_uds) -- Note `orUDs`
|
|
1006
|
1041
|
(combine [NonRec (fst (tagNonRecBinder lvl occ bndr')) rhs']
|
|
1007
|
1042
|
body)
|
|
1008
|
1043
|
|
|
1009
|
1044
|
-- The normal case, including newly-discovered join points
|
|
1010
|
1045
|
-- Analyse the body and /then/ the RHS
|
|
1011
|
|
- | WUD body_uds (occ,body) <- occAnalNonRecBody env bndr thing_inside
|
|
|
1046
|
+ | let env_body = addLocalLet env lvl bndr
|
|
|
1047
|
+ , WUD body_uds (occ,body) <- occAnalNonRecBody env_body bndr thing_inside
|
|
1012
|
1048
|
= if isDeadOcc occ -- Drop dead code; see Note [Dead code]
|
|
1013
|
1049
|
then WUD body_uds body
|
|
1014
|
1050
|
else let
|
| ... |
... |
@@ -1054,7 +1090,7 @@ occAnalNonRecRhs !env lvl imp_rule_edges mb_join bndr rhs |
|
1054
|
1090
|
rhs_ctxt = mkNonRecRhsCtxt lvl bndr unf
|
|
1055
|
1091
|
|
|
1056
|
1092
|
-- See Note [Join arity prediction based on joinRhsArity]
|
|
1057
|
|
- -- Match join arity O from mb_join_arity with manifest join arity M as
|
|
|
1093
|
+ -- Match join arity OAr from mb_join_arity with manifest join arity MAr as
|
|
1058
|
1094
|
-- returned by of occAnalLamTail. It's totally OK for them to mismatch;
|
|
1059
|
1095
|
-- hence adjust the UDs from the RHS
|
|
1060
|
1096
|
|
| ... |
... |
@@ -1764,7 +1800,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs) |
|
1764
|
1800
|
-- here because that is what we are setting!
|
|
1765
|
1801
|
WTUD unf_tuds unf' = occAnalUnfolding rhs_env unf
|
|
1766
|
1802
|
adj_unf_uds = adjustTailArity (JoinPoint rhs_ja) unf_tuds
|
|
1767
|
|
- -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source M
|
|
|
1803
|
+ -- `rhs_ja` is `joinRhsArity rhs` and is the prediction for source MAr
|
|
1768
|
1804
|
-- of Note [Join arity prediction based on joinRhsArity]
|
|
1769
|
1805
|
|
|
1770
|
1806
|
--------- IMP-RULES --------
|
| ... |
... |
@@ -1775,7 +1811,7 @@ makeNode !env imp_rule_edges bndr_set (bndr, rhs) |
|
1775
|
1811
|
|
|
1776
|
1812
|
--------- All rules --------
|
|
1777
|
1813
|
-- See Note [Join points and unfoldings/rules]
|
|
1778
|
|
- -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source M
|
|
|
1814
|
+ -- `rhs_ja` is `joinRhsArity rhs'` and is the prediction for source MAr
|
|
1779
|
1815
|
-- of Note [Join arity prediction based on joinRhsArity]
|
|
1780
|
1816
|
rules_w_uds :: [(CoreRule, UsageDetails, UsageDetails)]
|
|
1781
|
1817
|
rules_w_uds = [ (r,l,adjustTailArity (JoinPoint rhs_ja) rhs_wuds)
|
| ... |
... |
@@ -2177,7 +2213,9 @@ occAnalLamTail :: OccEnv -> CoreExpr -> WithTailUsageDetails CoreExpr |
|
2177
|
2213
|
-- See Note [Adjusting right-hand sides]
|
|
2178
|
2214
|
occAnalLamTail env expr
|
|
2179
|
2215
|
= let !(WUD usage expr') = occ_anal_lam_tail env expr
|
|
2180
|
|
- in WTUD (TUD (joinRhsArity expr) usage) expr'
|
|
|
2216
|
+ in WTUD (TUD (joinRhsArity expr') usage) expr'
|
|
|
2217
|
+ -- If expr looks like (\x. let dead = e in \y. blah), where `dead` is dead
|
|
|
2218
|
+ -- then joinRhsArity expr' might exceed joinRhsArity expr
|
|
2181
|
2219
|
|
|
2182
|
2220
|
occ_anal_lam_tail :: OccEnv -> CoreExpr -> WithUsageDetails CoreExpr
|
|
2183
|
2221
|
-- Does not markInsideLam etc for the outmost batch of lambdas
|
| ... |
... |
@@ -2598,7 +2636,7 @@ occAnalArgs :: OccEnv -> CoreExpr -> [CoreExpr] |
|
2598
|
2636
|
-> WithUsageDetails CoreExpr
|
|
2599
|
2637
|
-- The `fun` argument is just an accumulating parameter,
|
|
2600
|
2638
|
-- the base for building the application we return
|
|
2601
|
|
-occAnalArgs !env fun args !one_shots
|
|
|
2639
|
+occAnalArgs env fun args one_shots
|
|
2602
|
2640
|
= go emptyDetails fun args one_shots
|
|
2603
|
2641
|
where
|
|
2604
|
2642
|
env_args = setNonTailCtxt encl env
|
| ... |
... |
@@ -2657,8 +2695,19 @@ Constructors are rather like lambdas in this way. |
|
2657
|
2695
|
occAnalApp :: OccEnv
|
|
2658
|
2696
|
-> (Expr CoreBndr, [Arg CoreBndr], [CoreTickish])
|
|
2659
|
2697
|
-> WithUsageDetails (Expr CoreBndr)
|
|
2660
|
|
--- Naked variables (not applied) end up here too
|
|
2661
|
|
-occAnalApp !env (Var fun, args, ticks)
|
|
|
2698
|
+occAnalApp !env (Var fun_id, [], ticks)
|
|
|
2699
|
+ = -- Naked variables (not applied) end up here too, and it's worth giving
|
|
|
2700
|
+ -- this common case special treatment, because there is so much less to do.
|
|
|
2701
|
+ -- This is just a specialised copy of the (Var fun_id) case below
|
|
|
2702
|
+ WUD fun_uds (mkTicks ticks fun')
|
|
|
2703
|
+ where
|
|
|
2704
|
+ !(fun', fun_id') = lookupBndrSwap env fun_id
|
|
|
2705
|
+ !fun_uds = mkOneOcc env fun_id' int_cxt 0
|
|
|
2706
|
+ !int_cxt = case occ_encl env of
|
|
|
2707
|
+ OccScrut -> IsInteresting
|
|
|
2708
|
+ _other -> NotInteresting
|
|
|
2709
|
+
|
|
|
2710
|
+occAnalApp env (Var fun, args, ticks)
|
|
2662
|
2711
|
-- Account for join arity of runRW# continuation
|
|
2663
|
2712
|
-- See Note [Simplification of runRW#]
|
|
2664
|
2713
|
--
|
| ... |
... |
@@ -2863,7 +2912,11 @@ data OccEnv |
|
2863
|
2912
|
-- Invariant: no Id maps to an empty OccInfoEnv
|
|
2864
|
2913
|
-- See Note [Occurrence analysis for join points]
|
|
2865
|
2914
|
, occ_join_points :: !JoinPointInfo
|
|
2866
|
|
- }
|
|
|
2915
|
+
|
|
|
2916
|
+ , occ_local_lets :: IdSet -- Non-top-level non-rec-bound lets
|
|
|
2917
|
+ -- I tried making this field strict, but
|
|
|
2918
|
+ -- doing so slightly increased allocation
|
|
|
2919
|
+ }
|
|
2867
|
2920
|
|
|
2868
|
2921
|
type JoinPointInfo = IdEnv OccInfoEnv
|
|
2869
|
2922
|
|
| ... |
... |
@@ -2914,7 +2967,8 @@ initOccEnv |
|
2914
|
2967
|
|
|
2915
|
2968
|
, occ_join_points = emptyVarEnv
|
|
2916
|
2969
|
, occ_bs_env = emptyVarEnv
|
|
2917
|
|
- , occ_bs_rng = emptyVarSet }
|
|
|
2970
|
+ , occ_bs_rng = emptyVarSet
|
|
|
2971
|
+ , occ_local_lets = emptyVarSet }
|
|
2918
|
2972
|
|
|
2919
|
2973
|
noBinderSwaps :: OccEnv -> Bool
|
|
2920
|
2974
|
noBinderSwaps (OccEnv { occ_bs_env = bs_env }) = isEmptyVarEnv bs_env
|
| ... |
... |
@@ -3154,23 +3208,26 @@ postprocess_uds bndrs bad_joins uds |
|
3154
|
3208
|
| uniq `elemVarEnvByKey` env = plusVarEnv_C andLocalOcc env join_env
|
|
3155
|
3209
|
| otherwise = env
|
|
3156
|
3210
|
|
|
|
3211
|
+addLocalLet :: OccEnv -> TopLevelFlag -> Id -> OccEnv
|
|
|
3212
|
+addLocalLet env@(OccEnv { occ_local_lets = ids }) top_lvl id
|
|
|
3213
|
+ | isTopLevel top_lvl = env
|
|
|
3214
|
+ | otherwise = env { occ_local_lets = ids `extendVarSet` id }
|
|
|
3215
|
+
|
|
3157
|
3216
|
addJoinPoint :: OccEnv -> Id -> UsageDetails -> OccEnv
|
|
3158
|
|
-addJoinPoint env bndr rhs_uds
|
|
|
3217
|
+addJoinPoint env@(OccEnv { occ_join_points = join_points, occ_local_lets = local_lets })
|
|
|
3218
|
+ join_bndr (UD { ud_env = rhs_occs })
|
|
3159
|
3219
|
| isEmptyVarEnv zeroed_form
|
|
3160
|
3220
|
= env
|
|
3161
|
3221
|
| otherwise
|
|
3162
|
|
- = env { occ_join_points = extendVarEnv (occ_join_points env) bndr zeroed_form }
|
|
|
3222
|
+ = env { occ_join_points = extendVarEnv join_points join_bndr zeroed_form }
|
|
3163
|
3223
|
where
|
|
3164
|
|
- zeroed_form = mkZeroedForm rhs_uds
|
|
|
3224
|
+ zeroed_form = mapMaybeUniqSetToUFM do_one local_lets
|
|
|
3225
|
+ -- See Note [Occurrence analysis for join points] for "zeroed form"
|
|
3165
|
3226
|
|
|
3166
|
|
-mkZeroedForm :: UsageDetails -> OccInfoEnv
|
|
3167
|
|
--- See Note [Occurrence analysis for join points] for "zeroed form"
|
|
3168
|
|
-mkZeroedForm (UD { ud_env = rhs_occs })
|
|
3169
|
|
- = mapMaybeUFM do_one rhs_occs
|
|
3170
|
|
- where
|
|
3171
|
|
- do_one :: LocalOcc -> Maybe LocalOcc
|
|
3172
|
|
- do_one (ManyOccL {}) = Nothing
|
|
3173
|
|
- do_one occ@(OneOccL {}) = Just (occ { lo_n_br = 0 })
|
|
|
3227
|
+ do_one :: Var -> Maybe LocalOcc
|
|
|
3228
|
+ do_one bndr = case lookupVarEnv rhs_occs bndr of
|
|
|
3229
|
+ Just occ@(OneOccL {}) -> Just (occ { lo_n_br = 0 })
|
|
|
3230
|
+ _ -> Nothing
|
|
3174
|
3231
|
|
|
3175
|
3232
|
--------------------
|
|
3176
|
3233
|
transClosureFV :: VarEnv VarSet -> VarEnv VarSet
|
| ... |
... |
@@ -3628,7 +3685,12 @@ data LocalOcc -- See Note [LocalOcc] |
|
3628
|
3685
|
-- Combining (AlwaysTailCalled 2) and (AlwaysTailCalled 3)
|
|
3629
|
3686
|
-- gives NoTailCallInfo
|
|
3630
|
3687
|
, lo_int_cxt :: !InterestingCxt }
|
|
|
3688
|
+
|
|
3631
|
3689
|
| ManyOccL !TailCallInfo
|
|
|
3690
|
+ -- Why do we need TailCallInfo on ManyOccL?
|
|
|
3691
|
+ -- Answer: recursive bindings are entered many times:
|
|
|
3692
|
+ -- rec { j x = ...j x'... } in j y
|
|
|
3693
|
+ -- See the uses of `andUDs` in `tagRecBinders`
|
|
3632
|
3694
|
|
|
3633
|
3695
|
instance Outputable LocalOcc where
|
|
3634
|
3696
|
ppr (OneOccL { lo_n_br = n, lo_tail = tci })
|
| ... |
... |
@@ -3663,7 +3725,7 @@ instance Outputable UsageDetails where |
|
3663
|
3725
|
-- | TailUsageDetails captures the result of applying 'occAnalLamTail'
|
|
3664
|
3726
|
-- to a function `\xyz.body`. The TailUsageDetails pairs together
|
|
3665
|
3727
|
-- * the number of lambdas (including type lambdas: a JoinArity)
|
|
3666
|
|
--- * UsageDetails for the `body` of the lambda, unadjusted by `adjustTailUsage`.
|
|
|
3728
|
+-- * UsageDetails for the `body` of the lambda, /unadjusted/ by `adjustTailUsage`.
|
|
3667
|
3729
|
-- If the binding turns out to be a join point with the indicated join
|
|
3668
|
3730
|
-- arity, this unadjusted usage details is just what we need; otherwise we
|
|
3669
|
3731
|
-- need to discard tail calls. That's what `adjustTailUsage` does.
|
| ... |
... |
@@ -3681,8 +3743,17 @@ data WithTailUsageDetails a = WTUD !TailUsageDetails !a |
|
3681
|
3743
|
|
|
3682
|
3744
|
andUDs:: UsageDetails -> UsageDetails -> UsageDetails
|
|
3683
|
3745
|
orUDs :: UsageDetails -> UsageDetails -> UsageDetails
|
|
3684
|
|
-andUDs = combineUsageDetailsWith andLocalOcc
|
|
3685
|
|
-orUDs = combineUsageDetailsWith orLocalOcc
|
|
|
3746
|
+andUDs = combineUsageDetailsWith (\_uniq -> andLocalOcc)
|
|
|
3747
|
+orUDs = combineUsageDetailsWith (\_uniq -> orLocalOcc)
|
|
|
3748
|
+
|
|
|
3749
|
+combineJoinPointUDs :: OccEnv -> UsageDetails -> UsageDetails -> UsageDetails
|
|
|
3750
|
+-- See (W5) in Note [Occurrence analysis for join points]
|
|
|
3751
|
+combineJoinPointUDs (OccEnv { occ_local_lets = local_lets }) uds1 uds2
|
|
|
3752
|
+ = combineUsageDetailsWith combine uds1 uds2
|
|
|
3753
|
+ where
|
|
|
3754
|
+ combine uniq occ1 occ2
|
|
|
3755
|
+ | uniq `elemVarSetByKey` local_lets = orLocalOcc occ1 occ2
|
|
|
3756
|
+ | otherwise = andLocalOcc occ1 occ2
|
|
3686
|
3757
|
|
|
3687
|
3758
|
mkOneOcc :: OccEnv -> Id -> InterestingCxt -> JoinArity -> UsageDetails
|
|
3688
|
3759
|
mkOneOcc !env id int_cxt arity
|
| ... |
... |
@@ -3699,7 +3770,8 @@ mkOneOcc !env id int_cxt arity |
|
3699
|
3770
|
= mkSimpleDetails (unitVarEnv id occ)
|
|
3700
|
3771
|
|
|
3701
|
3772
|
where
|
|
3702
|
|
- occ = OneOccL { lo_n_br = 1, lo_int_cxt = int_cxt
|
|
|
3773
|
+ occ = OneOccL { lo_n_br = 1
|
|
|
3774
|
+ , lo_int_cxt = int_cxt
|
|
3703
|
3775
|
, lo_tail = AlwaysTailCalled arity }
|
|
3704
|
3776
|
|
|
3705
|
3777
|
-- Add several occurrences, assumed not to be tail calls
|
| ... |
... |
@@ -3786,7 +3858,7 @@ restrictFreeVars bndrs fvs = restrictUniqSetToUFM bndrs fvs |
|
3786
|
3858
|
-------------------
|
|
3787
|
3859
|
-- Auxiliary functions for UsageDetails implementation
|
|
3788
|
3860
|
|
|
3789
|
|
-combineUsageDetailsWith :: (LocalOcc -> LocalOcc -> LocalOcc)
|
|
|
3861
|
+combineUsageDetailsWith :: (Unique -> LocalOcc -> LocalOcc -> LocalOcc)
|
|
3790
|
3862
|
-> UsageDetails -> UsageDetails -> UsageDetails
|
|
3791
|
3863
|
{-# INLINE combineUsageDetailsWith #-}
|
|
3792
|
3864
|
combineUsageDetailsWith plus_occ_info
|
| ... |
... |
@@ -3796,9 +3868,9 @@ combineUsageDetailsWith plus_occ_info |
|
3796
|
3868
|
| isEmptyVarEnv env2 = uds1
|
|
3797
|
3869
|
| otherwise
|
|
3798
|
3870
|
-- See Note [Strictness in the occurrence analyser]
|
|
3799
|
|
- -- Using strictPlusVarEnv here speeds up the test T26425 by about 10% by avoiding
|
|
3800
|
|
- -- intermediate thunks.
|
|
3801
|
|
- = UD { ud_env = strictPlusVarEnv_C plus_occ_info env1 env2
|
|
|
3871
|
+ -- Using strictPlusVarEnv here speeds up the test T26425
|
|
|
3872
|
+ -- by about 10% by avoiding intermediate thunks.
|
|
|
3873
|
+ = UD { ud_env = strictPlusVarEnv_C_Directly plus_occ_info env1 env2
|
|
3802
|
3874
|
, ud_z_many = strictPlusVarEnv z_many1 z_many2
|
|
3803
|
3875
|
, ud_z_in_lam = plusVarEnv z_in_lam1 z_in_lam2
|
|
3804
|
3876
|
, ud_z_tail = strictPlusVarEnv z_tail1 z_tail2 }
|
| ... |
... |
@@ -3842,8 +3914,6 @@ lookupOccInfoByUnique (UD { ud_env = env |
|
3842
|
3914
|
| uniq `elemVarEnvByKey` z_tail = NoTailCallInfo
|
|
3843
|
3915
|
| otherwise = ti
|
|
3844
|
3916
|
|
|
3845
|
|
-
|
|
3846
|
|
-
|
|
3847
|
3917
|
-------------------
|
|
3848
|
3918
|
-- See Note [Adjusting right-hand sides]
|
|
3849
|
3919
|
|
| ... |
... |
@@ -3853,21 +3923,22 @@ adjustNonRecRhs :: JoinPointHood |
|
3853
|
3923
|
-- ^ This function concentrates shared logic between occAnalNonRecBind and the
|
|
3854
|
3924
|
-- AcyclicSCC case of occAnalRec.
|
|
3855
|
3925
|
-- It returns the adjusted rhs UsageDetails combined with the body usage
|
|
3856
|
|
-adjustNonRecRhs mb_join_arity rhs_wuds@(WTUD _ rhs)
|
|
3857
|
|
- = WUD (adjustTailUsage mb_join_arity rhs_wuds) rhs
|
|
3858
|
|
-
|
|
|
3926
|
+adjustNonRecRhs mb_join_arity (WTUD (TUD rhs_ja uds) rhs)
|
|
|
3927
|
+ = WUD (adjustTailUsage exact_join rhs uds) rhs
|
|
|
3928
|
+ where
|
|
|
3929
|
+ exact_join = mb_join_arity == JoinPoint rhs_ja
|
|
3859
|
3930
|
|
|
3860
|
|
-adjustTailUsage :: JoinPointHood
|
|
3861
|
|
- -> WithTailUsageDetails CoreExpr -- Rhs usage, AFTER occAnalLamTail
|
|
|
3931
|
+adjustTailUsage :: Bool -- True <=> Exactly-matching join point; don't do markNonTail
|
|
|
3932
|
+ -> CoreExpr -- Rhs usage, AFTER occAnalLamTail
|
|
3862
|
3933
|
-> UsageDetails
|
|
3863
|
|
-adjustTailUsage mb_join_arity (WTUD (TUD rhs_ja uds) rhs)
|
|
|
3934
|
+ -> UsageDetails
|
|
|
3935
|
+adjustTailUsage exact_join rhs uds
|
|
3864
|
3936
|
= -- c.f. occAnal (Lam {})
|
|
3865
|
3937
|
markAllInsideLamIf (not one_shot) $
|
|
3866
|
3938
|
markAllNonTailIf (not exact_join) $
|
|
3867
|
3939
|
uds
|
|
3868
|
3940
|
where
|
|
3869
|
3941
|
one_shot = isOneShotFun rhs
|
|
3870
|
|
- exact_join = mb_join_arity == JoinPoint rhs_ja
|
|
3871
|
3942
|
|
|
3872
|
3943
|
adjustTailArity :: JoinPointHood -> TailUsageDetails -> UsageDetails
|
|
3873
|
3944
|
adjustTailArity mb_rhs_ja (TUD ja usage)
|
| ... |
... |
@@ -3914,8 +3985,9 @@ tagNonRecBinder lvl occ bndr |
|
3914
|
3985
|
tagRecBinders :: TopLevelFlag -- At top level?
|
|
3915
|
3986
|
-> UsageDetails -- Of body of let ONLY
|
|
3916
|
3987
|
-> [NodeDetails]
|
|
3917
|
|
- -> WithUsageDetails -- Adjusted details for whole scope,
|
|
3918
|
|
- -- with binders removed
|
|
|
3988
|
+ -> WithUsageDetails -- Adjusted details for whole scope
|
|
|
3989
|
+ -- still including the binders;
|
|
|
3990
|
+ -- (they are removed by `addInScope`)
|
|
3919
|
3991
|
[IdWithOccInfo] -- Tagged binders
|
|
3920
|
3992
|
-- Substantially more complicated than non-recursive case. Need to adjust RHS
|
|
3921
|
3993
|
-- details *before* tagging binders (because the tags depend on the RHSes).
|
| ... |
... |
@@ -3925,32 +3997,21 @@ tagRecBinders lvl body_uds details_s |
|
3925
|
3997
|
|
|
3926
|
3998
|
-- 1. See Note [Join arity prediction based on joinRhsArity]
|
|
3927
|
3999
|
-- Determine possible join-point-hood of whole group, by testing for
|
|
3928
|
|
- -- manifest join arity M.
|
|
3929
|
|
- -- This (re-)asserts that makeNode had made tuds for that same arity M!
|
|
|
4000
|
+ -- manifest join arity MAr.
|
|
|
4001
|
+ -- This (re-)asserts that makeNode had made tuds for that same arity MAr!
|
|
3930
|
4002
|
unadj_uds = foldr (andUDs . test_manifest_arity) body_uds details_s
|
|
3931
|
|
- test_manifest_arity ND{nd_rhs = WTUD tuds rhs}
|
|
3932
|
|
- = adjustTailArity (JoinPoint (joinRhsArity rhs)) tuds
|
|
|
4003
|
+ test_manifest_arity ND{nd_rhs = WTUD (TUD rhs_ja uds) rhs}
|
|
|
4004
|
+ = assertPpr (rhs_ja == joinRhsArity rhs) (ppr rhs_ja $$ ppr uds $$ ppr rhs) $
|
|
|
4005
|
+ uds
|
|
3933
|
4006
|
|
|
|
4007
|
+ will_be_joins :: Bool
|
|
3934
|
4008
|
will_be_joins = decideRecJoinPointHood lvl unadj_uds bndrs
|
|
3935
|
4009
|
|
|
3936
|
|
- mb_join_arity :: Id -> JoinPointHood
|
|
3937
|
|
- -- mb_join_arity: See Note [Join arity prediction based on joinRhsArity]
|
|
3938
|
|
- -- This is the source O
|
|
3939
|
|
- mb_join_arity bndr
|
|
3940
|
|
- -- Can't use willBeJoinId_maybe here because we haven't tagged
|
|
3941
|
|
- -- the binder yet (the tag depends on these adjustments!)
|
|
3942
|
|
- | will_be_joins
|
|
3943
|
|
- , AlwaysTailCalled arity <- lookupTailCallInfo unadj_uds bndr
|
|
3944
|
|
- = JoinPoint arity
|
|
3945
|
|
- | otherwise
|
|
3946
|
|
- = assert (not will_be_joins) -- Should be AlwaysTailCalled if
|
|
3947
|
|
- NotJoinPoint -- we are making join points!
|
|
3948
|
|
-
|
|
3949
|
4010
|
-- 2. Adjust usage details of each RHS, taking into account the
|
|
3950
|
4011
|
-- join-point-hood decision
|
|
3951
|
|
- rhs_udss' = [ adjustTailUsage (mb_join_arity bndr) rhs_wuds
|
|
|
4012
|
+ rhs_udss' = [ adjustTailUsage will_be_joins rhs rhs_uds
|
|
3952
|
4013
|
-- Matching occAnalLamTail in makeNode
|
|
3953
|
|
- | ND { nd_bndr = bndr, nd_rhs = rhs_wuds } <- details_s ]
|
|
|
4014
|
+ | ND { nd_rhs = WTUD (TUD _ rhs_uds) rhs } <- details_s ]
|
|
3954
|
4015
|
|
|
3955
|
4016
|
-- 3. Compute final usage details from adjusted RHS details
|
|
3956
|
4017
|
adj_uds = foldr andUDs body_uds rhs_udss'
|
| ... |
... |
@@ -3969,9 +4030,9 @@ setBinderOcc occ_info bndr |
|
3969
|
4030
|
| otherwise = setIdOccInfo bndr occ_info
|
|
3970
|
4031
|
|
|
3971
|
4032
|
-- | Decide whether some bindings should be made into join points or not, based
|
|
3972
|
|
--- on its occurrences. This is
|
|
|
4033
|
+-- on its occurrences.
|
|
3973
|
4034
|
-- Returns `False` if they can't be join points. Note that it's an
|
|
|
4035
|
+-- all-or-nothing decision: if multiple binders are given, they are
|
|
3974
|
4036
|
-- assumed to be mutually recursive.
|
|
3975
|
4037
|
--
|
|
3976
|
4038
|
-- It must, however, be a final decision. If we say `True` for 'f',
|