
Once I actually add a 'dist_fast_inlined_caller', that indirection disappears in the inlined code, just as it does for dist_fast itself.
dist_fast_inlined_caller :: UArr Double -> UArr Double -> Bool
dist_fast_inlined_caller p1 p2 = dist_fast_inlined p1 p2 > 2
However, in the simpl output for 'dist_fast_inlined_caller', the 'sumU' and 'zipWithU' still don't seem to be fused - Don?
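To make concrete what "fused" would mean here, this is a minimal sketch using plain lists rather than UArr. It is only an illustration of the two loop shapes, not uvector's implementation: the names sqDiff, distUnfused and distFused are hypothetical, and the squared-difference element function is an assumption about what Dist.hs passes to zipWithU.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Assumed element function (hypothetical; not taken from Dist.hs).
sqDiff :: Double -> Double -> Double
sqDiff x y = (x - y) * (x - y)

-- Unfused shape: zipWith materialises an intermediate list,
-- and sum then traverses that list in a second pass.
distUnfused :: [Double] -> [Double] -> Double
distUnfused xs ys = sqrt (sum (zipWith sqDiff xs ys))

-- Fused shape: a single loop over both inputs with a strict
-- accumulator and no intermediate structure.
distFused :: [Double] -> [Double] -> Double
distFused = go 0
  where
    go !acc (x:xs) (y:ys) = go (acc + sqDiff x y) xs ys
    go !acc _      _      = sqrt acc

main :: IO ()
main = do
  let p1 = [0, 0, 0]
      p2 = [1, 2, 2]
  -- Both variants compute the same distance; only the loop structure differs.
  print (distUnfused p1 p2 > 2, distFused p1 p2 > 2)
```

In a simpl dump, the unfused shape shows up as a separate call that builds the intermediate array followed by a fold over it, while the fused shape is one recursive worker that reads both inputs directly.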
All the 'seq's and so on should be unnecessary, and even so, I still get the expected fusion:
As I said, I don't get the fusion if I just add the function above to the original Dist.hs, export it and compile the module with '-c -O2 -ddump-simpl':

Dist.dist_fast_inlined_caller =
  \ (w1_s1nb :: Data.Array.Vector.UArr.UArr GHC.Types.Double)
    (w2_s1nc :: Data.Array.Vector.UArr.UArr GHC.Types.Double) ->
    case (Dist.$wzipWithU Dist.lvl2 w1_s1nb w2_s1nc)
         `cast` (trans
                   Data.Array.Vector.UArr.TFCo:R56:UArr
                   Data.Array.Vector.UArr.NTCo:R56:UArr
                 :: Data.Array.Vector.UArr.UArr GHC.Types.Double
                      ~
                    Data.Array.Vector.Prim.BUArr.BUArr GHC.Types.Double)
    of _
    { Data.Array.Vector.Prim.BUArr.BUArr ipv_s1lb ipv1_s1lc ipv2_s1ld ->
    letrec {
      $wfold_s1nN :: GHC.Prim.Double# -> GHC.Prim.Int# -> GHC.Prim.Double#
      LclId
      [Arity 2
       Str: DmdType LL]
      $wfold_s1nN =
        \ (ww_s1mZ :: GHC.Prim.Double#) (ww1_s1n3 :: GHC.Prim.Int#) ->
          case GHC.Prim.==# ww1_s1n3 ipv1_s1lc of _ {
            GHC.Bool.False ->
              $wfold_s1nN
                (GHC.Prim.+##
                   ww_s1mZ
                   (GHC.Prim.indexDoubleArray#
                      ipv2_s1ld (GHC.Prim.+# ipv_s1lb ww1_s1n3)))
                (GHC.Prim.+# ww1_s1n3 1);
            GHC.Bool.True -> ww_s1mZ
          }; } in
    case $wfold_s1nN 0.0 0 of ww_s1n7 { __DEFAULT ->
    GHC.Prim.>## (GHC.Prim.sqrtDouble# ww_s1n7) 2.0
    }
    }

Claus