
On Friday 14 October 2011, 02:00:39, Bas van Dijk wrote:
Hello,
Is there any reason other than history that foldl and foldl' are not defined in terms of foldr?
Worst-case efficiency.
If we define them in terms of foldr like:
foldl f z xs = foldr (\x y -> \z' -> let z'' = z' `f` x in y z'') id xs z {-# INLINE foldl #-}
foldl' f z xs = foldr (\x y -> \z' -> let !z'' = z' `f` x in y z'') id xs z {-# INLINE foldl' #-}
we can benefit from list fusion.
But if it doesn't happen, we get terrible code.
For example if we define sum as:
sum :: Num a => [a] -> a sum = foldl (+) 0
then building the following program with -O2:
fuse = sum (replicate 1000000 1 ++ replicate 5000 1 :: [Int])
yields the following totally fused core:
fuse :: Int fuse = case $wxs 1000000 0 of ww_ssn { __DEFAULT -> I# ww_ssn }
$wxs :: Int# -> Int# -> Int# $wxs = \ (w_ssg :: Int#) (ww_ssj :: Int#) -> case <=# w_ssg 1 of _ { False -> $wxs (-# w_ssg 1) (+# ww_ssj 1); True -> $wxs1_rsB 5000 (+# ww_ssj 1) }
$wxs1_rsB :: Int# -> Int# -> Int# $wxs1_rsB = \ (w_ss5 :: Int#) (ww_ss8 :: Int#) -> case <=# w_ss5 1 of _ { False -> $wxs1_rsB (-# w_ss5 1) (+# ww_ss8 1); True -> +# ww_ss8 1 }
If you try something less transparent than replicate and use a less convenient type than Int, you get core like =================================== FuseL.fuse_wild :: GHC.Integer.Type.Integer [GblId, Caf=NoCafRefs, Str=DmdType, Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=True, ConLike=True, Cheap=True, Expandable=True, Guidance=IF_ARGS [] 10 110}] FuseL.fuse_wild = GHC.Integer.Type.S# 1 Rec { go_rwH :: GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer [GblId, Arity=1, Caf=NoCafRefs, Str=DmdType S] go_rwH = \ (x_avj :: GHC.Integer.Type.Integer) -> let { y_abw [Dmd=Just L] :: GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer [LclId, Str=DmdType] y_abw = go_rwH (GHC.Integer.plusInteger x_avj FuseL.fuse_wild) } in case x_avj of wild_dvr { GHC.Integer.Type.S# i_dvt -> case GHC.Prim.># i_dvt 5000 of _ { GHC.Types.False -> \ (z'_abx :: GHC.Integer.Type.Integer) -> y_abw (GHC.Integer.plusInteger z'_abx wild_dvr); GHC.Types.True -> GHC.Base.id @ GHC.Integer.Type.Integer }; GHC.Integer.Type.J# s_dvG d_dvH -> case {__pkg_ccall_GC integer-gmp integer_cmm_cmpIntegerIntzh GHC.Prim.Int# -> GHC.Prim.ByteArray# -> GHC.Prim.Int# -> GHC.Prim.Int#}_dvL s_dvG d_dvH 5000 of wild2_dvO { __DEFAULT -> case GHC.Prim.># wild2_dvO 0 of _ { GHC.Types.False -> \ (z'_abx :: GHC.Integer.Type.Integer) -> y_abw (GHC.Integer.plusInteger z'_abx wild_dvr); GHC.Types.True -> GHC.Base.id @ GHC.Integer.Type.Integer } } } end Rec } n_rwI :: GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer [GblId, Str=DmdType] n_rwI = go_rwH FuseL.fuse_wild Rec { FuseL.fuse_go [Occ=LoopBreaker] :: GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer [GblId, Arity=1, Str=DmdType S] FuseL.fuse_go = \ (x_avj :: GHC.Integer.Type.Integer) -> let { y_abw [Dmd=Just L] :: GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer [LclId, Str=DmdType] y_abw = FuseL.fuse_go (GHC.Integer.plusInteger x_avj FuseL.fuse_wild) } in case x_avj of wild_dvr { GHC.Integer.Type.S# i_dvt -> case GHC.Prim.># i_dvt 1000000 of _ { GHC.Types.False -> \ (z'_abx :: GHC.Integer.Type.Integer) -> y_abw (GHC.Integer.plusInteger z'_abx wild_dvr); GHC.Types.True -> n_rwI }; GHC.Integer.Type.J# s_dvG d_dvH -> case {__pkg_ccall_GC integer-gmp integer_cmm_cmpIntegerIntzh GHC.Prim.Int# -> GHC.Prim.ByteArray# -> GHC.Prim.Int# -> GHC.Prim.Int#}_dvL s_dvG d_dvH 1000000 of wild2_dvO { __DEFAULT -> case GHC.Prim.># wild2_dvO 0 of _ { GHC.Types.False -> \ (z'_abx :: GHC.Integer.Type.Integer) -> y_abw (GHC.Integer.plusInteger z'_abx wild_dvr); GHC.Types.True -> n_rwI } } } end Rec } FuseL.fuse2 :: GHC.Integer.Type.Integer -> GHC.Integer.Type.Integer [GblId, Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False, ConLike=False, Cheap=False, Expandable=False, Guidance=IF_ARGS [] 20 0}] FuseL.fuse2 = FuseL.fuse_go FuseL.fuse_wild FuseL.fuse :: GHC.Integer.Type.Integer [GblId, Str=DmdType, Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False, ConLike=False, Cheap=False, Expandable=False, Guidance=IF_ARGS [] 20 0}] FuseL.fuse = FuseL.fuse2 FuseL.fuse1 =================================== for fuse = sum ([1 .. 1000000] ++ [1 .. 5000] :: [Integer]) You do NOT want that. Incidentally, the core generated for foldl and foldl' as defined above is fine, it's the inlining and failure to fuse well that wreaks havoc.