List Fusion of concatMap

Hi, in further attempts to get a better understanding of list fusion, I am investigating under what conditions, calls to concat and concatMap succesfully fusion away. Here is my test code: import System.Exit func0 :: Int -> Bool func0 n = any (>5) [1..n] {-# NOINLINE func0 #-} func1 :: Int -> Bool func1 n = any (>5) (concatMap (\m -> [1..m]) [1..n]) {-# NOINLINE func1 #-} func2 :: Int -> Bool func2 n = any (>5) (concat (map (\m -> [1..m]) [1..n])) {-# NOINLINE func2 #-} func3 :: Int -> Bool func3 n = any (>5) [ i | m <- [1..n] , i <- [1..m]] {-# NOINLINE func3 #-} func4 :: Int -> Bool func4 n = any (>5) ( let ok m = [ i | i <- [1..m] ] in concatMap ok [1..n] ) {-# NOINLINE func4 #-} main = if func0 10 && func1 10 && func2 10 && func3 10 && func4 10 then exitSuccess else exitFailure I deliberately did not use any putStr statements in main, so that I can easily spot list types in the output of ghc -fext-core. The use of "any" and "[...]" represent any fusion-enabled producers or consumers; these are just very well-behaving and easy to spot in the core output. Here are my obeservations: * The code generated for func0 does not use any lists at all, but rather one recursive (even tail-recursive) function, as expected. This was basically a test whether I am indeed able to observe list fusion. * func1 and func2 are not fused successfully; both core outputs still contain mentions of [] (search for ZMZN; is there a tool that does this unescaping for me? ghc-core does not, it seems.) * Now the surprising fact: func3 does without any lists! The code generated contains two nested recursive functions, the inner one even tail recursive (and the outer one could have been made tail-recursive, it seems, but that would be a different issue). * func4, which is a direct one-step translation of func3 according to the semantic rules of list comprehension in the haskell report, again is not fused completely. Now I wonder: How do I need to phrase the function with concatMap such that fusion works? Does ghc treat list comprehensions differently here? Do the rewrite rules need work so that func1 and func2 do fuse correctly? Thanks for your attention, Joachim -- Joachim "nomeata" Breitner mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/

Hi, Am Donnerstag, den 01.12.2011, 21:44 +0100 schrieb Joachim Breitner:
Does ghc treat list comprehensions differently here?
I could answer this by looking at compiler/deSugar/DsListComp.lhs in the GHC source: List comprehensions may be desugared in one of two ways: ``ordinary'' (as you would expect if you read SLPJ's book) and ``with foldr/build turned on'' (if you read Gill {\em et al.}'s paper on the subject). (and indeed the translation depends on the flags) and later on: @dfListComp@ are the rules used with foldr/build turned on: TE[ e | ] c n = c e n TE[ e | b , q ] c n = if b then TE[ e | q ] c n else n TE[ e | p <- l , q ] c n = let f = \ x b -> case x of p -> TE[ e | q ] c b _ -> b in foldr f n l So I could manually rewrite func3 to: func4 :: Int -> Bool func4 k = any (>5) (build (\c n -> foldr (\x b -> case x of m -> foldr (\x' b' -> case x' of i -> c i b' ) b [1..m] ) n [1..k] )) {-# NOINLINE func4 #-} and get identical core output. Having a case expression does not matter in this case, because the code does all calculations completely with unboxed integers anyways, so this can be written as follows, with still identical core: func5 :: Int -> Bool func5 k = any (>5) (build (\c n -> foldr (\x b -> foldr c b [1..x]) n [1..k] )) {-# NOINLINE func5 #-} This would motivate the following definition for a fusionable concatMap, going via list comprehensions and their translation to ideal list fusion consumers/producers: concatMap f xs == [ y | x <- xs, y <- f x ] == build (\c n -> foldr (\x b -> foldr c b (f x)) n xs) And indeed, adding {-# RULES "myConcatMap" forall f xs . concatMap f xs = build (\c n -> foldr (\x b -> foldr c b (f x)) n xs) #-} to my file finally makes func1 behave the way I want it to, i.e. exactly the same core as the list comprehension variant, and no lists at all, only unboxed integers. Now I guess there is a reason why concatMap is not defined this way. But what is it? Greetings, Joachim -- Joachim "nomeata" Breitner mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/

Hi, Am Donnerstag, den 01.12.2011, 22:16 +0100 schrieb Joachim Breitner:
This would motivate the following definition for a fusionable concatMap, going via list comprehensions and their translation to ideal list fusion consumers/producers:
concatMap f xs == [ y | x <- xs, y <- f x ] == build (\c n -> foldr (\x b -> foldr c b (f x)) n xs)
And indeed, adding {-# RULES "myConcatMap" forall f xs . concatMap f xs = build (\c n -> foldr (\x b -> foldr c b (f x)) n xs) #-}
to my file finally makes func1 behave the way I want it to, i.e. exactly the same core as the list comprehension variant, and no lists at all, only unboxed integers.
Now I guess there is a reason why concatMap is not defined this way. But what is it?
I further tired to investigate where func2 (using "concat (map ..)") goes wrong, after all, we have this rule for concat: forall xs. concat xs = build (\c n -> foldr (\x y -> foldr c y x) n xs) which is pretty close to what I am proposing for concatMap above. I used "ghc -dverbose-core2core -ddump-rule-firings", but it seems that this output is not complete; e.g. at some point, RULES "eftInt" [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y) fires, but the output does not mention it. Anyway, I tried to reconstruct what is happening in better readable terms: We begin with func2 as given: func2 k = any (>5) (concat (map (\m -> [1..m]) [1..k])) rule "map" func2 k = any (>5) (concat (build (\c n -> foldr (mapFB c (\m -> [1..m])) n [1..k]))) rule "concat" func2 k = any (>5) (build (\c n -> foldr (\x y -> foldr c y x) n (build (\c n -> foldr (mapFB c (\m -> [1..m])) n [1..k])))) rule "fold/build" func2 k = any (>5) (build (\c n -> (\c n -> foldr (mapFB c (\m -> [1..m])) n [1..k]) (\x y -> foldr c y x) n)) rule "any/build" func2 k = (\c n -> (\c n -> foldr (mapFB c (\m -> [1..m])) n [1..k]) (\x y -> foldr c y x) n) ((||) . (>5)) False rule "eftInt" func2 k = (\c n -> (\c n -> foldr (mapFB c (\m -> [1..m])) n (build (\c n -> eftIntFB c n 1 k))) (\x y -> foldr c y x) k) ((||) . (>5)) False rule "fold/build" func2 k = (\c n -> (\c n -> (\c n -> eftIntFB c n 1 k) (mapFB c (\m -> [1..m])) n) (\x y -> foldr c y x) n) ((||) . (>5)) False beta-reduction func2 k = eftIntFB (mapFB (\x y -> foldr ((||) . (>5)) y x) (\m -> [1..m])) False 1 k At this point, if the definition of mapFB would be inlined, we could continue successfully as follows. This is not what is happening, but I am not sure why: func2 k = eftIntFB (\x ys -> (\x y -> foldr ((||) . (>5)) y x) ((\m -> [1..m]) x) ys) False 1 k beta-reduction func2 k = eftIntFB (\m ys -> (foldr ((||) . (>5)) ys [1..m])) False 1 k rule "eftInt" func2 k = eftIntFB (\m ys -> (foldr ((||) . (>5)) ys (build (\c n -> eftIntFB c n 1 m)))) False 1 k rule "fold/build" func2 k = eftIntFB (\m ys -> (eftIntFB ((||) . (>5)) ys 1 m)) False 1 k completely deforested code. What do you think? Can the list fusion rules be improved so that they can catch these cases as well? Greetings, Joachim -- Joachim "nomeata" Breitner mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/
participants (1)
-
Joachim Breitner