
On Wednesday 20 October 2010 01:23:49, Bastian Erdnüß wrote:
Suppose I would want to write a function rmap that acts like
rmap [f,g,h] x == [f x, g x, h x]
Do I gain any advantage or disadvantage (beside readability) from using
rmap = flip $ map . flip id
over
rmap [] _ = [] rmap (f:fs) x = f x : rmap fs x
?
I could imagine Haskell can optimize the first version better since it refers to the built in map function. But beside that, does Haskell struggle with the combinatory stuff in the first version? Or does it get optimized away?
Well, you can ask GHC what it does with the definitions. Let's compile module RMap where rmap :: [a -> b] -> a -> [b] rmap = flip $ map . flip id recmap :: [a -> b] -> a -> [b] recmap [] _ = [] recmap (f:fs) x = f x : recmap fs x and look at the generated Core (obtained via -ddump-simpl). The @ x_yz things are type annotations, otherwise Core is pretty close to Haskell. It takes a bit to get used to reading Core, but it's not that difficult (as long as the functions are short). First, without optimisations: ==================== Tidy Core ==================== Rec { RMap.recmap :: forall a_adg b_adh. [a_adg -> b_adh] -> a_adg -> [b_adh] GblId [Arity 2 NoCafRefs] RMap.recmap = \ (@ a_adA) (@ b_adB) (ds_dee :: [a_adA -> b_adB]) (ds1_def :: a_adA) -> case ds_dee of _ { [] -> GHC.Types.[] @ b_adB; : f_adk fs_adl -> GHC.Types.: @ b_adB (f_adk ds1_def) (RMap.recmap @ a_adA @ b_adB fs_adl ds1_def) } end Rec } RMap.rmap :: forall a_adi b_adj. [a_adi -> b_adj] -> a_adi -> [b_adj] GblId [] RMap.rmap = \ (@ a_adE) (@ b_adF) -> GHC.Base.$ @ (a_adE -> [a_adE -> b_adF] -> [b_adF]) @ ([a_adE -> b_adF] -> a_adE -> [b_adF]) (GHC.Base.flip @ a_adE @ [a_adE -> b_adF] @ [b_adF]) (GHC.Base.. @ ((a_adE -> b_adF) -> b_adF) @ ([a_adE -> b_adF] -> [b_adF]) @ a_adE (GHC.Base.map @ (a_adE -> b_adF) @ b_adF) (GHC.Base.flip @ (a_adE -> b_adF) @ a_adE @ b_adF (GHC.Base.id @ (a_adE -> b_adF)))) -- both are pretty exactly the code written in Haskell. The explicit recursion recmap is small and efficient, it looks at the list whether there are still functions left, if so, apply the first of those to the value and recur. No cruft here. The other one, flip $ map . flip id is, well it's small too, but it's horrible to look at (all those @s). Every time you invoke that function, you call flip twice, ($), map, (.) and id. If you use the function on short lists, those calls generate significant overhead, but for long lists that doesn't matter. Now with optimisations. It doesn't matter whether you use -O1 or -O2, both produce exactly the same Core for both functions. And, for recmap, they produce exactly the same Core as -O0 did. So, recmap is pretty good code, GHC doesn't know how to improve it. rmap on the other hand changed: RMap.rmap :: forall a_adi b_adj. [a_adi -> b_adj] -> a_adi -> [b_adj] GblId [Arity 2 NoCafRefs Str: DmdType SL] RMap.rmap = \ (@ a_adE) (@ b_adF) (x_aeq :: [a_adE -> b_adF]) (y_aer :: a_adE) -> GHC.Base.map @ (a_adE -> b_adF) @ b_adF (\ (y1_XeE :: a_adE -> b_adF) -> y1_XeE y_aer) x_aeq There, that looks much better than before (and for short lists, it performs much better, but for long lists, the difference should be negligible). flip, ($) and (.) have been inlined and eliminated, what we get is rmap fs x = map (\f -> f x) fs , which is really nice. In fact, it's even nicer than what we got for recmap, because the compiler knows map, there are rewrite rules, which can produce much better code when the function is used, for example, it's possible that the list of functions is completely eliminated and a use is rewritten to a direct loop instead of allocating a list cell for each function in the list. And it can profit from the rule map f . map g = map (f . g) which is much harder to detect for the direct recursion.
And how "expensive" are pattern matches on the other side, anyway?
Depends on the pattern, of course. Matching "oh" against [] is much cheaper than matching "This is an expensive car" against 'T':'h':'i':'s':' ':'a':'n':' ':'e':'x':'p':'e':'n':'s':'i':'v':'e':' ':'p':'a':'t':'t':'e':'r':'n':' ':'m':'a':'t':'c':'h':_ But generally, pattern matching is quite cheap. The (well, one) advantage of higher order functions like map is that they're easier to optimise when they're used, given that somebody has taught the compiler how.
Thanks for reading, Bastian