pattern match vs. pure function

Bastian Erdnüß

19 Oct 2010 19 Oct '10

11:23 p.m.

Suppose I would want to write a function rmap that acts like rmap [f,g,h] x == [f x, g x, h x] Do I gain any advantage or disadvantage (beside readability) from using rmap = flip $ map . flip id over rmap [] _ = [] rmap (f:fs) x = f x : rmap fs x ? I could imagine Haskell can optimize the first version better since it refers to the built in map function. But beside that, does Haskell struggle with the combinatory stuff in the first version? Or does it get optimized away? And how "expensive" are pattern matches on the other side, anyway? Thanks for reading, Bastian

Show replies by date

Daniel Fischer

20 Oct 20 Oct

12:23 a.m.

On Wednesday 20 October 2010 01:23:49, Bastian Erdnüß wrote:

...

Suppose I would want to write a function rmap that acts like

rmap [f,g,h] x == [f x, g x, h x]

Do I gain any advantage or disadvantage (beside readability) from using

rmap = flip $ map . flip id

over

rmap [] _ = [] rmap (f:fs) x = f x : rmap fs x

?

I could imagine Haskell can optimize the first version better since it refers to the built in map function. But beside that, does Haskell struggle with the combinatory stuff in the first version? Or does it get optimized away?

Well, you can ask GHC what it does with the definitions. Let's compile module RMap where rmap :: [a -> b] -> a -> [b] rmap = flip $ map . flip id recmap :: [a -> b] -> a -> [b] recmap [] _ = [] recmap (f:fs) x = f x : recmap fs x and look at the generated Core (obtained via -ddump-simpl). The @ x_yz things are type annotations, otherwise Core is pretty close to Haskell. It takes a bit to get used to reading Core, but it's not that difficult (as long as the functions are short). First, without optimisations: ==================== Tidy Core ==================== Rec { RMap.recmap :: forall a_adg b_adh. [a_adg -> b_adh] -> a_adg -> [b_adh] GblId [Arity 2 NoCafRefs] RMap.recmap = \ (@ a_adA) (@ b_adB) (ds_dee :: [a_adA -> b_adB]) (ds1_def :: a_adA) -> case ds_dee of _ { [] -> GHC.Types.[] @ b_adB; : f_adk fs_adl -> GHC.Types.: @ b_adB (f_adk ds1_def) (RMap.recmap @ a_adA @ b_adB fs_adl ds1_def) } end Rec } RMap.rmap :: forall a_adi b_adj. [a_adi -> b_adj] -> a_adi -> [b_adj] GblId [] RMap.rmap = \ (@ a_adE) (@ b_adF) -> GHC.Base.$ @ (a_adE -> [a_adE -> b_adF] -> [b_adF]) @ ([a_adE -> b_adF] -> a_adE -> [b_adF]) (GHC.Base.flip @ a_adE @ [a_adE -> b_adF] @ [b_adF]) (GHC.Base.. @ ((a_adE -> b_adF) -> b_adF) @ ([a_adE -> b_adF] -> [b_adF]) @ a_adE (GHC.Base.map @ (a_adE -> b_adF) @ b_adF) (GHC.Base.flip @ (a_adE -> b_adF) @ a_adE @ b_adF (GHC.Base.id @ (a_adE -> b_adF)))) -- both are pretty exactly the code written in Haskell. The explicit recursion recmap is small and efficient, it looks at the list whether there are still functions left, if so, apply the first of those to the value and recur. No cruft here. The other one, flip $ map . flip id is, well it's small too, but it's horrible to look at (all those @s). Every time you invoke that function, you call flip twice, ($), map, (.) and id. If you use the function on short lists, those calls generate significant overhead, but for long lists that doesn't matter. Now with optimisations. It doesn't matter whether you use -O1 or -O2, both produce exactly the same Core for both functions. And, for recmap, they produce exactly the same Core as -O0 did. So, recmap is pretty good code, GHC doesn't know how to improve it. rmap on the other hand changed: RMap.rmap :: forall a_adi b_adj. [a_adi -> b_adj] -> a_adi -> [b_adj] GblId [Arity 2 NoCafRefs Str: DmdType SL] RMap.rmap = \ (@ a_adE) (@ b_adF) (x_aeq :: [a_adE -> b_adF]) (y_aer :: a_adE) -> GHC.Base.map @ (a_adE -> b_adF) @ b_adF (\ (y1_XeE :: a_adE -> b_adF) -> y1_XeE y_aer) x_aeq There, that looks much better than before (and for short lists, it performs much better, but for long lists, the difference should be negligible). flip, ($) and (.) have been inlined and eliminated, what we get is rmap fs x = map (\f -> f x) fs , which is really nice. In fact, it's even nicer than what we got for recmap, because the compiler knows map, there are rewrite rules, which can produce much better code when the function is used, for example, it's possible that the list of functions is completely eliminated and a use is rewritten to a direct loop instead of allocating a list cell for each function in the list. And it can profit from the rule map f . map g = map (f . g) which is much harder to detect for the direct recursion.

...

And how "expensive" are pattern matches on the other side, anyway?

Depends on the pattern, of course. Matching "oh" against [] is much cheaper than matching "This is an expensive car" against 'T':'h':'i':'s':' ':'a':'n':' ':'e':'x':'p':'e':'n':'s':'i':'v':'e':' ':'p':'a':'t':'t':'e':'r':'n':' ':'m':'a':'t':'c':'h':_ But generally, pattern matching is quite cheap. The (well, one) advantage of higher order functions like map is that they're easier to optimise when they're used, given that somebody has taught the compiler how.

...

Thanks for reading, Bastian

Bastian Erdnüß

7:10 a.m.

On Oct 20, 2010, at 2:23, Daniel Fischer wrote:

...

On Wednesday 20 October 2010 01:23:49, Bastian Erdnüß wrote:

...
Suppose I would want to write a function rmap that acts like

rmap [f,g,h] x == [f x, g x, h x]

Do I gain any advantage or disadvantage (beside readability) from using

rmap = flip $ map . flip id

over

rmap [] _ = [] rmap (f:fs) x = f x : rmap fs x

?

I could imagine Haskell can optimize the first version better since it refers to the built in map function. But beside that, does Haskell struggle with the combinatory stuff in the first version? Or does it get optimized away?

Well, you can ask GHC what it does with the definitions. Let's compile

module RMap where

rmap :: [a -> b] -> a -> [b] rmap = flip $ map . flip id

recmap :: [a -> b] -> a -> [b] recmap [] _ = [] recmap (f:fs) x = f x : recmap fs x

and look at the generated Core (obtained via -ddump-simpl).

Cool. Didn't know that.

...

The @ x_yz things are type annotations, otherwise Core is pretty close to Haskell. It takes a bit to get used to reading Core, but it's not that difficult (as long as the functions are short).

I see. Well, still better then my hand writing ;-)

...

First, without optimisations:

[some good explinations]

Thanks!

...

Now with optimisations. [...]

There, that looks much better than before (and for short lists, it performs much better, but for long lists, the difference should be negligible). flip, ($) and (.) have been inlined and eliminated, what we get is

rmap fs x = map (\f -> f x) fs

, which is really nice. In fact, it's even nicer than what we got for recmap, because the compiler knows map, there are rewrite rules, which can produce much better code when the function is used, for example, it's possible that the list of functions is completely eliminated and a use is rewritten to a direct loop instead of allocating a list cell for each function in the list. And it can profit from the rule

map f . map g = map (f . g)

which is much harder to detect for the direct recursion.

Good news, seems like Haskell has no problem with combinators. As I hoped I could expect.

...

...
And how "expensive" are pattern matches on the other side, anyway?

Depends on the pattern, of course. Matching "oh" against [] is much cheaper than matching "This is an expensive car" against

'T':'h':'i':'s':' ':'a':'n':' ':'e':'x':'p':'e':'n':'s':'i':'v':'e':' ':'p':'a':'t':'t':'e':'r':'n':' ':'m':'a':'t':'c':'h':_

But generally, pattern matching is quite cheap. The (well, one) advantage of higher order functions like map is that they're easier to optimise when they're used, given that somebody has taught the compiler how.

I understand. Many thanks again for pointing out how I can test such things myself with -ddump-simple. Cheers, Bastian

Brent Yorgey

1:14 p.m.

On Wed, Oct 20, 2010 at 01:23:49AM +0200, Bastian Erdnüß wrote:

...

Suppose I would want to write a function rmap that acts like

rmap [f,g,h] x == [f x, g x, h x]

Do I gain any advantage or disadvantage (beside readability) from using

rmap = flip $ map . flip id

over

rmap [] _ = [] rmap (f:fs) x = f x : rmap fs x

?

By the way, I would write rmap as rmap fs x = map ($x) fs which is (in my opinion) many times more readable than flip $ map . flip id. In this case there's no particular advantage to a points-free definition. But using an existing recursive combinator (map) is a big win over writing out the recursion explicitly.

...

I could imagine Haskell can optimize the first version better since it refers to the built in map function. But beside that, does Haskell struggle with the combinatory stuff in the first version? Or does it get optimized away? And how "expensive" are pattern matches on the other side, anyway?

It's a fair question, and Daniel has already given a detailed answer. But honestly, I don't think this is the sort of thing you should really be worrying about. Write your programs in the most natural, elegant way you can think of, and trust that the compiler will try very hard to turn it into efficient code. If it later turns out to be too slow and profiling seems to indicate there's room for improvement, THEN you can start worrying about this sort of thing. -Brent

Patrick LeBoutillier

2:04 p.m.

Hi, On Wed, Oct 20, 2010 at 9:14 AM, Brent Yorgey wrote:

...

On Wed, Oct 20, 2010 at 01:23:49AM +0200, Bastian Erdnüß wrote:

...
Suppose I would want to write a function rmap that acts like

rmap [f,g,h] x == [f x, g x, h x]

Do I gain any advantage or disadvantage (beside readability) from using

rmap = flip $ map . flip id

A newbie question about this. I can figure out the type of "flip ($)" and I can work it from ghci: Prelude> :t flip flip :: (a -> b -> c) -> b -> a -> c Prelude> :t ($) ($) :: (a -> b) -> a -> b Prelude> :t flip ($) flip ($) :: b -> (b -> b1) -> b1 but I cannot understand how "flip id" is equivalent: Prelude> :t flip flip :: (a -> b -> c) -> b -> a -> c Prelude> :t id id :: a -> a Prelude> :t flip id flip id :: b -> (b -> c) -> c How does one get the type of "flip id" by applying "flip" to "id"? Thanks, Patrick

...

...
over

rmap [] _ = [] rmap (f:fs) x = f x : rmap fs x

?

By the way, I would write rmap as

rmap fs x = map ($x) fs

which is (in my opinion) many times more readable than flip $ map . flip id. In this case there's no particular advantage to a points-free definition. But using an existing recursive combinator (map) is a big win over writing out the recursion explicitly.

...
I could imagine Haskell can optimize the first version better since it refers to the built in map function. But beside that, does Haskell struggle with the combinatory stuff in the first version? Or does it get optimized away? And how "expensive" are pattern matches on the other side, anyway?

It's a fair question, and Daniel has already given a detailed answer. But honestly, I don't think this is the sort of thing you should really be worrying about. Write your programs in the most natural, elegant way you can think of, and trust that the compiler will try very hard to turn it into efficient code. If it later turns out to be too slow and profiling seems to indicate there's room for improvement, THEN you can start worrying about this sort of thing.

-Brent _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

-- ===================== Patrick LeBoutillier Rosemère, Québec, Canada

Daniel Fischer

2:28 p.m.

On Wednesday 20 October 2010 16:04:41, Patrick LeBoutillier wrote:

...

Prelude> :t flip flip :: (a -> b -> c) -> b -> a -> c Prelude> :t id id :: a -> a Prelude> :t flip id flip id :: b -> (b -> c) -> c

How does one get the type of "flip id" by applying "flip" to "id"?

flip :: (a -> b -> c) -> b -> a -> c id :: t -> t So for flip id, we must unify (t -> t) with (a -> b -> c), which is, fully parenthesized, (a -> (b -> c)). So t = a, as that's what appears left of the outermost (->). Also, t = (b -> c), as that's what appears right of the outermost (->). a = t = b -> c, so in flip id, id appears at the type (b -> c) -> (b -> c) and flip is used at the type ((b -> c) -> (b -> c)) -> b -> (b -> c) -> c The first is eaten by flip, leaving flip id :: b -> (b -> c) -> c

...

Thanks,

Patrick

HTH, Daniel

Patrick LeBoutillier

2:38 p.m.

Thanks Daniel, that exactly what I was looking for. Patrick On Wed, Oct 20, 2010 at 10:28 AM, Daniel Fischer wrote:

...

On Wednesday 20 October 2010 16:04:41, Patrick LeBoutillier wrote:

...
Prelude> :t flip flip :: (a -> b -> c) -> b -> a -> c Prelude> :t id id :: a -> a Prelude> :t flip id flip id :: b -> (b -> c) -> c

How does one get the type of "flip id" by applying "flip" to "id"?

flip :: (a -> b -> c) -> b -> a -> c

id :: t -> t

So for flip id, we must unify (t -> t) with (a -> b -> c), which is, fully parenthesized, (a -> (b -> c)).

So t = a, as that's what appears left of the outermost (->). Also, t = (b -> c), as that's what appears right of the outermost (->).

a = t = b -> c, so in

flip id,

id appears at the type (b -> c) -> (b -> c) and flip is used at the type

((b -> c) -> (b -> c)) -> b -> (b -> c) -> c

The first is eaten by flip, leaving

flip id :: b -> (b -> c) -> c

...
Thanks,

Patrick

HTH, Daniel

-- ===================== Patrick LeBoutillier Rosemère, Québec, Canada

Bastian Erdnüß

21 Oct 21 Oct

11:29 a.m.

On Oct 20, 2010, at 15:14, Brent Yorgey wrote:

...

But honestly, I don't think this is the sort of thing you should really be worrying about. Write your programs in the most natural, elegant way you can think of, and trust that the compiler will try very hard to turn it into efficient code. If it later turns out to be too slow and profiling seems to indicate there's room for improvement, THEN you can start worrying about this sort of thing.

I buy that. Cheers, Bastian

5385

Age (days ago)

5387

Last active (days ago)

List overview

Download

7 comments

4 participants

participants (4)

Bastian Erdnüß
Brent Yorgey
Daniel Fischer
Patrick LeBoutillier