
Hi Ryan, thanks for a nice and thorough explanation. I had trouble understanding the section of the tutorial as well. Maybe it would deserve to rewrite to something a bit simpler? Anyhow, I'd like to ask: Is there a reason for which pattern matching for single-constructor data types isn't lazy by default? For example, why do I have to write
weird ~(x,y) = (1,x) crazy = weird crazy
when the compiler knows that the type (x,y) has just the single constructor, and could automatically introduce the additional laziness without any risk? Petr
Hi nemo. I had a lot of trouble with that section of the tutorial as well, and I believe that once you get it, a lot of Haskell becomes a lot simpler.
The way I eventually figured it out is using this idealized execution model for Haskell: you just work by rewriting the left side of a function to its right side. The only question is figuring out *which* function to rewrite!
Here's a simpler example:
f 0 = 0 f x = x+1
g (x:xs) = error "urk" g [] = 2 const a b = a
ex1 = const (f 1) (g [2,3]) ex2 = f (const (g []) (g [1,2]))
Lets say you wanted to know the value of ex1; you just use rewriting
ex1 -- rewrite using ex1 = const (f 1) (g [2,3]) -- rewrite using const = f 1 -- rewrite using f (second pattern) = 1+1 -- rewrite using + = 2
But lets say we picked a different order to rewrite...
ex1 -- rewrite using ex1 = const (f 1) (g [2,3]) -- rewrite using g = const (f 1) (error "urk") -- rewrite using error computation stops
Of course this is bad, and it was obviously wrong to evaluate g first (because const is lazy). So one heuristic we always use is "rewrite the leftmost application first" which avoids this problem. But lets try that rule on ex2!
ex2 -- rewrite using ex2 = f (const (g []) (g [1,2])) -- rewrite using f = ?
Unfortunately, now we don't know which pattern to match! So we have to pick a different thing to rewrite. The next rule is that, if the thing you want to rewrite has a pattern match, look at the argument for the patterns being matched and rewrite them instead, using the same "leftmost first" rule:
f (const (g []) (g [1,2])) -- trying to pattern match f's first argument -- rewrite using const = f (g []) -- still pattern matching -- rewrite using g = f 2 -- now we can match -- rewrite using f (second pattern) = 2+1 -- rewrite using + = 3
So, back to the original question (I'm rewriting the arguments to "client" and "server" for clarity)
reqs = client init resps resps = server reqs client v (rsp:rsps) = v : client (next rsp) rsps server (rq:rqs) = process rq : server rqs
Lets say we are trying to figure out the value of "resps", to print all the responses to the screen:
resps -- rewrite using resps = server reqs -- pattern match in server -- rewrite reqs = server (client init resps) -- pattern match in server -- pattern match also in client -- rewrite using resps = server (client init (server reqs)) -- pattern match in server, client, then server -- rewrite using reqs = server (client init (server (client init resps)))
You see that we are in a loop now; we are stuck trying to pattern match and we will never make any progress!
The "lazy pattern" says "trust me, this pattern will match, you can call me on it later if it doesn't!"
reqs = client init resps resps = server reqs client v (rsp:rsps) = v : client (next rsp) rsps server (rq:rqs) = process rq : server rqs
resps -- rewrite resps = server reqs -- server wants to pattern match, rewrite reqs = server (client init resps) -- Now, the lazy pattern match says "this will match, wait until later" -- rewrite using client! = let (rsp:rsps) = resps in server (init : client (next rq) rqs) -- rewrite using server = let (rsp:rsps) = resps in process init : server (client (next rq) rqs)
We now have a list node, so we can print the first element and continue (which requires us to know the code for "process" and "next", but you get the idea, I hope!)
Now of course, you can lie to the pattern matcher:
next x = x + 1 init = 5
client init [] -- rewrite using client = let (rsp0:rsps0) = [] in init : client (next rsp0) rsps0 -- rewrite using init = let (rsp0:rsps0) = [] in 5 : client (next rsp0) rsps0
-- print 5 and continue to evaluate... let (rsp0:rsps0) = [] in client (next rsp0) rsps0 -- rewrite using client = let (rsp0:rsps0) = [] (rsp1:rsps1) = rsps0 in (next rsp0) : client (next rsp1) rsps1 -- rewrite using next = let (rsp0:rsps0) = [] (rsp1:rsps1) = rsps0 in rsp0+1 : client (next rsp1) rsps1 -- + wants to "pattern match" on its first argument -- rewrite using rsp0 computation stops, pattern match failure, (rsp0:rsps0) does not match []
For this reason, many people (myself included) consider it bad style to use lazy/irrefutable pattern matches on data types with more than one constructure. But they are very handy on types with a single constructor, like pairs:
weird ~(x,y) = (1,x) crazy = weird crazy
crazy -- rewrite crazy = weird crazy -- rewrite weird = let (x0,y0) = crazy in (1, x0) -- We really want to know what x0 is now. -- But it is the result of a pattern match inside a let; so we need to evaluate -- the right hand side of the binding to see if the patterns match. -- So, rewrite crazy to attempt to pattern match... = let (x0, y0) = weird crazy in (1, x0) -- Then, rewrite weird = let (x1, y1) = crazy (x0, y0) = (1, x1) in (1, x0) -- rewrite x0 = let (x1, y1) = crazy (x0, y0) = (1, x1) in (1, 1) -- garbage collect = (1,1)
I hope this helps!
-- ryan