
On Fri, 18 May 2007 01:52:11 -0700, Jules Bean wrote:
> A conversation on #haskell just showed that it's quite hard to explain (at least it is for me) to people attached to tail recursion why that is a red herring in Haskell.
> If someone has the energy and the clarity to explain this clearly and make a wiki page about it, that would be great. If someone finds an existing wiki page I'm too unperceptive to notice, even better!
Here's an explanation that I put together a while ago for a friend (and forgive my pseudo-coding):

-------------------------------------------------

I guess the thing with Haskell is that the complication is in the details and behavior of your program. It's very easy to shoot yourself in the foot without even knowing. The basics of parsers are very easy to understand. For instance:

   hello = char 'h' >> char 'e' >> char 'l' >> char 'l' >> char 'o'

The 'do' notation is just syntactic sugar for that, which sometimes makes these more readable:

   hello = do char 'h'
              char 'e'
              char 'l'
              char 'l'
              char 'o'

That is a very simple parser that is easy to understand. I mean, any monad can be seen as just a way to enforce order of evaluation (so that in this case the letters are recognized in the proper order) and to hide some of the parameters to the functions (so that you don't have to explicitly pass the input string around in those parsers). That's really all there is to it.

Then there's the concept of combinators: functions that take parsers as arguments and generate a new parser. '>>' is part of the monad definition, but it is also a combinator in its own right (take two parsers and generate a new parser which is the sequence of them). The alternative (call it '<|>') is also a combinator:

   many1hellos = hello >> (many1hellos <|> return ())

'return a' is also part of the definition of any monad. It is an operation that always succeeds and just returns the value, in this case '()', the Haskell equivalent of 'void'. So if the recursive 'many1hellos' fails because the next 'hello' couldn't be matched, the alternative calls it good anyway. That's the backtracking you find in all these parsers.

The '<|>' combinator knows about the guts of the monad (about the internal parser state), of course, so it can do the very interesting things it does. It's a primitive combinator. Something kinda like this (hope you're already comfortable with the 'do' notation):

   a <|> b = do
       oldState <- getCurrentParserState
       resultFromA <- try a
       if failed resultFromA
         then do
           -- Try the other one
           setCurrentParserState oldState
           b
         else do
           -- oldState gets forgotten here; it will be garbage-collected
           return (extractResultValue resultFromA)

Once you have enough of those primitives, you can build your own combinators, which work in exactly the same way. That's where things start to go south, though, and the foot-shooting begins. Take for instance:

   many1 p = p >> (many1 p <|> return ())

a combinator that takes any parser (just like, for instance, 'hello') and parses one or more of them (it exists in the standard library). That works. So far so good. But then maybe we want one that matches zero or more:

   many0 p = (p <|> return ()) >> (many0 p <|> return ())

That's a silly example that is totally wrong, obviously (let me know if you don't see it). This parser will never finish parsing. It's an infinite loop. All grammar languages have rough corners like this one, of course, but with Haskell they can be very hard to track down.

   many0 p = (p >> (many0 p <|> return ())) <|> return ()

This one is also an infinite loop, although it is harder to figure out.

   many0 p = (p >> (many1 p <|> return ())) <|> return ()

   many0 p = many1 p <|> return ()

Those two are both identical and correct; the last one is just more concise.
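By the way, if you want to actually run any of these, here is a minimal toy parser monad in the spirit of the pseudocode above. It's only a sketch: the whole parser state is just the remaining input, failure is 'Nothing', and the names 'P' and 'runP' are mine. Real libraries (Parsec and friends) also carry source positions and error messages, but the shape is the same:

   import Control.Monad (liftM, ap)

   -- The "parser state" is just the remaining input;
   -- Nothing means the parse failed.
   newtype P a = P { runP :: String -> Maybe (a, String) }

   instance Monad P where
     return x = P (\s -> Just (x, s))
     m >>= f  = P (\s -> case runP m s of
                           Nothing      -> Nothing
                           Just (x, s') -> runP (f x) s')

   instance Functor P where fmap = liftM
   instance Applicative P where
     pure  = return
     (<*>) = ap

   -- Match one specific character.
   char :: Char -> P Char
   char c = P (\s -> case s of
                       (x:xs) | x == c -> Just (c, xs)
                       _               -> Nothing)

   -- Backtracking alternative: if 'a' fails, run 'b' on the original
   -- input 's'.  Note that 's' (the "oldState" from the pseudocode)
   -- has to stay alive for as long as 'a' is still running.
   (<|>) :: P a -> P a -> P a
   a <|> b = P (\s -> case runP a s of
                        Nothing -> runP b s
                        ok      -> ok)

   many1, many0 :: P a -> P ()
   many1 p = p >> (many1 p <|> return ())
   many0 p = many1 p <|> return ()

   -- runP (many0 (char 'X')) "XXXY"  ==>  Just ((), "Y")
   -- runP (many1 (char 'X')) "Y"     ==>  Nothing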
But their behavior is poor. I mean, if you try to parse a 1 GB file full of 'X' using "many0 (char 'X')", the 'many1 p' part will take a long time to parse, and in the meantime the 'oldState' variable that I showed you above in the definition of '<|>', which is required in some form in order to do the backtracking, is fully alive (cannot be garbage-collected), even though after the first 'X' is parsed _we_ know the parse can never fail. That is a bad thing, but much worse is the behavior of 'many1'. In this case, the recursion cannot be turned into a tail call, so you'll end up running out of stack. Basically, you end up with a nested sequence of '<|>' calls. That sort of problem is endemic to backtracking parsers, and all monadic parsers I know are backtracking parsers.

The solution in this case is to never put anything expensive on the left side of a '<|>' combinator. We can do that like so:

   many0 p = do
       result <- try p
       if failed result
         then return ()
         else many0 p

This way, there is no backtracking, and many1 can be efficiently defined like so:

   many1 p = p >> many0 p

That is all somewhat similar to the issues you'd find in a recursive-descent parser written in any imperative language. But then again, monads are very much like somewhat-restricted imperative programs.

The point where it all gets really tricky is that you can find similar sorts of problems in non-monadic code, and those are really hard to find. For instance, take the following function from the standard library (I hope you're somewhat acquainted with Haskell lists):

   replicate 0 v = []
   replicate n v = v : replicate (n-1) v

That's a neat little function which builds a list by repeating the 'v' element 'n' times. Very simple. But it is not tail-recursive! The imperative way of thinking about it is that, after calling 'replicate (n-1) v', it takes the result and prepends 'v' to it. Clearly, that prevents the tail recursion from happening. A fully tail-recursive version is this:

   replicate n v = replicateHelper n v []

   replicateHelper 0 v result = result
   replicateHelper n v result = replicateHelper (n-1) v (v : result)

To the imperative eye, that looks much better. The problem is that the tail-recursive version is less efficient in Haskell. This is because of the laziness inherent in the language. In the first version, the expression 'replicate (n-1) v' is a very compact thing. It'll be evaluated some day, but until then it occupies very little space. But in the tail-recursive version, the recursion _must_ happen all the way to the end before you can do anything with the list. As soon as you try to do anything with the list (like compare it with '[]' to see if it's empty), it will evaluate all the calls to 'replicateHelper'. It won't evaluate all the ':' operations, for instance, but it will still create an expression tree such as 'v : (v : (v : (v : ...)))', which is arbitrarily large. It has no choice, as the outermost 'v : ...' node of the expression tree is generated by the innermost call to 'replicateHelper'.

So in the first case, the comparison "replicate 1000000 'X' == []" returns 'False' immediately. But in the second case it will probably blow up due to lack of memory. The proper "Haskell" way to read that function (or any other function) is: "how quickly can you get to the outermost node of the actual expression generated by this function?" That will tell you how efficient it is, much better than anything else.

-------------------------------------------------

The moral is... tail recursion is an optimization for strict code only. Lazy code will do best to stay as far away from it as possible.

Hope there aren't too many typos, and also hope that helps someone.

   JCAB
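P.S. In case anyone wants to see the replicate comparison for real, here is the experiment as a small self-contained program. The names 'lazyReplicate' and 'strictReplicate' are mine, just to avoid clashing with the Prelude's 'replicate':

   -- The Prelude-style, non-tail-recursive version: the outermost
   -- (:) node is available immediately.
   lazyReplicate :: Int -> a -> [a]
   lazyReplicate 0 _ = []
   lazyReplicate n v = v : lazyReplicate (n-1) v

   -- The "fully tail-recursive" version with an accumulator.
   strictReplicate :: Int -> a -> [a]
   strictReplicate n v = helper n []
     where
       helper 0 result = result
       helper k result = helper (k-1) (v : result)

   main :: IO ()
   main = do
     -- (==) only needs the outermost node to answer False,
     -- so this prints immediately:
     print (lazyReplicate 1000000 'X' == [])
     -- Here all million iterations of 'helper' must run, building
     -- the whole list in memory, before (==) can see that node
     -- (crank n up and this one will run out of memory):
     print (strictReplicate 1000000 'X' == [])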