
Am Montag 11 Mai 2009 16:13:36 schrieb Michael Mossey:
In S. Thompson's book, problem 9.13 asks us to define 'init' in terms of foldr.
Check the thread starting at http://www.haskell.org/pipermail/haskell-cafe/2005-April/009562.html That contains several interesting approaches, though I don't think any of those was lazy enough to deal with infinite lists.
I was baffled at first because I didn't see a natural way to do this. It would look something like
init xs = foldr f initialValue xs
Since *FoldInit> init [] *** Exception: Prelude.init: empty list initialValue has to be (error "Prelude.init: empty list") if you don't do any post- processing. But then init [1] = f 1 (error "...") must be [], so f can't inspect its second argument (or it would return _|_ instead of []), but then you can't make init [1] = f 1 (error "...") return [] and also init [1,2] = f 1 (f 2 (error "...")) return [1]. So some post-processing is necessary.
where f would cons on each character except the rightmost.
f <when passed rightmost char> b = [] f <when passed any other char a> b = a : b
How does f "know" when it is passed the first character? initialValue has to signal this somehow. On #haskell, one person suggested doing it with some post-processing:
init xs = snd $ foldr f (True,[]) xs where f _ (True,_) = (False,[]) f a (False,b) = (False,a:b)
That gives init [] = [], which is not correct. The starting value must be (True, error "..."). It doesn't work on infinite lists, and will produce a stack overflow on sufficiently long lists. The problem is that pattern matching is strict, so to determine which brach to take in f 1 (init [2 .. n]) we must know whether the first component of init [2 .. n] is True or False. So we must look at f 2 (init [3 .. n]), same problem there... Before *anything* of the overall result can be determined, the whole list has to be traversed. We can fix these issues by making it lazier: vinit = snd . foldr f (True, error "Prelude.init: empty list") where f x y = (False, if fst y then [] else x:snd y) Here, the first component of f's result is determined before looking at f's arguments, thus to determine which branch to take in the second component, all that has to be done is check if y comes from an application of f or the initial value. Works for infinite lists, errors on empty lists, specs fulfilled.
I had an idea. If the initial value is the entire list, then its length can function as the "signal" that we are dealing with the rightmost char. This requires no post-processing:
init xs = foldr f xs xs where f a b | length b == length xs = []
| otherwise = a:b
That gives init [] = [], doesn't work on infinite lists and is badly inefficient. To fix at least two of these issues, you must do some post-processing, but that would lead again to something like above.
These seem contrived. I wonder if there is a more natural solution that Thompson had in mind. Any comments?
-Mike