
john lask wrote:
test1 = readFile "big.dat" >>= (\x->print $ parse x) test2 = readFile "big.dat" >>= (\x->print $ fst $ parse x)
test1 (on a large file) will succeed in ghc but fail in hugs test2 on same file will succeed in both ghc and hugs
big.dat is just some large data file say 1MB. (not particularly large by todays standards!)
The question: is there any changes that can be made to the code to make test1 work in hugs without changing the essence of the function?
parse x = sqnc item x where
item =( \ ts -> case ts of [] -> ( Nothing, []) ts -> ( Just (head ts), tail ts) )
sqnc p ts = let ( r, ts' ) = p ts in case r of Nothing -> ([],ts') Just x -> let (r',ts'') = (sqnc p ts') in ( x:r', ts'' )
Strange, this shouldn't happen :) You may want to try item [] = (Nothing, []) item (t:ts) = (Just t , ts) but that shouldn't help ;) Let's try to find out what's going on by doing graph reduction with our bare hands. The preliminary material on http://en.wikibooks.org/wiki/Haskell/Graph_reduction should help a bit. Ideally, there would be tool support (hat? other debugger?) but when things become too complicated, tools can only keep you a few minutes longer above the water before drowning in complexity, too. The main point is that (print $ parse x) and (print $ fst $ parse x) differ in that the latter only computes the answer but not the remaining tokens. So, the stack overflow is triggered when evaluating the remaining tokens, but I don't see why. What happens for (print $ snd $ parse x) ? Let's rewrite your code to figure out what's going on item [] = (Nothing, []) item ts = (Just (head ts), tail ts) For sqnc , we need to translate stuff like let (a,b) = e in . Let-bound patterns aren't explained in the wikibook and in fact they're tricky. When done wrong, there may be space leaks, see also J. Sparud. Fixing Some Space Leaks without a Garbage Collector. http://citeseer.ist.psu.edu/sparud93fixing.html I don't know whether its implemented in Hugs (probably not?) and GHC (probably, but maybe with bugs?). We'll use the not so good translatation let (a,b) = e in e' <=> let x = e; a = fst x; b = snd x; in e' I'd like to call sqnc differently, namely many . We get many p ts = let z = p ts r = fst z ts' = snd z in case r of Nothing -> ([], ts') Just x -> let z' = many p ts' r' = fst z' ts''= snd z' in (x:r', ts'') Intimidating, no? :) Now, let's evaluate an example expression, like many item (1:2:3:...) (the list is intended to be finite, but we'll decide later about its length). To preserve space and stay sane, we'll only focus on the things that get evaluated and write ... for the rest. Let's start: many item (1:2:3:...) => let ts = 1:2:3:... in let ... z = item ts; r = fst z; ... in case r of ... => let ... z = (Just (head ts), tail ts); r = fst z ... => let ... z = (r, tail ts); r = Just (head ts) ... in case r of The above step is not clear from the description in the wikibook, but it's a handy notation of saying that the first component and r point to the same thing. Expanding the case expression yields (in full form) => let ts = 1:2:3: ... in let z = (r, tail ts) r = Just x x = head ts ts'= snd z in let z' = many item ts' r' = fst z' ts''= snd z' in (x:r', ts'') This is the weak head normal form of our expression. Of course, we wanted print (many item ts) = putStrLn (show ...) which means evaluating the first component and then the second component in the pair to full normal form. So, the next redex to be reduced is x followed by r' which forces z' which at least forces ts' => ... => let ts = x:ts' x = 1 ts' = 2:3:... in let z = (r, ts') r = Just x in let z' = let ... in (..,..) r' = fst z' ts''= snd z' in (x:r',ts'') To stay sane, we garbage collect z and r and rename variables before expanding the expression for z' which is obtained in the same way we obtained it before let ts0 = x0 : ts1 x0 = 1 ts1 = 2:3:... z0 = let z = (r, tail ts1) r = Just x x = head ts1 ts'= snd z in let z' = many item ts' r' = fst z' ts''= snd z' in (x:r', ts'') r0 = fst z0 us0 = snd z0 in (x0:r0, us0) Collecting lets and renaming yields let ts0 = x0 : ts1 x0 = 1 ts1 = 2:3:... z = (r, tail ts1) r = Just x1 x1 = head ts1 ts' = snd z z1 = many item ts' r1 = fst z1 us1 = snd z1 z0 = (x1:r1, us1) r0 = fst z0 us0 = snd z0 in (x0:r0, us0) The insight is that the original naming was bad, r and z are quite different from r0 and z0. Reducing r0 and x1 yields => let ts0 = x0 : ts1 x0 = 1 ts1 = x1 : ts2 x1 = 2 ts2 = 3:... z = (r, tail ts1) r = Just x1 ts' = snd z z1 = many item ts' r1 = fst z1 us1 = snd z1 z0 = (r0, us1) r0 = x1:r1 us0 = snd z0 in (x0:r0, us0) The general scheme should be clear now: z,r and ts' are temporary variables and further reduction of r1, r2 and so on leads to a chain let x0 = 1; ts0 = x0 : ts1 x1 = 2; ts1 = x1 : ts2 x2 = 3; ts2 = x2 : ts3 ... x8 = .. z = (r, tail ts8) r = Just x8 ts' = snd z z8 = many item ts' r8 = fst z8 us8 = snd z8 z7 = (r7, us8) r7 = x8:r8 us7 = snd z7 ... z0 = (r0, us1) r0 = x1:r1 us0 = snd z0 in (x0:r0, us0) So, after forcing the first component of the overall result to normal form, the result looks like (1:2:3:..., snd (_,snd (_,snd (_,...))) ) and it seems that Hugs fails to evaluate the tail recursive chain of snd ?? In the end, here's our decisive result: either Hugs or my analysis has a bug :D Regards, apfelmus