
Juan Carlos Arevalo Baeza wrote:
Ok... I'm doing a lot of testing with monadic parsers. My current test consists of a Haskell program that parses a C++ source file and: [...] The problem is that my test program runs out of stack (not heap) on both GHC and Hugs when parsing a large-enough file (about 150K for GHC, a lot less for Hugs).
The classic cause of this problem in parsers is tracking line numbers. The problem is that line numbers have to be built but rarely get observed. This means that a line number like 33 will be represented by a thunk like this:

    1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1

This uses a lot of heap space (though it may not be too bad because of sharing) but, more seriously, evaluating it takes stack space proportional to the line number - hence the stack overflows you've been seeing.

The fix is to add strictness annotations to force evaluation of the line number every time you increment it. The easiest way to do this is to represent source locations by a data structure like this:

    data SrcLoc = MkSrcLoc FileName !Int !Int

and use "seq" (or "$!") in the function that gets the next token. For example, if the function that gets the next token looks like this:

    token :: Parser a
    token = Parser (\ (input, srcloc) ->
      case input of
        (t : new_input) -> return (t, (new_input, inc srcloc))
        ...
      )

you would change the 4th line to:

    (t : new_input) -> let srcloc' = inc srcloc
                       in srcloc' `seq` return (t, (new_input, srcloc'))

[A similar problem can happen in pretty-printers that track column numbers.]

--
Alastair Reid    reid@cs.utah.edu    http://www.cs.utah.edu/~reid/

ps You might want to have a look at this Prelude function:

    unlines :: [String] -> String
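For concreteness, here is a minimal, self-contained sketch of the strict location tracking described above. The Maybe-based Parser type and the names item, countAll, and Test.cpp are assumptions made for illustration; they are not the code from either message. The point it demonstrates is the same: the strict fields plus the `seq` force the location at every step, so the thunk chain never builds up, while removing them may reproduce the overflow on implementations with a small fixed stack.

    -- Sketch: strict source-location tracking in a toy character parser.
    module Main where

    type FileName = String

    -- The strict (!) fields force the line and column numbers to be
    -- evaluated whenever a MkSrcLoc is built, so no chain of (+1) thunks
    -- can accumulate behind them.
    data SrcLoc = MkSrcLoc FileName !Int !Int
      deriving Show

    -- Advance the location past one character.
    inc :: Char -> SrcLoc -> SrcLoc
    inc '\n' (MkSrcLoc f line _)   = MkSrcLoc f (line + 1) 1
    inc _    (MkSrcLoc f line col) = MkSrcLoc f line (col + 1)

    -- A toy parser: the state is the remaining input plus the location.
    newtype Parser a
      = Parser { runParser :: (String, SrcLoc) -> Maybe (a, (String, SrcLoc)) }

    -- Consume one character.  The `seq` forces the updated location at
    -- every step, even though nothing inspects it during the parse.
    item :: Parser Char
    item = Parser (\ (input, srcloc) ->
      case input of
        (c : rest) -> let srcloc' = inc c srcloc
                      in srcloc' `seq` Just (c, (rest, srcloc'))
        []         -> Nothing)

    -- Drive the parser over the whole input, counting consumed characters.
    -- The `seq` on the counter keeps this loop in constant stack space.
    countAll :: Int -> (String, SrcLoc) -> (Int, SrcLoc)
    countAll n st = n `seq`
      case runParser item st of
        Nothing       -> (n, snd st)
        Just (_, st') -> countAll (n + 1) st'

    main :: IO ()
    main = do
      let input    = unlines (replicate 100000 "int x;")  -- synthetic ~700K "source file"
          start    = MkSrcLoc "Test.cpp" 1 1
          (n, loc) = countAll 0 (input, start)
      putStrLn (show n ++ " characters consumed, final location: " ++ show loc)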