
I started with the obvious main = getContents >>= print . tokenise where tokenise maps its way down a list of characters. This is very simple, very pleasant, and worked like a charm. However, the language has an INCLUDE directive, so I'm going to have to call readFile or something in the middle of tokenising, so the main tokeniser loop can't be a pure String -> [Token] function any more.
What about
tokenise :: [String] -> ([Token],[FilePath]) main = print . fst =<< mfix process where process (tokens,paths) = do mainContents <- getContents includes <- mapM readFile paths return $ tokenise $ mainContents : includes
I guess, it would be useful to replace ([Token],[FilePath]) with Writer [FilePath] [Token]
Method 1A (pure list processing) main = getContents >>= print . doit 0 doit n ('\n':cs) = doit (n+1) cs doit n ( _ :cs) = doit n cs doit n [] = n
I think, you should use something like (doit $! n+1) cs here.
In *retrospect*, it is really obvious why this was necessary, but I must say that in *prospect* I wasn't expecting it.
In fact, I was expecting this to be an issue even for 1A. I suppose, GHC is smart enough to suppress lazyness in the first method.