
breakUp s | L.null s  = []
          | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s
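For reference, a self-contained, compilable version of the above, assuming `L` is `Data.ByteString.Lazy.Char8` imported qualified (the 720-character input is just an illustrative size):

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Split a lazy ByteString into successive 72-byte chunks.
breakUp :: L.ByteString -> [L.ByteString]
breakUp s | L.null s  = []
          | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s

main :: IO ()
main = print (length (breakUp (L.replicate 720 'x')))  -- prints 10
```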
Running this on the 2G file blows the stack pretty quickly. Taking only the first 1 million records (there are 20M of them) with a big stack parameter gives about 25% productivity, with GC taking the other 75%.
My understanding is that while this looks tail recursive, it isn't really, because of laziness. I've tried throwing seq operators around, but they don't seem to do much to help the efficiency.
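One common reason sprinkling seq around doesn't help: seq evaluates only to weak head normal form, i.e. just far enough to reach the outermost constructor, so thunks inside a data structure survive. A minimal illustration (the pair here is only a stand-in for an accumulator with lazy insides):

```haskell
main :: IO ()
main = do
  -- seq stops at the outermost constructor: the pair itself is
  -- forced, but the undefined inside it is never touched.
  let p = (1 + 2 :: Int, undefined :: Int)
  p `seq` print (fst p)       -- prints 3
  -- Forcing a component would reach the thunk itself:
  -- snd p `seq` return ()    -- this line would crash
```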
This function by itself doesn't really have any particular behaviour with respect to stack and heap usage, since it is just a linear mapping from one lazy sequence to another. To understand the stack blowout, you'll need to look at what you are doing with the result of this function. For example, a foldl over the result might be building a big thunk on the heap, which could blow the stack when evaluated*. If this is the case, you might avoid building the big thunk by using foldl', a version of foldl which evaluates as it folds.

Invoking GHC with optimisation might also help, since the strictness analyser can often do the equivalent of converting foldl to foldl' in the appropriate places.

Have you looked at the Performance pages on the wiki, in particular the Strictness and Laziness pages?

http://www.haskell.org/haskellwiki/Performance

In my (limited) Haskell experience, I was continually being surprised by inexplicable stack blowouts until I spent a little time doing some focussed experiments, mainly involving folds over large lists. If you haven't done that, I would recommend it.

* Note that foldl is tail recursive, so clearly tail recursion doesn't necessarily result in a well-behaved loop in a lazy implementation!
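The foldl versus foldl' difference can be sketched in a few lines (foldl' lives in Data.List; the million-element list is just an illustrative size):

```haskell
import Data.List (foldl')

-- foldl builds the whole thunk chain (((0+1)+2)+3)+... on the heap;
-- evaluating that chain at the end is what can blow the stack.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- foldl' forces the accumulator at each step, so the fold runs in
-- constant space.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 1000000])  -- prints 500000500000
```

With optimisation on, GHC's strictness analysis will often rescue lazySum too, which is why such problems sometimes disappear under -O and reappear in GHCi.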