
On Dec 29, 2006, at 6:32 PM, Matthew Brecknell wrote:
breakUp s
  | L.null s  = []
  | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s
Running this on the 2G file blows up the stack pretty quickly. Taking just the first 1 million records (there are 20M of them) with a big stack parameter gives about 25% productivity, with GC taking the other 75%.
My understanding is that while this looks tail recursive, it isn't really, because of laziness. I've tried throwing seq operators around, but they don't seem to do much to improve the efficiency.
This function by itself doesn't really have any particular behaviour with respect to stack and heap usage, since it is just a linear mapping from one lazy sequence to another. To understand the stack blowout, you'll need to look at what you are doing with the result of this function.
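To illustrate the point that the consumer, not breakUp itself, drives memory behaviour, here is a small self-contained sketch (the input and the take-3 consumer are made up for illustration; breakUp is specialised to lazy ByteStrings as in the original):

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

breakUp :: L.ByteString -> [L.ByteString]
breakUp s
  | L.null s  = []
  | otherwise = h : breakUp r
  where (h, r) = L.splitAt 72 s

-- The result list is produced on demand: taking 3 chunks forces only
-- the first 3 * 72 bytes, regardless of how large the input is.
main :: IO ()
main = print (length (take 3 (breakUp (L.replicate 100000 'x'))))
```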
For example, a foldl over the result might be building a big thunk on the heap, which could blow the stack when evaluated*. If this is the case, you might avoid building the big thunk by using foldl', a version of foldl which evaluates as it folds.
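A minimal sketch of the foldl/foldl' difference (the summing example is illustrative, not from the thread):

```haskell
import Data.List (foldl')

-- foldl would build the whole thunk ((((0+1)+2)+3)+...) on the heap
-- first, and forcing it at the end can blow the stack for long lists.
-- foldl' forces the accumulator at each step, so it runs in constant
-- space over the same input.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumStrict [1 .. 1000000])
```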
I guess the consumer's really important (it didn't even occur to me; I was concentrating on how I was generating the list). Trying to de-lazy the list, I did the following:

  bs    = [...]
  recs' = take 1000000 $ breakUp bs
  recs  = foldr seq recs' recs'
  print $ length recs

Would the fold be blowing the stack?

Ranjan