
Sami Liedes wrote:
On Wed, Mar 24, 2010 at 11:23:05AM +0100, Heinrich Apfelmus wrote:
A quick fix would be to intersperse a function between sort and readRecords that imposes proper evaluation order: [...] While I expect the above to work, I hesitate to claim that it actually does. The reason is that I don't understand your readRecords function at a glance, unlike the processFile pipeline, it is not apparent how it works. Since finding and fixing space leaks is easier if your code is more obvious, I recommend to reformulate it as a function composition as well, for instance something like this: [...]
Thanks, that was enlightening! With your version of readRecords the program takes as much memory as with mine (it's still more than 10x what the C version takes, but perhaps that's the cost of using Haskell). But your version taught me a lot. Exactly the kind of feedback I needed :)
My pleasure. :) I should point out that my version of readRecords still needs to be composed with the strictify function, for the same reasons as yours. But it shouldn't be very difficult to bake the evaluation order into readRecords now if desired. Also note that my readRecord has introduced another potential space leak. But thanks to the clarified structure, it is easy to spot. Namely, the function prefix "Name: " &&& prefix "Version: " is problematic. Expanding the (&&&) combinator, this is equal to \xs -> (prefix "Name: " xs, prefix "Version: " xs) which means that the xs is used twice and thus potentially leaky. After all, the first component might evaluate xs in full, which will then leak around until the second component consumes it. This is exactly what will happen if you feed it a patological input file consisting of just a single but huge record along the lines of Version: xyz Name: haskell Foo: bar Foo: bar ... ... Foo: Bar If this turns out to be a major problem, you can simply rewrite the offending function to evaluate the components "in parallel". The partition function in the Haskell Prelude demonstrates how to do this. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com