
Hi,

I am learning iteratees, and as a starter project I wanted to use expat-enumerator to parse a 2 gigabyte XML file. I expected to be able to do what SAX does in Java, i.e. to avoid loading the whole 2 gigabytes into memory.

As a warm-up, I wrote an iteratee to count the lines in a file, and it does load the whole file into memory! After profiling, I see that the problem is Data.Enumerator.Text.utf8: it allocates up to 60 megabytes when run on a 40 megabyte test file.

Any suggestions on how to fix Text.utf8, or on what people do to parse UTF-8 encoded text files with iteratees?

Thanks! Here are my code and profiling results:

http://i.imgur.com/XEI1v.png
http://hpaste.org/46037/counting_lines_with_iteratees
http://hpaste.org/46038/counting_lines_with_iteratees