
Hello Hugo,
Can you do a heap profile (+RTS -hT, or maybe use one of the other options if you've got a profiling copy lying around)? Try using smaller data if it's taking too long; usually the profile will still look the same, unless it's a particular type of input that is triggering bad behavior.
There is not enough detail in your code for me to use my psychic debugging skills, unfortunately.
Edward
Excerpts from Hugo Ferreira's message of Wed Nov 30 09:23:53 -0500 2011:
Hello,
On 11/29/2011 10:57 PM, Stephen Tetley wrote:
Hi Hugo
What is a POSTags and how big do you expect it to be?
type Token = String
type Tag = String
type NGramTag = (Token, Tag, Tag)
type POSTags = Z.Zipper NGramTag
Generally I'd recommend you first try to calculate the size of your data rather than try to strictify things, see Johan Tibell's very useful posts:
http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.ht...
http://blog.johantibell.com/2011/06/computing-size-of-hashmap.html
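A back-of-the-envelope version of that size calculation might look like the sketch below. It assumes a 64-bit GHC, no sharing of characters or strings, and the usual rule of thumb of one header word plus one word per constructor field. The function names and the average lengths are hypothetical, and the zipper's own overhead is left out.

```haskell
-- Rough heap-size estimate for a list of (Token, Tag, Tag) triples.
-- Assumptions (not from the original post): 64-bit GHC, no sharing,
-- one header word plus one word per constructor field.

wordBytes :: Int
wordBytes = 8

-- A String is a linked list of boxed Chars:
-- 3 words per (:) cell + 2 words per boxed Char = 5 words per character.
stringWords :: Int -> Int
stringWords n = 5 * n

-- One (Token, Tag, Tag) triple: 4 words for the tuple itself
-- plus the three strings it points to.
ngramWords :: Int -> Int -> Int -> Int
ngramWords tokLen tag1Len tag2Len =
  4 + stringWords tokLen + stringWords tag1Len + stringWords tag2Len

-- A list of n such triples adds 3 words per (:) cell.
-- avgTok and avgTag are hypothetical average lengths.
estimateBytes :: Int -> Int -> Int -> Int
estimateBytes n avgTok avgTag =
  wordBytes * n * (3 + ngramWords avgTok avgTag avgTag)
```

Under these assumptions, `estimateBytes 1000000 5 3` comes to 496,000,000 bytes, so a million short triples already need several hundred megabytes when held in memory as Strings.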
According to the size in String, I am expecting a maximum of 50 MB. Profiling (after a painful 80 minutes) shows:
total alloc = 20,350,382,592 bytes
Way too much.
Once you know the size of your data, you can decide if it is too big to comfortably work with in memory. If it is too big, you need to make sure you are streaming[*] it rather than forcing it into memory.
If POSTags is large, I'd be very concerned about the top line of updateState - reversing lists (or sorting them) simply doesn't play well with streaming.
The zipper does quite a bit of reversing and appending. I also need to reverse lists to retain the order of the characters (text). I also do sorting but I have eliminated this in the tests.
So my question: how can one "force" the reversing and append? Anyone?
TIA, Hugo F.
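On the question of forcing the reverse and the append, one possible approach is sketched below, using only base. The names forceList, reverse' and append' are made up for illustration and are not from Hugo's code or from any zipper library.

```haskell
import Data.List (foldl')

-- Force every element of a list to weak head normal form (WHNF),
-- which also forces the whole spine, then return the same list.
forceList :: [a] -> [a]
forceList xs = go xs `seq` xs
  where
    go []       = ()
    go (y : ys) = y `seq` go ys

-- Reverse with a strict left fold; each element is forced to WHNF
-- as it is consed onto the accumulator.
reverse' :: [a] -> [a]
reverse' = foldl' (\acc x -> x `seq` (x : acc)) []

-- Append, then walk the result so no (++) thunks remain.
append' :: [a] -> [a] -> [a]
append' xs ys = forceList (xs ++ ys)
```

The work only happens once the result itself is demanded, for example through `seq` or a bang pattern at the call site. WHNF also stops at the outermost constructor, so for a tuple such as (Token, Tag, Tag) the strings inside stay unevaluated; `force` from Control.DeepSeq (deepseq package) is the usual way to evaluate a value all the way down.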
[*] Even in a lazy language like Haskell, streaming data isn't necessarily automatic.
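A small, generic illustration of that footnote (unrelated to Hugo's actual code; both function names are invented for the example): the two functions below compute the same mean, but only the second can run in constant space.

```haskell
import Data.List (foldl')

-- Two passes: xs is shared between sum and length, so the whole
-- list must stay in memory until the second pass finishes.
meanTwoPass :: [Double] -> Double
meanTwoPass xs = sum xs / fromIntegral (length xs)

-- One pass with a strict accumulator: elements can be garbage-collected
-- once consumed, provided nothing else holds on to xs.
meanOnePass :: [Double] -> Double
meanOnePass xs = total / fromIntegral count
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    -- Force both components so no chain of (+) thunks builds up.
    step (s, n) x =
      let s' = s + x
          n' = n + 1
      in s' `seq` n' `seq` (s', n')
```

Reversing or sorting has the same effect as the two-pass version: the last element must be seen before the first result can be produced, so the whole input ends up resident.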
_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners