
Hi Edward,
On 12/01/2011 07:55 AM, Edward Z. Yang wrote:
Hello Hugo,
Can you do a heap profile (+RTS -hT, or maybe use one of the other options if you've got a profiling copy lying around)?
I have attached a profiling session (showing types). I am surprised to see that "[]" takes up so much of the heap. Where is this coming from? I need to analyse this more closely.
Try using smaller data if it's taking too long; usually the profile will still look the same, unless it's a particular type of input that is triggering bad behavior.
The case above is for test data that is about 1/5 of the original data.
There is not enough detail in your code for me to use my psychic debugging skills, unfortunately.
I know too little Haskell to interpret this correctly, so I think we still need your "psychic debugging skills" B-) Any idea how I can track down what is generating all those "[]"? Note that the (,,) seems to be the NGramTag data, which is basically used as a list (Zipper).
regards,
Hugo F.
Edward
Excerpts from Hugo Ferreira's message of Wed Nov 30 09:23:53 -0500 2011:
Hello,
On 11/29/2011 10:57 PM, Stephen Tetley wrote:
Hi Hugo
What is a POSTags and how big do you expect it to be?
type Token = String
type Tag = String
type NGramTag = (Token, Tag, Tag)
type POSTags = Z.Zipper NGramTag
Generally I'd recommend you first try to calculate the size of your data rather than try to strictify things, see Johan Tibell's very useful posts:
http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.ht...
http://blog.johantibell.com/2011/06/computing-size-of-hashmap.html
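As a rough illustration of the kind of size calculation suggested here, the sketch below estimates the heap footprint of a list of NGramTag values. It is only a back-of-the-envelope estimate under stated assumptions: a 64-bit GHC (8-byte words), the rule of thumb of 5 words per Char in a String with no sharing, 4 words for a 3-tuple and 3 words per list cons cell. The helper names (stringWords, ngramWords, listBytes) and the sample tags are made up for the example and do not come from the original code.

```haskell
module Main where

type Token = String
type Tag = String
type NGramTag = (Token, Tag, Tag)

-- Machine word size in bytes (assumption: 64-bit GHC).
wordBytes :: Int
wordBytes = 8

-- Rule-of-thumb upper bound: each Char of a String costs 5 words
-- (3 for the cons cell, 2 for the boxed Char), ignoring any sharing.
stringWords :: String -> Int
stringWords s = 5 * length s

-- A 3-tuple is one header word plus three pointers, plus its fields.
ngramWords :: NGramTag -> Int
ngramWords (tok, t1, t2) =
  4 + stringWords tok + stringWords t1 + stringWords t2

-- Holding the tuples in a list adds one 3-word cons cell per element.
listBytes :: [NGramTag] -> Int
listBytes xs = wordBytes * sum [3 + ngramWords x | x <- xs]

main :: IO ()
main =
  -- Hypothetical sample data, only to exercise the estimate.
  print (listBytes [("house", "NN", "NN"), ("the", "DT", "DT")])
```

Multiplying the per-element estimate by the expected number of tokens gives a figure to compare against the heap profile. A fully evaluated String can easily take tens of times the space of the same text on disk, so the in-memory size may be well above the file size.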
Going by the size of the String data, I am expecting a maximum of about 50 MB. Profiling (after a painful 80 minutes) shows:
total alloc = 20,350,382,592 bytes
Way too much.
Once you know the size of your data, you can decide if it is too big to comfortably work with in memory. If it is too big, you need to make sure you are streaming[*] it rather than forcing it into memory.
If POSTags is large, I'd be very concerned about the top line of updateState - reversing lists (or sorting them) simply doesn't play well with streaming.
The zipper does quite a bit of reversing and appending. I also need to reverse lists to retain the order of the characters (text). I also do sorting but I have eliminated this in the tests.
So my question: how can one "force" the reversing and append? Anyone?
TIA, Hugo F.
[*] Even in a lazy language like Haskell, streaming data isn't necessarily automatic.
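On the question of how to "force" a reverse or an append, one possible approach using only base is sketched below. It is a minimal illustration, not a diagnosis of the code in this thread: the names reverse', forceSpine, forceElems and appendForced are invented for the example. Note that seq only evaluates to weak head normal form, so for a tuple of Strings such as NGramTag the fields themselves stay unevaluated; forcing them completely would need something like force from Control.DeepSeq (deepseq package).

```haskell
module Main where

import Data.List (foldl')

-- Reverse with a strict left fold: the accumulator is forced to
-- weak head normal form at every step, so no thunk chain builds up.
reverse' :: [a] -> [a]
reverse' = foldl' (flip (:)) []

-- Walk the whole spine of the list (but not its elements) before
-- handing the list back.
forceSpine :: [a] -> [a]
forceSpine xs = length xs `seq` xs

-- Walk the spine and evaluate each element to weak head normal form.
forceElems :: [a] -> [a]
forceElems xs = foldr seq () xs `seq` xs

-- Append, then make sure the result is fully built rather than
-- left as a suspended (++).
appendForced :: [a] -> [a] -> [a]
appendForced xs ys = forceSpine (xs ++ ys)

main :: IO ()
main = do
  print (reverse' "abc")
  print (appendForced [1, 2 :: Int] [3])
  print (forceElems [1, 2, 3 :: Int])
```

Forcing in this way trades laziness for predictable memory use: the forced list is fully resident, so it helps against thunk build-up but does nothing to make the data stream.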
_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners