
On Friday 14 Oct 2005 3:17 pm, Ketil Malde wrote:
Hi all,
I have a program that uses hash tables to store word counts. It can use a few large hash tables, or many small ones. The problem is that it uses an inordinate amount of time in the latter case, and profiling (and running with +RTS -sstderr) shows it is GC that is causing it (accounting for up to 99% of the time(!)).
Is there any reason to expect this behavior?
Heap profiling shows that each hash table seems to incur a memory overhead of approximately 5K, but apart from that, I'm not able to find any leaks or unexpected space consumption.
Suggestions?
Well, you could use a StringMap: http://homepages.nildram.co.uk/~ahey/HLibs/Data.StringMap/

That lib is a bit lightweight, so it probably doesn't provide everything you need at the moment. But it's something I mean to get back to when I have some time, so if there's anything in particular you want, let me know and I'll give it some priority.

You certainly should not need anything like 5K overhead per map, and you don't have to work via the IO monad either (though you can use an MVar StringMap or something if you like).

Also, I seem to remember some thread about a problem with the Data.HashTable implementation and its space behaviour. Unfortunately I can't remember what the problem was and don't know if it's been fixed :-(

Regards
--
Adrian Hey
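[For reference, here is a minimal sketch of the pure-map approach suggested above, using Data.Map as a stand-in for StringMap since both offer a persistent, non-IO interface. The names `wordCounts` and `bump` are illustrative, not from the thread; the MVar part shows the "MVar StringMap" idea mentioned in the reply.]

```haskell
import qualified Data.Map as Map
import Data.List (foldl')
import Control.Concurrent.MVar (newMVar, modifyMVar_, readMVar)

-- One small pure map per document: no IO monad needed, and an empty
-- Data.Map is just a constructor, not a multi-KB mutable table.
wordCounts :: String -> Map.Map String Int
wordCounts = foldl' bump Map.empty . words
  where
    -- insertWith (+) adds 1 to an existing count, or inserts 1.
    bump m w = Map.insertWith (+) w 1 m

-- If mutable-style access is wanted, wrap the map in an MVar,
-- analogous to the "MVar StringMap" suggestion above.
main :: IO ()
main = do
  ref <- newMVar (Map.empty :: Map.Map String Int)
  modifyMVar_ ref (return . Map.insertWith (+) "the" 1)
  m <- readMVar ref
  print (Map.lookup "the" m)
```

One caveat relevant to the space behaviour discussed here: the lazy `insertWith (+)` builds up unevaluated `(+)` thunks for frequently seen words, so for large inputs the counts should be forced periodically (or a strict insert used) to keep the heap flat.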