
Excerpts from Travis Erdman's message of Tue Apr 06 15:31:30 -0400 2010:
In my application, I have a collection of (only) about 200 objects. So from the surface, not large at all. However, each object in the collection is a large tree that might be up to 20 MB in size. The application pulls objects one at a time from the collection, performs some processing, and delivers a new, updated object that needs to be inserted in the collection for future processing (the original object is now obsolete). I'm guessing a Data.IntMap might involve copying over about 8 objects for each single object update, which while better than copying all 200, is still not perfect.
The construction of Haskell tree types is such that the tree is automatically storing pointers to the actual object: it's only if you ask that the value be unboxed explicitly that it gets stored with the node. So you're only paying for a few pointer copies, not a huge 20MB copy. Have you profiled your application and found this to be the bottleneck? Cheers, Edward