
On Thu, Jun 23, 2011 at 4:54 AM, Johan Tibell
On Thu, Jun 23, 2011 at 8:27 AM, Johan Tibell
wrote: Is 5 the optimal number of bits to slice off at a time (ie the best fanout)? It sounds like node copy cost on insert argues for a slightly narrower fanout. You'll be evacuating / scanning more words total, but new nodes may equate to less scanning overall (especially if this is running long enough to have some nodes get tenure).
I'm experimenting with this. 6 is far too much, making inserts 4-5x slower. 4 doesn't seem to improve things much (which is a bit counter-intuitive given that 6 made things so much work), but I need to experiment some more.
After some more testing I can improve the performance of insert by 30% by reducing the array size from 32 to 16. GC still dominates though.
Yes, I'd still expect that; internal node churn with fat nodes exhausts heap more quickly than usual. If large nodes become the norm, cranking up GC nursery size might be in order. It's great to see that fat nodes finally work well in GHC, though. When I first tried this in the 90's it was so bad (due to the extra level of indirection & write barrier problems) I gave up in frustration. -Jan
Johan