
On Monday 06 April 2009 4:10:43 am Bulat Ziganshin wrote:
> one way to solve this problem is to make one `modified` bit per 256 elements rather than for the entire array, so the GC will have to scan only the modified chunks
For reference, I constructed a benchmark that takes advantage of GHC's tagging of whole arrays to approximate this: http://hpaste.org/fastcgi/hpaste.fcgi/view?id=3316

Since each mutable array has its own dirty bit, a type that stores an array of sub-arrays adding up to the desired size behaves much like a single array with one dirty bit per chunk the size of a sub-array. The test then fills a 10-million-element array.

However, something about the benchmark makes it perform poorly for both small and large chunk sizes. The +RTS -sstderr statistics report that lots of copying occurs at small chunk sizes, and I haven't yet figured out why that is.

You can, however, see that marking dirty chunks in this fashion would be profitable. The un-chunked array takes around a minute here, while with chunks of 10,000 elements (which seems to be about the optimal value given the copying tradeoff above) it takes about 6 seconds, and that's still with 60+% GC time.
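The shape of the thing is roughly as follows. This is a minimal sketch of the technique rather than the code in the paste; the names (Chunked, newChunked, writeChunked) and the chunk size are just for illustration:

import Control.Monad (replicateM, forM_)
import Data.Array.IO

-- A small outer array of boxed sub-arrays.  Each sub-array is its own
-- heap object with its own dirty bit, so a write marks only one chunk
-- (plus the small outer array) for rescanning, not the whole thing.
data Chunked e = Chunked !Int (IOArray Int (IOArray Int e))

newChunked :: Int -> Int -> e -> IO (Chunked e)
newChunked csize total e = do
  let n = (total + csize - 1) `div` csize
  subs  <- replicateM n (newArray (0, csize - 1) e)
  outer <- newListArray (0, n - 1) subs
  return (Chunked csize outer)

writeChunked :: Chunked e -> Int -> e -> IO ()
writeChunked (Chunked csize outer) i x = do
  sub <- readArray outer (i `div` csize)
  writeArray sub (i `mod` csize) x

-- Fill 10 million boxed elements, as in the benchmark.
main :: IO ()
main = do
  arr <- newChunked 10000 10000000 (0 :: Integer)
  forM_ [0 .. 9999999] (\i -> writeChunked arr i (fromIntegral i))

With this layout a write dirties only the sub-array it lands in, so the GC rescans one 10,000-element chunk instead of all 10 million slots.

-- Dan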