Accessing new heap costs more than old?

Hello folks,

I wonder if anybody can shed any light on some strange behaviour I've been seeing in my benchmarks. The story is: while measuring execution times of various AVL routines on random data, I found that insertion was taking far longer than deletion (over twice as long). This is surprising, because if anything deletion is the more complex operation.

After much struggling and experimentation with different options, inlining and so on, I failed to fix this, so I tried the same tests with Data.IntMap and got similarly unexpected results. But I've now found that the root cause seems to be a subtle difference between the two tests. The insertion test was cumulative, so each new insertion was performed on the tree resulting from the previous insertion, whereas each deletion was always performed on the same original tree. If I modify the insertion test to work the same way as the deletion test then, sure enough, insertion is faster than deletion (as I would expect). The same is true for Data.IntMap. (The insertion speeds for the two modes differ by a factor of 2.8..2.9 for both Data.Tree.AVL and Data.IntMap.)

But I can't think of a plausible explanation for this. The overall heap burn rate should be about the same in each case, as should the overall amount of live (non-garbage) heap records. I thought it might be a cache effect, but if anything I would expect caching to favour the cumulative mode (new allocations should displace old ones from the cache, AFAIK). Also, profiling shows that the cumulative case performs slightly fewer allocations (as I'd expect, because it starts with an empty tree).

Anyway, I just thought I'd mention it in the hope that there might be something that can be done about it. The cumulative case seems more typical of real-world code, so taking a factor-of-3 or so performance hit is undesirable IMHO, but it may well be unavoidable :-(

Regards
--
Adrian Hey
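P.S. To make the difference in test shape concrete, here's roughly what the two loops look like, sketched against Data.IntMap (the key list and sizes are made up for illustration, and forcing with seq is only a rough stand-in for what the real benchmark does):

    module Main where

    import qualified Data.IntMap as IM
    import Data.List (foldl')

    -- Shape 1 (how I was timing insertion): cumulative. Each insert is
    -- applied to the map produced by the previous insert, so the newly
    -- allocated nodes stay live and the structure keeps growing.
    cumulativeInserts :: IM.IntMap () -> [Int] -> IM.IntMap ()
    cumulativeInserts = foldl' (\t k -> IM.insert k () t)

    -- Shape 2 (how I was timing deletion): non-cumulative. Every delete is
    -- applied to the same original map and the result is discarded (seq
    -- only forces to WHNF, but that's enough to show the shape).
    independentDeletes :: IM.IntMap () -> [Int] -> ()
    independentDeletes t0 = foldr (\k r -> IM.delete k t0 `seq` r) ()

    main :: IO ()
    main = do
      let keys = [1 .. 100000 :: Int]              -- made-up size
          t0   = IM.fromList [ (k, ()) | k <- keys ]
      print (IM.size (cumulativeInserts IM.empty keys))
      print (independentDeletes t0 keys)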

On Saturday 03 Sep 2005 6:14 pm, Adrian Hey wrote:
But I can't think of a plausible explanation for this. The overall heap burn rate should be about the same in each case, as should the overall amount of live (non-garbage) heap records.
Hmm... A little more thought leads me to suspect that this is an artifact of generational garbage collection, which I guess is much faster if the youngest generation is entirely garbage. Is that reasonable?

Regards
--
Adrian Hey
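P.S. I suppose one way to test this theory would be to run both versions of the benchmark with GC statistics turned on and compare the figures, e.g. something like

    ./bench +RTS -sstderr

and look at the "bytes copied during GC" for each run. If generational collection is the culprit, the cumulative insertion run should show far more copying, and making the allocation area bigger (e.g. +RTS -A64m -RTS) should narrow the gap. (Flag spellings from memory, so check the GHC users guide.)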

On Sep 5, 2005, at 3:11 AM, Adrian Hey wrote:
On Saturday 03 Sep 2005 6:14 pm, Adrian Hey wrote:
But I can't think of a plausible explanation for this. The overall heap burn rate should be about the same in each case, as should the overall amount of live (non-garbage) heap records.
Hmm.. A little more thought leads me to suspect that this is an artifact of generational garbage collection, which I guess is much faster if the youngest generation is entirely garbage. Is that reasonable?
This agrees with my theory. Assume that your original tree with n elements ends up in the old generation used by the GC. It will only occasionally be traversed and copied.

When you perform d deletions on that tree, you'll allocate O(d lg n) new nodes, but since you're retaining only the last tree, you will only have O(lg n) new nodes to be scanned and copied by the garbage collector.

By contrast, if you make a sequence of i insertions to grow the tree by i elements, you'll again allocate O(i lg n) new nodes, but you'll retain a good many of them, at least Omega(i). When i > lg n, the GC clearly does more work for insertion than for deletion. If you'd made all the insertions to the original tree, you would again retain only O(lg n) new nodes, and the relative speeds of insertion vs deletion would come into play.

-Jan-Willem Maessen
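P.S. To put some made-up numbers on it: take n = 2^20 (so lg n is about 20) and i = d = 100,000. The deletion test allocates on the order of 100,000 * 20 = 2,000,000 new nodes but keeps only about 20 of them reachable at any moment, so almost everything in the young generation is garbage when it is collected. The cumulative insertion test allocates a comparable amount but keeps at least 100,000 of those new nodes reachable, and the generational collector has to scan and copy all of them out of the young generation (possibly more than once). That difference alone could plausibly account for a factor of about 3.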

On Monday 05 Sep 2005 5:42 pm, Jan-Willem Maessen wrote:
This agrees with my theory.
Thanks, I'm sure this must be the explanation. I guess the moral of the story is that for comparative benchmarks you really do need to make sure you're comparing like with like. It seems that very small differences in test method can have a significant impact on running time.

Regards
--
Adrian Hey