
On 24 May 2004 12:03, Adrian Hey wrote:
On Thursday 20 May 2004 5:04 pm, Simon Marlow wrote:
And after eliminating some bogosity in the benchmark, I now get:
Insert Lookup Delete AVL 0.87 0.22 0.87 FiniteMap 1.07 0.17 1.11 DData.Map 1.08 0.16 1.02 DData.IntMap 0.65 0.16 0.61
So I declare that on random data, DData.Map has essentially the same performance as FiniteMap. IntMap is faster for insertion/deletion as expected. AVL is better than FiniteMap for insertion/deletion but slower for lookups.
Could you explain the bogosity you eliminated?
I did a full GC before running each part of the test to keep random GC effects to a minimum, and eliminated a couple of cases of badly optimised code in the test program.
The reason I ask is that those second results for lookup for AVL trees seem pretty reasonable to me, considering the AVL library is general purpose and not specialised for any kind of map. Currently the AVL implementation will have to read 3 heap records per tree node visited, vs. 2 for FiniteMap and DData.Map.
For the balanced tree implementations, I wouldn't expect to see any significant difference in lookup times (given the same degree of specialisation) and what differences there may be would most likely be due to differences in how well they managed to balance trees. (But I guess if random data is to be our benchmark, even simple unbalanced trees would probably look quite reasonable :-)
Of course, this isn't testing how well the balancing works. It is comparing raw performance of several balanced tree implementations, so measuring the overhead of balancing, if you like.
DData.IntMap is rather different from the other three, so I'm not sure exactly how well I would expect it to perform, but if you assume lookup time is proportional to the No. of heap records examined (seems to be reasonable comparing AVL and FiniteMap/DData.Map), then you'd expect a lookup time of about 0.08 for each if they were specialised for Int maps.
Yes, IntMap should get faster lookup. I looked at the code and managed to make a couple of improvements: adding a 'seq k $' at the beginning of lookup helps, and I fixed a small performance problem with the Word library, but interestingly these don't help the results. I still get 0.16s for the lookup test, so perhaps it is dominated by something else. Cheers, Simon