
Am Samstag, den 17.09.2011, 23:37 +0200 schrieb Milan Straka:
Great job!
thanks for the positive feedback.
Could you please also generate comparison of and IntSet and DenseIntSet, when the set is sparse (eg [1, 65 .. N])? I would like to see the time complexities then. Also please include toList. If the results are fine, I currently see no reason why not to push the code in.
I have attached some benchmarking result.
Thank you very much. I am quite surprised by the slowdown in the intersection 100 case -- I would expect the times to be nearly identical for the DenseIntSet and IntSet, as seen in the union case. Any ideas?
While results are good for member, great for insert, intersection and union, toList is slower for sparse maps. toList is basically a foldr, so I think the culprit is this function:
foldrBits :: Int -> (Int -> a -> a) -> a -> Word -> a foldrBits shift f x = go shift where STRICT_1_OF_2(go) go bi 0 = x go bi n | n `testBit` 0 = f bi (go (succ bi) (n `shiftRL` 1)) | otherwise = go (succ bi) (n `shiftRL` 1)
I’ll try to optimize this function individually now, any suggestions are welcome.
We have to be sure there is no unneeded slowdown because of inefficient Core code -- that all numbers are unpacked, succ is unpacked to primop, a primop is used for the shift, and so on. Also, you can use some additional bit tricks -- for example the following heuristic, which skips last 6 bits if they are all zeroes:
go bi 0 = x {- heuristic -} go bi n | n .&. 0x3f == 0 = go (bi + 6) (n `shiftRL` 6) go bi n | n `testBit` 0 = f bi (go (succ bi) (n `shiftRL` 1)) | otherwise = go (succ bi) (n `shiftRL` 1)
Of course, the 6 used can be changed to some other number - some measurement should be made. I expected a square root of the number of bits to behave reasonably well and 6 seemed to be a good compromise for 32 and 64. Cheers, Milan