
On Thu, Jul 1, 2010 at 12:03 AM, Milan Straka
Hi,
Milan Straka wrote:
I am working on improving containers performance and have written a draft paper about it http://fox.ucw.cz/papers/containers/containers.pdf
I am a bit surprised about the types chosen for HashSet and HashMap (section 5, p.9). Usually, hash-based containers are optimized for the case that the hash-buckets remain small. For that case, it's hard to imagine that the given types wouldn't be slower than the more straightforward types:
data HashSet elem = HS (IntMap [elem]) data HashMap key val = HM (IntMap [(key, val)])
Others have suggested further improvements by using strict (unboxed?) tuples, a variant on lists that is strict and/or non-empty.
I agree that my choice was not good enough. Johan Tibell also commented on this.
After suggestions by others I am thinking about data Some elem = Single {-# UNPACK #-} !elem | More (Set elem) or data Some elem = Single {-# UNPACK #-} !elem | More {-# UNPACK #-} !elem (Some elem)
Unfortunately unpacking doesn't work for polymorphic fields (the new warning for ineffective unpack pragmas in GHC head should warn about this). Given this you might as well use a standard list. Johan