
claus.reinke:
I have a standard Data.Map.Map as the base structure for one of my macid data tables (jobs), but I noticed something that is probably causing problems for me. Even a simple 20 million record with int/int key values causes an out of memory error for me in ghci, Int keys, Int values eh? Does using IntMap help? Interesting. Map Int Int, IntMap Int and [(Int, Int)] use pretty much the same amount of memory, assuming that no keys or values are shared:
I haven't followed the thread, but one thing that keeps tripping me up wrt memory use is that Maps aren't strict in their values by default.
Worse, if that is not what you want for your application, working around it can be rather awkward (Data.Map at least provides insertWith', but there's no unionWith', and Data.IntMap doesn't even have insertWith').
Ideally, I'd just like to indicate the strictness in the types (hint to ghc hackers: 'Data.Map.Map Int !Int' and '[!a]' would really be useful!-), but as that isn't supported, separate operations (with seq inserted manually at suitable places) seem necessary (hint to container library maintainers: please provide full set of strict operation variants!-).
If this is (or contributes to) the problem, it should show up in heap profiles as an upward ramp and could be fixed by defining and using locally augmented versions of the Map operations.
I'd like them strict and specialised, So that: data IntMap a = Nil | Tip {-# UNPACK #-} !Key a | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) applied as so, type T = IntMap {-# UNPACK #-} !Int would be equivalent to data IntMapT = Nil | Tip {-# UNPACK #-} !Key {-# UNPACK #-} !Int | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) where we've avoided an indirection in the Tip nodes. Less space, faster access. In general, being able to specialise polymorphic structures so they look like unpacked monomorphic ones would be awesome. (!Int, !Bool) -> (,) {-# UNPACK #-}!Int {-# UNPACK #-}!Bool -- Don