
On 2018-06-29 15:14, Vlatko Basic wrote:
Indeed, the bang solves the issue. I didn't try it because the docs say the value doesn't have to be forced for validateFunc (which is used for value), but obviously it is forced only to WHNF.
I think the issue has something to do with the two default implementations of rnf in the NFData class. Historically, `rnf a = seq a ()` was the default implementation (i.e. just WHNF), but more recently there is a Generic-based version that should automatically reduce to normal form. I don't know why the Generic version is either 1. not used at all, or 2. not working properly, but I suspect the lack of an instance Generic (HashMap k v), or possibly instance Generic1/2 MapData (if they are things?), may have something to do with it. I don't know why there is no instance, but maybe it would allow breaking internal data structure invariants?
Claude
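To make the WHNF versus normal-form distinction concrete, here is a minimal sketch (not taken from the thread) that compares a hand-written `rnf a = seq a ()` instance with the Generic-based default. The types `Shallow` and `Deep` and the helper `hitsBottom` are hypothetical names made up for illustration; the sketch assumes deepseq 1.4 or later, where the Generic default exists.

```haskell
{-# LANGUAGE DeriveGeneric #-}
import Control.DeepSeq (NFData (..), force)
import Control.Exception (ErrorCall, evaluate, try)
import GHC.Generics (Generic)

-- Hypothetical type whose instance mimics the old default: WHNF only.
data Shallow = Shallow [Int]

instance NFData Shallow where
  rnf a = a `seq` ()

-- Hypothetical type that relies on the Generic-based default rnf.
data Deep = Deep [Int] deriving Generic

instance NFData Deep

-- Returns True if fully forcing the value (according to its NFData
-- instance) runs into an 'error' call hidden inside it.
hitsBottom :: NFData a => a -> IO Bool
hitsBottom x = do
  r <- try (evaluate (force x))
  pure $ case r of
    Left e  -> const True (e :: ErrorCall)
    Right _ -> False
```

With these definitions, `hitsBottom (Shallow [1, error "boom"])` should report that nothing was forced, while `hitsBottom (Deep [1, error "boom"])` should reach the hidden error, because only the Generic instance walks into the list.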
Thanks. :-) Been wasting whole morning on this.
-------- Original Message --------
Subject: Re: [Haskell-cafe] Measuring memory usage
From: Claude Heiland-Allen
To: haskell-cafe@haskell.org
Date: 29/06/18 15:37

Hi Vlatko,
On 29/06/18 13:31, Vlatko Basic wrote:
Hello,
I've come to some strange results using the Weigh package.
It shows that HashMap inside 'data' is using much, much more memory.
This seems to be a strictness issue - you may be measuring the size of a thunk instead of the resulting evaluated data.
To confirm that this is the case, you can replace:
data MapData k v = MapData (HashMap k v) deriving Generic
with
data MapData k v = MapData !(HashMap k v) deriving Generic
Or replace:
value "MapData" (MapData $ mkHMList full)
with
value "MapData" (MapData $! mkHMList full)
Either of these changes gave me results like this:
Case           Allocated  GCs
HashMap          262,824    0
HashMap half      58,536    0
HashMap third     17,064    0
MapData          263,416    0
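The two workarounds above both force the wrapped value at the moment the constructor is applied. A minimal sketch of that effect, using only base, is below. `LazyBox`, `StrictBox` and `throwsAtWhnf` are hypothetical names for illustration and stand in for MapData with and without the bang.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Lazy field: the constructor can hold an unevaluated thunk.
data LazyBox = LazyBox Int

-- Strict field: the field is evaluated to WHNF when the box is built.
data StrictBox = StrictBox !Int

-- Returns True if evaluating the argument to WHNF triggers an 'error' call.
throwsAtWhnf :: a -> IO Bool
throwsAtWhnf x = do
  r <- try (evaluate x)
  pure $ case r of
    Left e  -> const True (e :: ErrorCall)
    Right _ -> False
```

Here `throwsAtWhnf (LazyBox (error "x"))` leaves the field untouched, whereas both `throwsAtWhnf (StrictBox (error "x"))` and `throwsAtWhnf (LazyBox $! error "x")` evaluate it. Note that a bang or `$!` only forces to WHNF; for a strict HashMap that appears to be enough in the measurements above, but that is an observation from this thread rather than a general guarantee.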
The real issue seems to be NFData not doing what you expect. I'm not sure what the generic NFData instance is supposed to do, as there is no instance Generic (HashMap k v), so maybe you need to write your own rnf if you don't like either of the above workarounds.
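As a sketch of the "write your own rnf" option, the instance below delegates to the NFData instance that unordered-containers provides for HashMap instead of relying on the Generic default. Whether this changes the numbers Weigh reports has not been checked here; `sample` is a hypothetical value added only so the snippet has something to force.

```haskell
import Control.DeepSeq (NFData (..))
import Data.HashMap.Strict (HashMap, fromList)

-- Same shape as the type in the thread, without the Generic deriving.
data MapData k v = MapData (HashMap k v)

-- Hand-written instance: fully force the wrapped map.
instance (NFData k, NFData v) => NFData (MapData k v) where
  rnf (MapData m) = rnf m

-- Hypothetical small value for trying the instance out in GHCi.
sample :: MapData Int String
sample = MapData (fromList (zip [1 .. 3] (repeat "some text")))
```

In GHCi, evaluating `rnf sample` should walk the whole map and return `()`.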
Claude
The strange thing is that I'm seeing too large memory usage in my app as well (several "MapData"-like fields in records), and I'm trying to figure out with 'weigh' what's keeping the memory.
I noticed that when I change the code to use HashMap directly (not inside 'data', that's the only change), the memory usage observed with top drops by ~60M, from 850M to 790M.
These are the test results for 10K, 5K and 3.3K items for "data MapData k v = MapData (HashMap k v)" (at the end is the full runnable example.)
Case           Allocated  GCs
HashMap          262,824    0
HashMap half      58,536    0
HashMap third     17,064    0
MapData        4,242,208    4
I tested by changing the order, disabling all but one etc., and the results were the same. Same 'weigh' behaviour with IntMap and Map.
So, if anyone knows and has some experience with such issues, my questions are:
1. Is 'weigh' package reliable/usable, at least to some extent? (the results do show diff between full, half and third)
2. How do you measure the memory consumption of your large data/records?
3. If the results are even approximately valid, what could cause such large discrepancies with 'data'?
4. Is there a way to see if some record has been freed from memory, GCed?
{-# LANGUAGE DeriveGeneric #-}
module Main where

import Prelude

import Control.DeepSeq (NFData)
import Data.HashMap.Strict (HashMap, fromList)
import GHC.Generics (Generic)
import Weigh (mainWith, value)

data MapData k v = MapData (HashMap k v) deriving Generic
instance (NFData k, NFData v) => NFData (MapData k v)

full, half, third :: Int
full  = 10000
half  = 5000
third = 3333

main :: IO ()
main = mainWith $ do
  value "HashMap"       (          mkHMList full)
  value "HashMap half"  (          mkHMList half)
  value "HashMap third" (          mkHMList third)
  value "MapData"       (MapData $ mkHMList full)

mkHMList :: Int -> HashMap Int String
mkHMList n = fromList . zip [1..n] $ replicate n "some text"