possible memory leak in uvector 0.1.0.3

Hi. In the "help optimizing memory usage for a program" I discovered some interesting things: 1) Using lazy ByteStrings to read files it not a good choice, since the garbage collector is not able to proper garbage cleanup. Running with -F1 RTS flag, however, keeps memory usage down. 2) UArr from uvector leaks memory. I'm rather sure about this. In fact the memory usage of my program is 815 MB, when: - I use lazy ByteString with -F1 flag or - if I use strict ByteString Using UArray and strict ByteString, memory usage is 660 MB. In total I have 17770 arrays, so there is a leak of about 9146 bytes per array. In another program, where I do a lot of array concatenations, memory leak is much more evident. Even using -F1 flag, memory usage grows too much. Is this a know bug? How can I verify if it is really a memory leak bug? pumpkin_ (on #haskell.it) suggested me to use a more recent version of uvector package, from http://patch-tag.com/publicrepos/pumpkin-uvector Using this version memory usage is, finally, 643 MB! (and execution if a bit faster, too). The other program, with a lot of array concatenations, still eats a lot of memory... Thanks Manlio Perillo

manlio_perillo:
Hi.
In the "help optimizing memory usage for a program" I discovered some interesting things:
2) UArr from uvector leaks memory. I'm rather sure about this.
Note it was just allocating more than was required, it wasn't "leaking" it in any sense (i.e. losing track of the memory).
Using this version memory usage is, finally, 643 MB! (And execution is a bit faster, too.)
Yep, known bug, and closed last month.
The other program, with a lot of array concatenations, still eats a lot of memory...
Concatenating arrays generally copies data. Which uses memory.
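To see the cost concretely: appendU produces a fresh array and copies both arguments, so left-folding it over n singleton arrays copies O(n^2) elements in total, against O(n) for building the array once at the end. A minimal sketch against the uvector API used in this thread (buildByAppend and buildAtOnce are made-up names):

import Data.Array.Vector
import Data.List (foldl1')

-- n-1 appends: the growing accumulator is recopied at every step,
-- so the total copying work is O(n^2)
buildByAppend :: Int -> UArr Int
buildByAppend n = foldl1' appendU (map singletonU [1 .. n])

-- one conversion at the end: each element is copied once, O(n)
buildAtOnce :: Int -> UArr Int
buildAtOnce n = toU [1 .. n]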

Don Stewart wrote:
manlio_perillo:
Hi.
In the "help optimizing memory usage for a program" I discovered some interesting things:
2) UArr from uvector leaks memory. I'm rather sure about this.
Note it was just allocating more than was required, it wasn't "leaking" it in any sense (i.e. losing track of the memory).
Using this version memory usage is, finally, 643 MB! (And execution is a bit faster, too.)
Yep, known bug, and closed last month.
Ok. I should have checked the bug tracker.
The other program, with a lot of array concatenations, still eats a lot of memory...
Concatenating arrays generally copies data. Which uses memory.
Of course, but why doesn't the garbage collector "release" this temporarily used memory? Note that I'm using the -F1 flag with the RTS.

Maybe it is a problem with IntMap, when there are a lot of keys?

Thanks
Manlio Perillo

Manlio Perillo wrote:
[...]
The other program, with a lot of array concatenations, still eats a lot of memory...
Concatenating arrays generally copies data. Which uses memory.
Of course, but why doesn't the garbage collector "release" this temporarily used memory? Note that I'm using the -F1 flag with the RTS.
Maybe it is a problem with IntMap, when there are a lot of keys?
It *is* a problem with IntMap. I have changed the program to not use any array concatenation, and it still requires a lot of memory.

Does there exist a data structure that is able to store something like 480189 keys with efficient memory usage?

Thanks
Manlio Perillo

On Tuesday, 3 March 2009 at 11:10, Manlio Perillo wrote:
Manlio Perillo wrote:
[...]
The other program, with a lot of array concatenations, still eats a lot of memory...
Concatenating arrays generally copies data. Which uses memory.
Of course, but why doesn't the garbage collector "release" this temporarily used memory? Note that I'm using the -F1 flag with the RTS.
Maybe it is a problem with IntMap, when there are a lot of keys?
It *is* a problem with IntMap. I have changed the program to not use any array concatenation, and it still requires a lot of memory.
Does there exist a data structure that is able to store something like 480189 keys with efficient memory usage?
An array? Might be complicated to write fast code, but the memory overhead should be small.
Thanks
Manlio Perillo
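One shape that array-based suggestion could take, as a sketch only (fromRatings is a hypothetical name; it assumes the ID range is known up front): a boxed array indexed by customer ID costs one pointer per possible slot, with no per-key tree nodes, and gaps in the range just hold empty arrays.

{-# LANGUAGE TypeOperators #-}
import Data.Array
import Data.Array.Vector
import Data.Word

type Rating = Word32 :*: Word8

-- Group ratings per customer ID in one pass; unused IDs in the
-- range simply map to empty arrays. (Per-bucket order comes out
-- reversed, which is fine for a sketch.)
fromRatings :: (Int, Int) -> [(Int, Rating)] -> Array Int (UArr Rating)
fromRatings bnds rs = fmap toU (accumArray (flip (:)) [] bnds rs)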

Peter Verswyvelen wrote:
Does there exist a data structure that is able to store something like 480189 keys with efficient memory usage?
What is the range of your keys - do they use the full 32 bits of Int - and what type are the values?
From the Netflix Prize documentation: CustomerIDs range from 1 to 2649429, with gaps. There are 480189 users.

The values are UArr (Word32 :*: Word8), from the uvector package.

Thanks
Manlio

Hello Manlio, Tuesday, March 3, 2009, 1:10:48 PM, you wrote:
It *is* a problem with IntMap. I have changed the program to not use any array concatenation, and it still requires a lot of memory.
It may be a problem with something else. In particular, check that you don't have lazy thunks stored instead of values.
Does there exist a data structure that is able to store something like 480189 keys with efficient memory usage?
Data.HashTable. At least I can calculate its memory usage - it should be less than 100 bytes per key even with a copying GC, plus the memory required for the values.

...but IntMap should be about the same. I still think that you have unforced thunks stored in the map.

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com
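For reference, a minimal sketch of that Data.HashTable route (the old base API; buildTable is a made-up name). Forcing each value to WHNF before insertion also addresses the unforced-thunk concern:

import qualified Data.HashTable as HT

-- a mutable hash table in IO, keyed by Int
buildTable :: [(Int, v)] -> IO (HT.HashTable Int v)
buildTable kvs = do
    t <- HT.new (==) HT.hashInt
    -- seq each value so no thunk is stored in the table
    mapM_ (\(k, v) -> v `seq` HT.insert t k v) kvs
    return t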

On Mar 3, 2009, at 11:10, Manlio Perillo wrote:
Manlio Perillo wrote:
[...]
The other program, with a lot of array concatenations, still eats a lot of memory...
Concatenating arrays generally copies data. Which uses memory.
Of course, but why doesn't the garbage collector "release" this temporarily used memory? Note that I'm using the -F1 flag with the RTS.
Maybe it is a problem with IntMap, when there are a lot of keys?
It *is* a problem with IntMap. I have changed the program to not use any array concatenation, and it still requires a lot of memory.
Does there exist a data structure that is able to store something like 480189 keys with efficient memory usage?
I ran into the same problem when I first organized the IntMap to use user IDs as keys... The problem is the huge number of keys, and the small UArrs as values. The overhead of both the IntMap and the UArr data types is just way too big with 480k different keys...

I never looked into it thoroughly, but if you look at the definition of IntMap, each key causes several words of overhead, along with one word or so for each UArr.

K.

--
Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.hoste@elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be
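A rough estimate of those "several words", assuming the Data.IntMap representation of the era (one Tip node per key, plus about one Bin node per key above it):

Tip ≈ 3 words (header, unpacked key, value pointer)
Bin ≈ 5 words (header, prefix, mask, two children)
≈ 8 words per key ≈ 32 bytes (32-bit) or 64 bytes (64-bit)
480189 keys * 32..64 bytes ≈ 15..30 MB for the map spine alone

That is in line with Bulat's "less than 100 bytes per key" figure, so the spine by itself does not reach gigabytes; the per-value UArr blocks (and any retained thunks) have to account for the bulk.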

On Tue, 2009-03-03 at 02:12 +0100, Manlio Perillo wrote:
Hi.
In the "help optimizing memory usage for a program" I discovered some interesting things:
1) Using lazy ByteStrings to read files it not a good choice, since the garbage collector is not able to proper garbage cleanup.
Running with -F1 RTS flag, however, keeps memory usage down.
It is certainly possible to have proper garbage cleanup. I can write programs using lazy ByteStrings that process many megabytes of data and yet run in 1Mb of heap space.

At first guess it sounds like you're holding onto too much, if not the whole stream then perhaps bits within each chunk. Each chunk read from the file is 64k big, and keeping any substring will force the whole chunk to be retained. If you're only keeping a fraction of each chunk you can use the ByteString.Lazy.copy function to make a deep copy and let the original 64k chunk get collected.

Duncan
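A sketch of that trick (extractField here is a hypothetical stand-in for whatever small slice you keep per line): without the copy, each retained slice pins its entire 64k chunk.

import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8

-- keep a small field from every line of a large file
fields :: FilePath -> IO [L.ByteString]
fields path = do
    contents <- L.readFile path
    -- L.copy gives each slice its own storage, so the original
    -- 64k chunks can be garbage collected
    return [L.copy (extractField line) | line <- L8.lines contents]
  where
    extractField = L.take 8  -- hypothetical: keep the first 8 bytes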

Duncan Coutts wrote:
On Tue, 2009-03-03 at 02:12 +0100, Manlio Perillo wrote:
Hi.
In the "help optimizing memory usage for a program" I discovered some interesting things:
1) Using lazy ByteStrings to read files it not a good choice, since the garbage collector is not able to proper garbage cleanup.
Running with -F1 RTS flag, however, keeps memory usage down.
It is certainly possible to have proper garbage cleanup. I can write programs using lazy ByteStrings that process many megabytes of data and yet run in 1Mb of heap space.
At first guess it sounds like you're holding onto too much, if not the whole stream then perhaps bits within each chunk.
It is possible. I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.
[...]
Thanks
Manlio Perillo

At first guess it sounds like you're holding onto too much, if not the whole stream then perhaps bits within each chunk.
It is possible.
I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.
Just to make sure (code fragments or, better, reduced examples would make it easier to see what the discussion is about): are you forcing the UArr to be constructed before putting it into the Map?

Claus

Claus Reinke wrote:
At first guess it sounds like you're holding onto too much, if not the whole stream then perhaps bits within each chunk.
It is possible.
I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.
Just to make sure (code fragments or, better, reduced examples would make it easier to see what the discussion is about): are you forcing the UArr to be constructed before putting it into the Map?
parse handle = do
    contents <- S.hGetContents handle
    let v = map singleton' $ ratings contents
    let m = foldl1' (unionWith appendU) v
    v `seq` return $! m
  where
    -- Build a Map with a single movie rating
    singleton' :: (Word32, Word8) -> MovieRatings
    singleton' (id, rate) =
        singleton (fromIntegral id) (singletonU $ pairS (id, rate))

This function gets called over each file, with:

r <- mapM parse' [1..17770]
let movieRatings = foldl1' (unionWith appendU) r

The `ratings` function parses each line of the file and returns a tuple. For each line of the file I build an IntMap, then merge them together; the IntMaps are then further merged in the main function.

NOTE that the memory usage is the same if I remove array concatenation. There are 100,000,000 ratings, so I create 100,000,000 arrays containing only one element. However, memory usage is 1 GB after just 800 files.

The data type is:

type Rating = Word32 :*: Word8
type MovieRatings = IntMap (UArr Rating) -- UArr from uvector

Code is here: http://haskell.mperillo.ath.cx/netflix-0.0.1.tar.gz
but it is an old version (where I used lazy ByteString).

Thanks
Manlio Perillo

On Tuesday, 3 March 2009 at 15:35, Manlio Perillo wrote:
Claus Reinke wrote:
At first guess it sounds like you're holding onto too much, if not the whole stream then perhaps bits within each chunk.
It is possible.
I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.
Just to make sure (code fragments or, better, reduced examples would make it easier to see what the discussion is about): are you forcing the UArr to be constructed before putting it into the Map?
parse handle = do
    contents <- S.hGetContents handle
    let v = map singleton' $ ratings contents
    let m = foldl1' (unionWith appendU) v
    v `seq` return $! m
The (v `seq` ) is completely useless. Maybe (size m) `seq` return m would help?
  where
    -- Build a Map with a single movie rating
    singleton' :: (Word32, Word8) -> MovieRatings
    singleton' (id, rate) =
        singleton (fromIntegral id) (singletonU $ pairS (id, rate))

Hello Daniel, Tuesday, March 3, 2009, 5:47:36 PM, you wrote:
let v = map singleton' $ ratings contents
let m = foldl1' (unionWith appendU) v
v `seq` return $! m
The (v `seq` ) is completely useless. Maybe (size m) `seq` return m would help?
I suggest:

return $! length v
return $! size m

(if size really scans the tree instead of using a stored value :)

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com

Daniel Fischer wrote:
On Tuesday, 3 March 2009 at 15:35, Manlio Perillo wrote:
At first guess it sounds like you're holding onto too much, if not the whole stream then perhaps bits within each chunk.

It is possible. I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.

Just to make sure (code fragments or, better, reduced examples would make it easier to see what the discussion is about): are you forcing the UArr to be constructed before putting it into the Map?

parse handle = do
    contents <- S.hGetContents handle
    let v = map singleton' $ ratings contents
    let m = foldl1' (unionWith appendU) v
    v `seq` return $! m
The (v `seq` ) is completely useless. Maybe (size m) `seq` return m would help?
In one of my tests I did:

rnf v `seq` rnf m `seq` return m

Memory usage was the same.

-- XXX these are missing from the uvector package
instance (NFData a, NFData b) => NFData (a :*: b) where
    -- NOTE: (:*:) is already strict
    rnf (a :*: b) = a `seq` b `seq` ()

instance NFData a => NFData (UArr a) where
    -- NOTE: UArr is already strict
    rnf array = array `seq` ()

Regards
Manlio

Hello Manlio, Tuesday, March 3, 2009, 5:35:33 PM, you wrote:
There are 100,000,000 ratings, so I create 100,000,000 arrays containing only one element.
Every array needs ~30 bytes - it's the minimal memory block GHC can allocate for variable-sized objects. Multiply this by 3 to account for copying GC behavior.

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com
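Taking those figures at face value for the numbers in this thread:

100,000,000 singleton arrays * ~30 bytes ≈ 3 GB allocated for the singletons alone
* 3 for copying-GC behavior ≈ roughly 9 GB of heap traffic over the run

The singletons are short-lived, but an allocation volume of that order is consistent with the growth being observed.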

Bulat Ziganshin wrote:
Hello Manlio,
Tuesday, March 3, 2009, 5:35:33 PM, you wrote:
There are 100,000,000 ratings, so I create 100,000,000 arrays containing only one element.
Every array needs ~30 bytes - it's the minimal memory block GHC can allocate for variable-sized objects. Multiply this by 3 to account for copying GC behavior.
Ok, this explains the memory usage; thanks.

IMHO, this information should go on the wiki; it may be insignificant for normal applications, but when one starts to deal with huge amounts of data, 10 bytes per item makes an important difference.

Manlio Perillo

I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.
Just to make sure (code fragments or, better, reduced examples would make it easier to see what the discussion is about): are you forcing the UArr to be constructed before putting it into the Map?
parse handle = do
    contents <- S.hGetContents handle
    let v = map singleton' $ ratings contents
    let m = foldl1' (unionWith appendU) v
    v `seq` return $! m
  where
    -- Build a Map with a single movie rating
    singleton' :: (Word32, Word8) -> MovieRatings
    singleton' (id, rate) =
        singleton (fromIntegral id) (singletonU $ pairS (id, rate))
That helps to make things clearer, I think. One issue is the nature of Maps (strict in keys, non-strict in values).

- neither singleton nor unionWith are strict in the Map values, so nothing here forces the evaluation of rate or construction of UArr:

Prelude Data.IntMap> (unionWith (++) (singleton 1 undefined) (singleton 2 undefined)) `seq` ()
()

- singletonU is strict, but that only means that it will evaluate its parameter if it is evaluated itself (which it isn't, because singleton isn't strict)

- seq on a list only forces the first node of the list ((:), [], _|_), so (v `seq`) isn't likely to help much. Also, you probably do not want to force the whole list of singletons before building the Map; you want the singletons to be constructed and consumed incrementally.

- forcing a Map doesn't force any of the values, nor does it force more than the top-level node of whatever the internal Map representation is, so (return $! m) isn't much help, either (by the nature of unionWith and foldl1', it'll force all keys before it can say anything much about the Map, but the values remain untouched, buried further under unevaluated (++)s)
type Rating = Word32 :*: Word8
type MovieRatings = IntMap (UArr Rating) -- UArr from uvector
A standard trick to keep Map values evaluated by construction is to make the availability of keys dependent on their values, eg (singleton key) $! value. That won't help with unionWith and the appendUs, but it should allow the source string references to be dropped early, as the singletons are constructed.

Hth,
Claus
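Applied to the singleton' from earlier in the thread, that trick would look like this (a sketch, assuming the thread's types):

{-# LANGUAGE TypeOperators #-}
import Data.Array.Vector
import Data.IntMap (IntMap, singleton)
import Data.Word

type Rating = Word32 :*: Word8
type MovieRatings = IntMap (UArr Rating)

-- ($!) makes the key's availability depend on the value, so the
-- UArr is built (and its source string dropped) at construction
singleton' :: (Word32, Word8) -> MovieRatings
singleton' (id, rate) =
    singleton (fromIntegral id) $! singletonU (pairS (id, rate))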

Claus Reinke wrote:
I split the string into lines, then map some functions over each line to parse the data, and finally call toU to convert to a UArr.
Just to make sure (code fragments or, better, reduced examples would make it easier to see what the discussion is about): are you forcing the UArr to be constructed before putting it into the Map?
parse handle = do
    contents <- S.hGetContents handle
    let v = map singleton' $ ratings contents
    let m = foldl1' (unionWith appendU) v
    v `seq` return $! m
  where
    -- Build a Map with a single movie rating
    singleton' :: (Word32, Word8) -> MovieRatings
    singleton' (id, rate) =
        singleton (fromIntegral id) (singletonU $ pairS (id, rate))
That helps to make things clearer, I think. One issue is the nature of Maps (strict in keys, non-strict in values).
- neither singleton nor unionWith are strict in the Map values, so nothing here forces the evaluation of rate or construction of UArr
But, as I have written, in one of my tests I also tried rnf to force evaluation:

rnf v `seq` rnf m `seq` return m

where:

instance NFData a => NFData (Data.IntMap.IntMap a) where
    rnf = rnf . Data.IntMap.toList

Isn't this sufficient?
[...]
- seq on a list only forces the first node of the list ((:), [], _|_), so (v `seq`) isn't likely to help much. Also, you probably do not want to force the whole list of singletons before building the Map; you want the singletons to be constructed and consumed incrementally.
Right, thanks.
type Rating = Word32 :*: Word8
type MovieRatings = IntMap (UArr Rating) -- UArr from uvector
A standard trick to keep Map values evaluated by construction is to make the availability of keys dependent on their values, eg (singleton key) $! value. That won't help with unionWith and the appendUs, but it should allow the source string references to be dropped early, as the singletons are constructed.
Tried; but even using union instead of unionWith, the memory grows as fast as before.

Just to check whether the culprit is IntMap, wouldn't it be possible to write a test program that builds a big IntMap (UArr Int)? Right now I don't have the time.

Manlio
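Such a test is quick to sketch (run with +RTS -s to see residency; vary the payload size to separate the IntMap spine cost from the UArr cost):

import qualified Data.IntMap as IM
import Data.Array.Vector
import Data.List (foldl')

-- an IntMap with 480189 keys, each mapped to a ten-element UArr,
-- with every value forced before it goes into the map
main :: IO ()
main = print (IM.size m)
  where
    m = foldl' ins IM.empty [1 .. 480189]
    ins acc k = (IM.insert k $! toU (replicate 10 k)) acc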

That helps to make things clearer, I think. One issue is the nature of Maps (strict in keys, non-strict in values).
- neither singleton nor unionWith are strict in the Map values, so nothing here forces the evaluation of rate or construction of UArr
But, as I have written, in one of my tests I also tried rnf to force evaluation:

rnf v `seq` rnf m `seq` return m

Isn't this sufficient?
It will force the Map that results from the repeated unioning, but does not ensure that this is done in an efficient way.
A standard trick to keep Map values evaluated by construction is to make the availability of keys dependent on their values, eg (singleton key) $! value. That won't help with unionWith and the appendUs, but it should allow the source string references to be dropped early, as the singletons are constructed.
Tried; but even using union instead of unionWith, the memory grows as fast as before.
Strange. I built myself a small wrapper to make your code fragment compilable, and just replacing (unionWith appendU) with (union) makes a drastic difference - as it should.

It is rather annoying that Data.IntMap doesn't provide a strict form of unionWith or insertWith (Data.Map does at least provide insertWith'). But we can define our own, at the cost of an extra lookup. We can then foldl' that insertWith' directly over the ratings list, bypassing the non-strict parts of the Data.IntMap API (see code below).

Claus

(who still thinks that all Maps should be parameterized over their key-value pair type constructor, so that the default non-strict Maps would result from using non-strict pairs

    type IntMap = IntMapP (,)

while the often desired element-strict Maps would result from using strict pairs, with no other change in API

    type IntMapStrict = IntMapP (:*:)

)

-------------------------------------------------------
{-# LANGUAGE TypeOperators #-}
import qualified Data.ByteString.Lazy as L
import Data.Array.Vector
import qualified Data.IntMap as IM
import Data.List
import Data.Word
import qualified Data.ByteString.Lazy.Char8 as L8
import Data.Maybe
import System.IO

-- don't use this for real, test wrapper only
ratings :: L.ByteString -> [(Word32,Word8)]
ratings = map (\[i,r] -> (fromIntegral $ fst $ fromJust $ L8.readInt i
                         ,fromIntegral $ fst $ fromJust $ L8.readInt r))
        . map L8.words . L8.lines

parse handle = do
    contents <- L.hGetContents handle
    let v = map singleton' $ ratings contents
    let m = foldl' (\m (kw,v) -> insertWith' appendU (fromIntegral kw, v) m) IM.empty v
    -- let m = foldl1' (IM.unionWith appendU) v
    -- let m = foldl1' (IM.union) v
    return $! m
  where
    -- Build a Map with a single movie rating
    singleton' :: (Word32, Word8) -> (Int, UArr Rating)
    singleton' (id, rate) = ((fromIntegral $ id), (singletonU $ pairS (id, rate)))
    -- (IM.singleton (fromIntegral $ id)) $ (singletonU $ pairS (id, rate))

insertWith' op (k,v) m = maybe (IM.insert k v m)
                               (\old -> ((IM.insert k) $! (v `op` old)) m)
                               (IM.lookup k m)

type Rating = Word32 :*: Word8
type MovieRatings = IM.IntMap (UArr Rating) -- UArr from uvector

-- more test wrapper, some trivial input data
generate = withFile "in.data" WriteMode $ \h ->
    mapM_ (\(i,r) -> hPutStrLn h $ show i ++ " " ++ show r) $
        take 1000000 $ cycle [(i,i) | i <- [0..100]]

main = withFile "in.data" ReadMode parse >>= print
participants (8)
- Bulat Ziganshin
- Claus Reinke
- Daniel Fischer
- Don Stewart
- Duncan Coutts
- Kenneth Hoste
- Manlio Perillo
- Peter Verswyvelen