Re: [Haskell-cafe] help optimizing memory usage for a program

2 Mar 2009

      On Mar 2, 2009, at 19:13 , Manlio Perillo wrote:
...
Manlio Perillo ha scritto:
...
[...]
...
moreover, you may set up"growing factor". with a g.f. of
1.5, for example, memory will be collected once heap will become  
1.5x
larger than real memory usage after last GC. this effectively
guarantees that memory overhead will never be over this factor
Thanks.
This seems to be effective (but it also reduce performances).
3) With -F1 option
[...] I have to parse the whole data set, to check if memory usage  
is good.
Ok, done:
real	49m7.369s
 user	45m21.642s
 sys	0m18.893s
814 MB used
This is better memory usage, respect to:
real	7m17.853s
 user	3m38.506s
 sys	0m7.612s
1586 MB used
However, Kenneth Hoste reported (http://boegel.kejo.be/):
26 minutes, with 700 MB used.
Maybe he was using the latest GHC version.
I would also like to check how performances are with other  
functional languages.
The 26m/700MB I mentioned on my blog was on my ancient PowerBook G4  
(1.5GHz PowerPC G4, 1.25G).

I redid the same experiment on our iMac (Core2 Duo, 2.0 GHz, 3.0G),  
i.e.:
- read in all the data
- count the number of keys in the IntMap (which should be 17,770, i.e.  
the number of movies)
- compute the mean overall movie rating (which should be 3.6033)

That task was done in 3m42s, using just 632M of memory, using the  
following command:

./netflix ./training_set/ +RTS -A128M -s

The -A option makes sure GC isn't cleaning up stuff too frequently,  
while -s just reports some statistics.

The way in which I'm reading in the data is somewhat different from  
yours.
I construct the IntMap from the ground up, i.e. starting with an empty  
IntMap and using foldM,
together with a function readMovie with the following type:

readMovie :: IntMap (UArray Int Int, UArray Int Word8) -> FilePath ->  
IO (IntMap (UArray Int Int, UArray Int Word8))

In readMovie, I'm using the 'insert' function provided by the IntMap  
module, which justs insert a new key-value pair
in the existing IntMap.

Your approach is very different: you create 17,770 IntMaps with a  
single key/value pair in them,
and then use union to combine them all.

I profiled your approach on the same iMac (using the same +RTS options),
and it needed 4m33s to run, using 948M of memory.

I think my approach is turning out better because I'm:

- building up the IntMap using 'empty' and 'insert', instead of  
combining 17,770 'singleton' IntMaps
   (which probably results better GC behavior)
- using UArray instead of Urr (although I don't know if that actually  
makes a difference here)

I hope this helps you with figuring out what the bottleneck is on your  
side.
It took me several days to come up with this approach, with the help  
from various Haskellers at IRC,
so I'm surely no expert...

I've thrown my current code online at http://boegel.kejo.be/files/Netflix_read-and-parse_24-02-2009.hs 
  ,
let me know if it's helpful in any way...

Also, I was indeed using GHC 6.10.1, although I'm unsure to what  
extent that matter.

greetings,

Kenneth

-- 

Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.hoste@elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be