
On Thu, Jul 21, 2011 at 1:22 AM, Olexander Kozlov
Greetings Haskell community, I stuck with a space leak problem in simple task. I have a function described with a list of pairs: type Fn a b = [(a, b)] mapping values from a to b. And I have a composition operation which accepts two functions and returns theirs composition: compose :: Eq b => Fn a b -> Fn b c -> Fn a c Finding composition of 1,000,000 functions takes about 240M with the following implementation: type Fn a b = [(a, b)] lkup :: Eq a => a -> Fn a b -> b lkup x ((a, b) : abs) = if x /= a then lkup x abs else b compose :: Eq b => Fn a b -> Fn b c -> Fn a c compose ((a, b) : abs) bcs = (a, lkup b bcs) : compose abs bcs compose [] _ = [] fldl :: (a -> b -> a) -> a -> [b] -> a fldl f z (x:xs) = fldl f (f z x) xs fldl _ z [] = z fna = [(1, 2), (2, 3), (3, 1)] fni = [(1, 1), (2, 2), (3, 3)] test = fldl compose fni (replicate 1000000 fna) main = print test I build with: ghc --make -prof -auto-all -caf-all -fforce-recomp -rtsopts Test.hs Profile results: Test.exe +RTS -K50M -p -RTS total time = 0.34 secs (17 ticks @ 20 ms) total alloc = 240,003,820 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc compose Main 58.8 80.0 lkup Main 35.3 0.0 test Main 5.9 11.7 fldl Main 0.0 8.3 After reading 'High-Performance Haskell' by Johan Tibell I tried to made my compose function strict: compose :: Eq b => Fn a b -> Fn b c -> Fn a c compose ((a, b) : abs) bcs = let c = lkup b bcs in c `seq` (a, c) : compose abs bcs compose [] _ = [] and memory consuption reduced down to 180M. Good achievement but still too much! Profile results: Test.exe +RTS -K50M -p -RTS total time = 0.24 secs (12 ticks @ 20 ms) total alloc = 180,003,820 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc compose Main 83.3 73.3 lkup Main 8.3 0.0 fldl Main 8.3 11.1 test Main 0.0 15.6 I tried to make fldl strict as well: fldl :: (a -> b -> a) -> a -> [b] -> a fldl f z (x:xs) = let z' = f z x in z' `seq` fldl f z' xs fldl _ z [] = z and end up with 160M of total allocations. Profile results: Test.exe +RTS -K32M -p -RTS total time = 0.30 secs (15 ticks @ 20 ms) total alloc = 160,003,820 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc compose Main 46.7 82.5 lkup Main 40.0 0.0 test Main 6.7 17.5 fldl Main 6.7 0.0 Please help me spot where space leak occur, I see no other places for space leaks. I'll appreciate any help.
The space leak is because your fldl is not as strict as is needed. This version of the program needs about 1 meg on my computer: {-# LANGUAGE BangPatterns #-} type Fn a b = [(a, b)] lkup :: Eq a => a -> Fn a b -> b lkup x ((a, b) : abs) = if x /= a then lkup x abs else b compose :: Eq b => Fn a b -> Fn b c -> Fn a c compose ((a, b) : abs) bcs = (a, lkup b bcs) : compose abs bcs compose [] _ = [] -- fldl :: (a -> b -> a) -> a -> [b] -> a fldl f z (x:xs) = let z'@[(!z1,!z2),(!z3,!z4),(!z5,!z6)] = f z x in z' `seq` fldl f z' xs fldl _ z [] = z fna = [(1, 2), (2, 3), (3, 1)] fni = [(1, 1), (2, 2), (3, 3)] test = fldl compose fni (replicate 1000000 fna) main = print test $ Test +RTS -K50M -p -s -RTS [(1,2),(2,3),(3,1)] 351,125,156 bytes allocated in the heap 108,576 bytes copied during GC 26,888 bytes maximum residency (1 sample(s)) 18,168 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 669 collections, 0 parallel, 0.00s, 0.00s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed INIT time 0.00s ( 0.01s elapsed) MUT time 0.50s ( 0.50s elapsed) GC time 0.00s ( 0.00s elapsed) RP time 0.00s ( 0.00s elapsed) PROF time 0.00s ( 0.00s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 0.50s ( 0.51s elapsed) %GC time 0.0% (0.5% elapsed) Alloc rate 703,371,486 bytes per MUT second Productivity 100.0% of total user, 98.4% of total elapsed When checking for a space leak, I find the output of +RTS -s -RTS more helpful than the allocation stats in the .prof: $ cat Test.prof Thu Jul 21 05:12 2011 Time and Allocation Profiling Report (Final) Test.exe +RTS -K50M -p -s -RTS total time = 0.48 secs (24 ticks @ 20 ms) total alloc = 228,003,836 bytes (excludes profiling overheads) COST CENTRE MODULE %time %alloc fldl Main 37.5 3.5 compose Main 29.2 84.2 lkup Main 25.0 0.0 test Main 8.3 12.3 If I went by just the .prof I might think the program took 228MB of ram while it was running, but it only needed about 1MB because it kept reusing that space. I hope that helps, Jason