Profiling introduces a space leak where there was none before?

In Ch 25 of Real World Haskell, the authors introduce some naive code for finding the average of a big list; it has a space leak, and they present several solutions. Below are two of the solutions that successfully eliminate the space leak (though, the first one -- the one that uses foldl'rnf -- is quite a bit faster). However, if compiled with profiling, the first one (using foldl'rnf) NOW has a leak. The second solution (foldl') does not have a leak even when profiling is enabled. I have used this foldl'rnf function in my own code, as it is the only solution I have found for a space leak in my own code. But, since it leaks when profiled, it is making analysis difficult. Is this a feature, bug, or user error? If a known issue, is there a workaround? The code and some documenting output follows. thanks, Travis ------------------------------------ {-# LANGUAGE BangPatterns #-} import System.Environment import Text.Printf import Control.Parallel.Strategies import Control.DeepSeq import Data.List (foldl') main = do [d] <- map read `fmap` getArgs printf "%f\n" (mean [1..d]) foldl'rnf :: NFData a => (a -> b -> a) -> a -> [b] -> a foldl'rnf f z xs = lgo z xs where lgo z [] = z lgo z (x:xs) = lgo z' xs where z' = f z x `using` rdeepseq -- first mean fn aka foldl'rnf mean :: [Double] -> Double mean xs = s / fromIntegral n where (n, s) = foldl'rnf k (0, 0) xs k (n, s) x = (n+1, s+x) :: (Int, Double) -- second mean fn aka foldl' -- mean :: [Double] -> Double -- mean xs = s / fromIntegral n -- where -- (n, s) = foldl' k (0, 0) xs -- k (!n, !s) x = (n+1, s+x) ------------------------------------------ [NO PROFILING, NO SPACE LEAK] C:\Documents and Settings\Travis\My Documents\Haskell Code>ghc --make temp5 -O2 -fasm [1 of 1] Compiling Main ( temp5.hs, temp5.o ) Linking temp5.exe ... C:\Documents and Settings\Travis\My Documents\Haskell Code>temp5 1e7 +RTS -sstderr temp5 1e7 +RTS -sstderr 5000000.5 1,170,230,652 bytes allocated in the heap 128,876 bytes copied during GC 3,372 bytes maximum residency (1 sample(s)) 13,012 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 2232 collections, 0 parallel, 0.05s, 0.05s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed INIT time 0.02s ( 0.03s elapsed) MUT time 1.52s ( 1.55s elapsed) GC time 0.05s ( 0.05s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 1.58s ( 1.63s elapsed) %GC time 3.0% (2.9% elapsed) Alloc rate 764,232,262 bytes per MUT second Productivity 96.0% of total user, 93.3% of total elapsed C:\Documents and Settings\Travis\My Documents\Haskell Code>temp5 1e8 +RTS -sstderr temp5 1e8 +RTS -sstderr 50000000.5 11,702,079,228 bytes allocated in the heap 1,253,872 bytes copied during GC 3,372 bytes maximum residency (1 sample(s)) 13,012 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 22321 collections, 0 parallel, 0.38s, 0.39s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed INIT time 0.02s ( 0.00s elapsed) MUT time 15.47s ( 15.72s elapsed) GC time 0.38s ( 0.39s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 15.86s ( 16.11s elapsed) %GC time 2.4% (2.4% elapsed) Alloc rate 755,734,682 bytes per MUT second Productivity 97.5% of total user, 96.0% of total elapsed [NOW TURN ON PROFILING, GET SPACE LEAK] C:\Documents and Settings\Travis\My Documents\Haskell Code>ghc --make temp5 -O2 -fasm -prof -auto-all [1 of 1] Compiling Main ( temp5.hs, temp5.o ) Linking temp5.exe ... C:\Documents and Settings\Travis\My Documents\Haskell Code>temp5 1e6 +RTS -sstderr -p -K128M temp5 1e6 +RTS -sstderr -p -K128M 500000.5 395,774,976 bytes allocated in the heap 238,684,620 bytes copied during GC 102,906,760 bytes maximum residency (7 sample(s)) 66,283,900 bytes maximum slop 179 MB total memory in use (4 MB lost due to fragmentation) Generation 0: 493 collections, 0 parallel, 4.83s, 4.84s elapsed Generation 1: 7 collections, 0 parallel, 0.23s, 0.30s elapsed INIT time 0.02s ( 0.03s elapsed) MUT time 0.81s ( 0.91s elapsed) GC time 5.06s ( 5.14s elapsed) RP time 0.00s ( 0.00s elapsed) PROF time 0.00s ( 0.00s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 5.89s ( 6.08s elapsed) %GC time 85.9% (84.6% elapsed) Alloc rate 477,916,952 bytes per MUT second Productivity 13.8% of total user, 13.4% of total elapsed

I suggest that you send this message to the haskell-cafe@haskell.org mailing list. Not that it isn't welcome here, but answering it requires fairly specialised knowledge (I at least have no idea what the answer is) and you'll probably get better help there. -Brent On Thu, Aug 12, 2010 at 06:20:49PM -0700, Travis Erdman wrote:
In Ch 25 of Real World Haskell, the authors introduce some naive code for finding the average of a big list; it has a space leak, and they present several solutions.
Below are two of the solutions that successfully eliminate the space leak (though, the first one -- the one that uses foldl'rnf -- is quite a bit faster). However, if compiled with profiling, the first one (using foldl'rnf) NOW has a leak. The second solution (foldl') does not have a leak even when profiling is enabled.
I have used this foldl'rnf function in my own code, as it is the only solution I have found for a space leak in my own code. But, since it leaks when profiled, it is making analysis difficult.
Is this a feature, bug, or user error? If a known issue, is there a workaround? The code and some documenting output follows.
thanks,
Travis ------------------------------------
{-# LANGUAGE BangPatterns #-}
import System.Environment import Text.Printf import Control.Parallel.Strategies import Control.DeepSeq import Data.List (foldl')
main = do [d] <- map read `fmap` getArgs printf "%f\n" (mean [1..d])
foldl'rnf :: NFData a => (a -> b -> a) -> a -> [b] -> a foldl'rnf f z xs = lgo z xs where lgo z [] = z lgo z (x:xs) = lgo z' xs where z' = f z x `using` rdeepseq
-- first mean fn aka foldl'rnf mean :: [Double] -> Double mean xs = s / fromIntegral n where (n, s) = foldl'rnf k (0, 0) xs k (n, s) x = (n+1, s+x) :: (Int, Double)
-- second mean fn aka foldl' -- mean :: [Double] -> Double -- mean xs = s / fromIntegral n -- where -- (n, s) = foldl' k (0, 0) xs -- k (!n, !s) x = (n+1, s+x)
------------------------------------------
[NO PROFILING, NO SPACE LEAK]
C:\Documents and Settings\Travis\My Documents\Haskell Code>ghc --make temp5 -O2 -fasm [1 of 1] Compiling Main ( temp5.hs, temp5.o ) Linking temp5.exe ...
C:\Documents and Settings\Travis\My Documents\Haskell Code>temp5 1e7 +RTS -sstderr temp5 1e7 +RTS -sstderr 5000000.5 1,170,230,652 bytes allocated in the heap 128,876 bytes copied during GC 3,372 bytes maximum residency (1 sample(s)) 13,012 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
Generation 0: 2232 collections, 0 parallel, 0.05s, 0.05s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed
INIT time 0.02s ( 0.03s elapsed) MUT time 1.52s ( 1.55s elapsed) GC time 0.05s ( 0.05s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 1.58s ( 1.63s elapsed)
%GC time 3.0% (2.9% elapsed)
Alloc rate 764,232,262 bytes per MUT second
Productivity 96.0% of total user, 93.3% of total elapsed
C:\Documents and Settings\Travis\My Documents\Haskell Code>temp5 1e8 +RTS -sstderr temp5 1e8 +RTS -sstderr 50000000.5 11,702,079,228 bytes allocated in the heap 1,253,872 bytes copied during GC 3,372 bytes maximum residency (1 sample(s)) 13,012 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
Generation 0: 22321 collections, 0 parallel, 0.38s, 0.39s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.00s elapsed
INIT time 0.02s ( 0.00s elapsed) MUT time 15.47s ( 15.72s elapsed) GC time 0.38s ( 0.39s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 15.86s ( 16.11s elapsed)
%GC time 2.4% (2.4% elapsed)
Alloc rate 755,734,682 bytes per MUT second
Productivity 97.5% of total user, 96.0% of total elapsed
[NOW TURN ON PROFILING, GET SPACE LEAK]
C:\Documents and Settings\Travis\My Documents\Haskell Code>ghc --make temp5 -O2 -fasm -prof -auto-all [1 of 1] Compiling Main ( temp5.hs, temp5.o ) Linking temp5.exe ...
C:\Documents and Settings\Travis\My Documents\Haskell Code>temp5 1e6 +RTS -sstderr -p -K128M temp5 1e6 +RTS -sstderr -p -K128M 500000.5 395,774,976 bytes allocated in the heap 238,684,620 bytes copied during GC 102,906,760 bytes maximum residency (7 sample(s)) 66,283,900 bytes maximum slop 179 MB total memory in use (4 MB lost due to fragmentation)
Generation 0: 493 collections, 0 parallel, 4.83s, 4.84s elapsed Generation 1: 7 collections, 0 parallel, 0.23s, 0.30s elapsed
INIT time 0.02s ( 0.03s elapsed) MUT time 0.81s ( 0.91s elapsed) GC time 5.06s ( 5.14s elapsed) RP time 0.00s ( 0.00s elapsed) PROF time 0.00s ( 0.00s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 5.89s ( 6.08s elapsed)
%GC time 85.9% (84.6% elapsed)
Alloc rate 477,916,952 bytes per MUT second
Productivity 13.8% of total user, 13.4% of total elapsed
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

On Friday 13 August 2010 03:20:49, Travis Erdman wrote:
In Ch 25 of Real World Haskell, the authors introduce some naive code for finding the average of a big list; it has a space leak, and they present several solutions.
Below are two of the solutions that successfully eliminate the space leak (though, the first one -- the one that uses foldl'rnf -- is quite a bit faster). However, if compiled with profiling, the first one (using foldl'rnf) NOW has a leak. The second solution (foldl') does not have a leak even when profiling is enabled.
I have used this foldl'rnf function in my own code, as it is the only solution I have found for a space leak in my own code. But, since it leaks when profiled, it is making analysis difficult.
Is this a feature, bug, or user error? If a known issue, is there a workaround? The code and some documenting output follows.
I must admit I don't really understand what's going on. However, compiling for profiling makes some optimisations impossible, so different behaviour between profiling and non-profiling code isn't too surprising. Since the profiling version overflows the default stack, it seems to be a problem of missing strictness. I believe, what happens is that profiling prevents too much inlining, so that the strictness analyser gets confused.
thanks,
Travis ------------------------------------
{-# LANGUAGE BangPatterns #-}
import System.Environment import Text.Printf import Control.Parallel.Strategies import Control.DeepSeq import Data.List (foldl')
main = do [d] <- map read `fmap` getArgs printf "%f\n" (mean [1..d])
foldl'rnf :: NFData a => (a -> b -> a) -> a -> [b] -> a foldl'rnf f z xs = lgo z xs where lgo z [] = z lgo z (x:xs) = lgo z' xs where z' = f z x `using` rdeepseq
You get better Core and a faster mean with lgo z (x:xs) = let z' = f z x in deepseq z' (lgo z' xs) as the second equation. Alas, that doesn't fix the profiling space-leak. For fixing the space leak, it is important whether the fold is defined in a library module and compiled separately or, as is the case here, it's defined in the Main module and not exported. There are several variants that fix the leak in the latter setting but not in the former. Since the former is the interesting case (in the latter case you can write faster specialised code), the version that fixes the profiling space leak as a separately compiled library function (at least, there's no leak here): noleak :: NFData a => (a -> b -> a) -> a -> [b] -> a noleak f = nol where nol !z [] = z nol z (x:xs) = case rdeepseq (f z x) of Done z' -> nol z' xs The important points are - manually inlining `using` in some way - the bang on z in the first equation (would probably also work with a bang in the second equation instead of the first) The above gives however a spurious deprecation warning (the warning code confuses the data constructor Done of data Eval with the deprecated type alias type Done = (), so it warns). To avoid that, you can also write the second equation as nol z (x:xs) = nol (runEval (rdeepseq (f z x))) xs Non-profiling performance is, as far as I can tell, identical to that of your foldl'rnf.
participants (3)
-
Brent Yorgey
-
Daniel Fischer
-
Travis Erdman