Can't find space leak in external sort with tournament trees (test case attached)

Hello haskell-cafe, I'm running into what I think is a space leak. I've modified external-sort-0.2 [0] to use tournament trees for merging, and although the block sorting works in constant space, merging seems to produce a space leak. The external sort works by lazily consuming an input list, chopping it up into fixed size blocks, sorting each block in memory and writing each sorted block out to disk. At this point we have a big, on-disk array of k sorted blocks. To produce the final, sorted output list, we lazily read back each block into a tournament tree (a heap of sorts), which should only compare the first element of block to figure out the tree root. The merging algorithm in full is in `tourMerge' [1]. I've confirmed by profiling (with -hc) that the merge step gobbles up memory. I've attached a self-contained program (two haskell source files). It requires binary, mersenne-random*, and ArrayRef from hackage. $ tar xvzf leak.tar.gz leak.hs Data/TournamentTree.hs $ ghc -O --make leak.hs $ ./leak 5000000 # generate 5 million entries in ./entries.tmp, sort them, and output to ./leak.out Writing 5000000 entries to './entries.tmp' Done. Henceforth, if you omit the number argument to leak it will sort './entries.tmp'. Sorting 5000000 entries from './entries.tmp' External block sort using 1000 blocks: -- 5000 entries per block -- 240000 bytes per block 1000--blocks sorted. Merging blocks and writing sorted entries to './leak.out' Depending on how much RAM you have, you may need to tweak the argument to leak and/or numBlocks (at the top of leak.hs) -- a block needs to be able to fit in RAM. If the argument to leak is too low, the sort will succeed and the leak won't be apparent. Otherwise, the OS will kill the process because it allocated too much. Anyone have an idea how I can fix the leak? Denis *Only required to generate the initial input file to sort; it's much faster than System.Random. [0] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/external-sort-0.2 [1] tourMerge is in leak.hs, but I've also inlined it here for convenience. TT is the Data.TournamentTree module, included in the tarball I've attached. tourMerge :: (Ord a) => [[a]] -> [a] tourMerge [] = [] tourMerge blks = let t = {-SCC "TMfromList" #-} TT.fromList blks in tM (TT.minElem t) (TT.deleteMinF t) where tM :: (Ord a) => [a] -> TT.Forest [a] -> [a] tM l f | null f = l | otherwise = let t = TT.tournament f next = TT.minElem t (x, xs) = {-# SCC "TTspan" #-} span (<= head next) l in x ++ (tM next (if null xs then TT.deleteMinF t else (TT.singleton xs : TT.deleteMinF t)))

Hello again,
Here are several heap profiles for sorting 1 million entries (each
entry is 48 bytes). Each was created by passing +RTS <flag> -L200 to
leak.hs, where <flag> is one of -hc, -hd, -hy, -hr; then compiled with
"ghc -O --make -prof -auto-all" under ghc 6.10.2, under OS X ppc.
* The cost centre profile (leak-hc.pdf) shows that
"get_a4hN/sort/performSort/main" is at fault for most of the space.
How do I map "get_a4hN" to a Data.Binary instance?
* I'm not sure how to interpret the retainer profiling in leak-hr.pdf
-- but it mentions my merging function (tourMerge) and tournamentRound
and deleteMinF from Data.TournamentTree. (leak-hr.prof was produced
during retainer profiling.)
Any suggestions on how to debug this further?
Denis
On Tue, Apr 21, 2009 at 21:12, Denis Bueno
Hello haskell-cafe,
I'm running into what I think is a space leak. I've modified external-sort-0.2 [0] to use tournament trees for merging, and although the block sorting works in constant space, merging seems to produce a space leak.
The external sort works by lazily consuming an input list, chopping it up into fixed size blocks, sorting each block in memory and writing each sorted block out to disk. At this point we have a big, on-disk array of k sorted blocks. To produce the final, sorted output list, we lazily read back each block into a tournament tree (a heap of sorts), which should only compare the first element of block to figure out the tree root. The merging algorithm in full is in `tourMerge' [1]. I've confirmed by profiling (with -hc) that the merge step gobbles up memory.
I've attached a self-contained program (two haskell source files). It requires binary, mersenne-random*, and ArrayRef from hackage.
$ tar xvzf leak.tar.gz leak.hs Data/TournamentTree.hs $ ghc -O --make leak.hs $ ./leak 5000000 # generate 5 million entries in ./entries.tmp, sort them, and output to ./leak.out Writing 5000000 entries to './entries.tmp' Done. Henceforth, if you omit the number argument to leak it will sort './entries.tmp'. Sorting 5000000 entries from './entries.tmp' External block sort using 1000 blocks: -- 5000 entries per block -- 240000 bytes per block 1000--blocks sorted. Merging blocks and writing sorted entries to './leak.out'
Depending on how much RAM you have, you may need to tweak the argument to leak and/or numBlocks (at the top of leak.hs) -- a block needs to be able to fit in RAM. If the argument to leak is too low, the sort will succeed and the leak won't be apparent. Otherwise, the OS will kill the process because it allocated too much.
Anyone have an idea how I can fix the leak? Denis
*Only required to generate the initial input file to sort; it's much faster than System.Random.
[0] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/external-sort-0.2
[1] tourMerge is in leak.hs, but I've also inlined it here for convenience. TT is the Data.TournamentTree module, included in the tarball I've attached.
tourMerge :: (Ord a) => [[a]] -> [a] tourMerge [] = [] tourMerge blks = let t = {-SCC "TMfromList" #-} TT.fromList blks in tM (TT.minElem t) (TT.deleteMinF t) where tM :: (Ord a) => [a] -> TT.Forest [a] -> [a] tM l f | null f = l | otherwise = let t = TT.tournament f next = TT.minElem t (x, xs) = {-# SCC "TTspan" #-} span (<= head next) l in x ++ (tM next (if null xs then TT.deleteMinF t else (TT.singleton xs : TT.deleteMinF t)))

Hi again,
The problem here turned out to be a too-lazy Data.Binary instance for
Entry. Using seq to force each number before wrapping it in an Entry
made those retainers go away.
I have yet another leaking problem, but I need to figure out which
question to ask of the mailing list first.
Denis
On Wed, Apr 22, 2009 at 07:38, Denis Bueno
Hello again,
Here are several heap profiles for sorting 1 million entries (each entry is 48 bytes). Each was created by passing +RTS <flag> -L200 to leak.hs, where <flag> is one of -hc, -hd, -hy, -hr; then compiled with "ghc -O --make -prof -auto-all" under ghc 6.10.2, under OS X ppc.
* The cost centre profile (leak-hc.pdf) shows that "get_a4hN/sort/performSort/main" is at fault for most of the space. How do I map "get_a4hN" to a Data.Binary instance? * I'm not sure how to interpret the retainer profiling in leak-hr.pdf -- but it mentions my merging function (tourMerge) and tournamentRound and deleteMinF from Data.TournamentTree. (leak-hr.prof was produced during retainer profiling.)
Any suggestions on how to debug this further? Denis
On Tue, Apr 21, 2009 at 21:12, Denis Bueno
wrote: Hello haskell-cafe,
I'm running into what I think is a space leak. I've modified external-sort-0.2 [0] to use tournament trees for merging, and although the block sorting works in constant space, merging seems to produce a space leak.
The external sort works by lazily consuming an input list, chopping it up into fixed size blocks, sorting each block in memory and writing each sorted block out to disk. At this point we have a big, on-disk array of k sorted blocks. To produce the final, sorted output list, we lazily read back each block into a tournament tree (a heap of sorts), which should only compare the first element of block to figure out the tree root. The merging algorithm in full is in `tourMerge' [1]. I've confirmed by profiling (with -hc) that the merge step gobbles up memory.
I've attached a self-contained program (two haskell source files). It requires binary, mersenne-random*, and ArrayRef from hackage.
$ tar xvzf leak.tar.gz leak.hs Data/TournamentTree.hs $ ghc -O --make leak.hs $ ./leak 5000000 # generate 5 million entries in ./entries.tmp, sort them, and output to ./leak.out Writing 5000000 entries to './entries.tmp' Done. Henceforth, if you omit the number argument to leak it will sort './entries.tmp'. Sorting 5000000 entries from './entries.tmp' External block sort using 1000 blocks: -- 5000 entries per block -- 240000 bytes per block 1000--blocks sorted. Merging blocks and writing sorted entries to './leak.out'
Depending on how much RAM you have, you may need to tweak the argument to leak and/or numBlocks (at the top of leak.hs) -- a block needs to be able to fit in RAM. If the argument to leak is too low, the sort will succeed and the leak won't be apparent. Otherwise, the OS will kill the process because it allocated too much.
Anyone have an idea how I can fix the leak? Denis
*Only required to generate the initial input file to sort; it's much faster than System.Random.
[0] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/external-sort-0.2
[1] tourMerge is in leak.hs, but I've also inlined it here for convenience. TT is the Data.TournamentTree module, included in the tarball I've attached.
tourMerge :: (Ord a) => [[a]] -> [a] tourMerge [] = [] tourMerge blks = let t = {-SCC "TMfromList" #-} TT.fromList blks in tM (TT.minElem t) (TT.deleteMinF t) where tM :: (Ord a) => [a] -> TT.Forest [a] -> [a] tM l f | null f = l | otherwise = let t = TT.tournament f next = TT.minElem t (x, xs) = {-# SCC "TTspan" #-} span (<= head next) l in x ++ (tM next (if null xs then TT.deleteMinF t else (TT.singleton xs : TT.deleteMinF t)))
participants (1)
-
Denis Bueno