Re: [Haskell-cafe] Conduit+GHC high memory use for simple Sink

28 Aug 2014

On Thu, Aug 28, 2014 at 11:49 AM, Michael Snoyman wrote:
...
On Thu, Aug 28, 2014 at 11:37 AM, Simon Peyton Jones <simonpj@microsoft.com> wrote:
...
GHC is keeping the entire representation of `lengthM` in memory
Do you mean that?  lengthM is a function; its representation is just code.
At the time I wrote it, I did. What I was seeing in the earlier profiling
was that a large number of conduit constructors were being kept in memory,
and I initially thought something similar was happening with lengthM. It
*does* in fact seem like the memory problems with this later example are
simply caused by the list being retained in memory. In fact, there's a far
simpler version of this that demonstrates the problem:
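
-- The same list expression is consumed twice, once per printLen call.
main :: IO ()
main = printLen >> printLen

printLen :: IO ()
printLen = lengthM 0 [1..40000000 :: Int] >>= print

-- Count the elements of a list in an arbitrary monad.
lengthM :: Monad m => Int -> [a] -> m Int
lengthM cnt [] = return cnt
lengthM cnt (_:xs) =
    -- Force the new count before recursing so no chain of (+1) thunks builds up.
    cnt' `seq` lengthM cnt' xs
  where
    cnt' = cnt + 1
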
I'll add that as a comment to #7206.
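
For comparison, here is a rough sketch of a variant in which the list should
not be shared between the two calls. It assumes the list above is retained
because `printLen` takes no arguments, so its list can end up as a single
constant that the second call still references. `printLenUpTo` is a
hypothetical name, not something from the original code, and I have not
measured this; it only illustrates the idea of making the list depend on a
runtime argument.

```haskell
module Main (main) where

-- Same counting loop as in the example above.
lengthM :: Monad m => Int -> [a] -> m Int
lengthM cnt [] = return cnt
lengthM cnt (_:xs) =
    cnt' `seq` lengthM cnt' xs
  where
    cnt' = cnt + 1

-- Hypothetical variant of printLen: the upper bound is an argument, so
-- the list is built from a runtime value on every call instead of being
-- one constant expression. NOINLINE is an assumption here; it is meant
-- to stop the optimizer from inlining both calls back into the same
-- constant list.
{-# NOINLINE printLenUpTo #-}
printLenUpTo :: Int -> IO ()
printLenUpTo n = lengthM 0 [1..n] >>= print

main :: IO ()
main = printLenUpTo 40000000 >> printLenUpTo 40000000
```

To compare the two versions, build each with `ghc -O2 -rtsopts` and run with
`+RTS -s`, then look at the maximum residency that is reported.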
This still doesn't answer what's going on in the original code. I'm
concerned that the issue may be the same, but I'm not seeing anything in
the core yet that's jumping out at me as being the problem. I'll try to
look at the code again with fresher eyes later today.
Alright, I've opened up a GHC issue about this:

https://ghc.haskell.org/trac/ghc/ticket/9520

I'm going to continue trying to knock this down to a simpler test case, but
it seems that it's sufficient to call `action` twice to make the memory
usage high.

Michael