
On Tue, 2007-04-10 at 12:14 +0200, apfelmus wrote:
Oren Ben-Kiki wrote:
The code is in http://hpaste.org/1314#a1 if anyone at all is willing to take pity on me and explain what I'm doing wrong.
There is an important point to note about streaming, namely that it conflicts with knowing whether the input is syntactically correct or not.
True, this is the core issue. Think of a Turing machine processing an infinite tape: it writes output while working towards a final result, and it is possible to examine the output tape before the final result is known. Haskell parsers insist on keeping the "output tape" in memory until the final result is known, and then hand it to you as one huge object when done (or discard it and just give you the error message). NOT a good idea if your input tape is a 200 MB file!
You've probably noticed this already and introduced rToken for that reason but I'm not sure.
The idea is that the parser generates a stream of tokens. If/when it hits an error, you get an error token at the end and parsing stops.
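As a minimal sketch of that idea (the Token and scan names below are invented for illustration, not the types from the paste), the error simply becomes the last element of an otherwise lazy token stream:

    -- Hypothetical sketch, not the code from the paste: an error is just
    -- the final element of a lazily produced token stream.
    data Token = Token Char          -- an ordinary token
               | ErrorToken String   -- emitted once, where parsing stops
        deriving Show

    -- Tokenize one character at a time; stop at the first '!'.
    scan :: String -> [Token]
    scan []       = []
    scan ('!':_)  = [ErrorToken "unexpected '!'"]
    scan (c:rest) = Token c : scan rest

    -- The consumer can start printing before knowing if the input is valid.
    main :: IO ()
    main = mapM_ print (scan ("abc" ++ "!" ++ "never reached"))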
Chasing down my memory leak I got into a weird situation where adding a magic manual SCC section and compiling with -prof makes the leak disappear.
That sounds fishy. Note that manual SCC annotations + optimization currently is a bit buggy in GHC, in the sense that optimization disturbs correct cost attribution.
Er... I just added '-prof'. No '-O' flag at all so no optimizations, right?
Does the program eat the output of yes 'a' without the memory explosion shown by "top"? If so, it could perhaps be a problem in the optimization phase, which maybe expands
D.append parsed (case result of ...)
to
case result of { Nothing -> D.append parsed D.empty
               ; Just .. -> D.append parsed (D.singleton ...) }
The latter won't stream because of the reason noted above.
Ah, but it *does* stream! This is the beauty of lazy evaluation. The "free" code is basically:

    reply  = parse ...         -- Lazily evaluated
    tokens = rTokens reply     -- Has some values "immediately"
    list   = D.toList tokens   -- Has some values "immediately"
    mapM_ print list           -- Starts printing "immediately"!

Where what "parse" does is:

    reply = Reply { rTokens = D.append concreteToken lazyTokens  -- "Immediate"
                  , rResult = lazyResult  -- Only available at end of parsing
                  }

So every time the parser calls "D.append" into the tokens, the printing "D.toList" is able to extract, print and GC the token immediately. And this works perfectly, with constant memory consumption.

The problem occurs when I peek at the final "rResult" field. The "leak" code says:

    reply  = parse ...             -- Lazily evaluated
    result = rResult reply         -- Lazy; has a value when parsing is done
    extra  = case result ...       -- Lazy; has a value when parsing is done
    parsed = rTokens reply         -- Has some values "immediately"
    tokens = D.append parsed extra -- Has some values "immediately"
    list   = D.toList tokens       -- Has some values "immediately"
    mapM_ print list               -- Starts printing "immediately"!

This *still* starts printing tokens immediately. However, while in the previous case the GC is smart enough to keep the program at a constant memory size, in the second case it for some reason starts missing more and more PAP objects, so memory usage goes through the roof.
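For comparison, here is a stripped-down, self-contained sketch of the two wirings (the toy parse below only mirrors the shape described here, not the actual code in the paste):

    import qualified Data.DList as D

    data Reply = Reply { rTokens :: D.DList Char
                       , rResult :: Maybe String }

    -- Toy stand-in for the real parser: one token per input character;
    -- the final result only exists once the whole input is consumed.
    parse :: String -> Reply
    parse []     = Reply { rTokens = D.empty, rResult = Just "done" }
    parse (c:cs) = let rest = parse cs
                   in Reply { rTokens = D.cons c (rTokens rest)
                            , rResult = rResult rest }

    -- "Free" wiring: never looks at rResult; tokens stream straight out.
    runFree :: String -> IO ()
    runFree input = mapM_ print (D.toList (rTokens (parse input)))

    -- "Leak" wiring: appends an extra token derived from rResult; extra
    -- stays an unevaluated thunk over result (and hence over reply) until
    -- the stream is exhausted.
    runLeak :: String -> IO ()
    runLeak input =
        let reply  = parse input
            result = rResult reply
            extra  = case result of
                       Nothing -> D.empty
                       Just _  -> D.singleton '$'
            tokens = D.append (rTokens reply) extra
        in mapM_ print (D.toList tokens)

    main :: IO ()
    main = runLeak (replicate 1000000 'a')   -- swap in runFree to compare

Whether this toy version reproduces the real program's blow-up will of course depend on the compiler version and flags, but the structural difference between runFree and runLeak is exactly the one described above.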
But it's highly unlikely that GHC performs a transformation that changes semantics this radically.
You'd think... but the fact of the matter is that while the first version works fine, the second doesn't, UNLESS I add the magic SCC section:

    extra = {-# SCC "magic" #-} case result ...

And compile with '-prof' (no '-O' flags). Then it somehow, finally, "gets the idea" and the program runs perfectly well with constant memory consumption. Which, as you aptly put it, is very "fishy" indeed...
I can achieve the results I want in very short elegant code...
In my opinion, your streaming parser is not as elegant as it could be which makes it hard to read the code.
Well... one point is that the above basically establishes two "threads", one printing tokens and one producing them, communicating through a sort of message queue (rTokens). I think that's pretty elegant compared to the hoops you need to jump through to do this in any other language :-) Another point is that this is a dumbed-down toy example. In the real program the parser does much more, which justifies the awkwardness you point out. Specifically, there are nifty ways to handle decision points (parsing isn't much good unless there are alternatives to be tested at each point :-).
With monad transformers, we almost have
Parser a ~= StateT State (WriterT (D.DList Token) Maybe) a
         ~= State -> (D.DList Token, Maybe (a,State))
Monad transformers are a bit beyond my grasp at this point, but from the little I know about them I don't see how they would help me with the GC/memory problem. They definitely might make the code even more elegant...
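For reference, a rough sketch of what that formulation could look like, assuming the mtl-style StateT/WriterT transformers (State, Token and the helpers below are placeholders, not the types from the paste):

    import Control.Monad.State
    import Control.Monad.Writer
    import qualified Data.DList as D

    type Token       = String
    type ParserState = String   -- e.g. the remaining input

    type Parser a = StateT ParserState (WriterT (D.DList Token) Maybe) a

    -- Append one token to the output stream.
    emit :: Token -> Parser ()
    emit t = tell (D.singleton t)

    -- A toy "syntax" action: consume one character, emit it as a token,
    -- fail at end of input (failure is the Maybe layer at the bottom).
    anyChar :: Parser ()
    anyChar = do
        s <- get
        case s of
          []     -> lift (lift Nothing)
          (c:cs) -> do put cs
                       emit [c]

    -- Unfolding the transformers:
    runParser :: Parser a -> ParserState -> Maybe ((a, ParserState), D.DList Token)
    runParser p s = runWriterT (runStateT p s)

Note that in this particular stacking the Maybe ends up on the outside of the unfolded type, so no tokens are available until success or failure has been decided; getting the token stream out lazily, as in the line quoted above, is exactly the streaming tension mentioned at the top of the thread.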
Also, I'm really missing type signatures, especially for many. When reading the code, I expected
many :: Parser a -> Parser [a]
but had to conclude
many :: Parser a -> Parser ()
by inspecting its code.
Sorry about that. Yes, all the "syntax" functions have this signature. I added them as www.hpaste.org/1314#a2.

Is it possible I have hit on a bug in GHC's implementation, and adding '-prof' somehow works around it?

Thanks,

Oren Ben-Kiki