
Michael, I don't see how your code sample for (3) is any different to the
compiler than Roman's original sink2.
I also don't see how the original sink2 creates a bad bind tree. I presume
that the reason "fold" works is due to the streaming optimization rule, and
not due to its implementation, which looks almost identical to (3).
I worry about using fold in this case, which is only strict up to WHNF, and
therefore wouldn't necessarily force the integers in the tuples; instead it
would create tons of integer thunks, wouldn't it? Roman's hand-coded sink2
avoids this issue so I presume that's not what is causing his memory woes.
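
To make the WHNF concern concrete, here is a minimal sketch using Data.List.foldl' on a plain list rather than conduit's fold. The Token type and the stepLazy/stepStrict names are assumptions made for illustration only. Forcing a pair to WHNF evaluates just the (,) constructor, so the lazy step may leave a growing chain of (+ 1) thunks in each component, while the bang patterns (as in Roman's sink2) force both counters on every step.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.List (foldl')

-- Hypothetical stand-in for the Token type in the thread.
data Token = Even | Odd

-- Lazy step: foldl' forces only the pair constructor (WHNF), so each
-- component may accumulate a chain of unevaluated (+ 1) thunks.
stepLazy :: (Integer, Integer) -> Token -> (Integer, Integer)
stepLazy (e, o) Even = (e + 1, o)
stepLazy (e, o) Odd  = (e, o + 1)

-- Strict step: the bang patterns force both counters each time the
-- accumulator is matched, so a pending thunk is at most one step deep.
stepStrict :: (Integer, Integer) -> Token -> (Integer, Integer)
stepStrict (!e, !o) Even = (e + 1, o)
stepStrict (!e, !o) Odd  = (e, o + 1)

countLazy, countStrict :: [Token] -> (Integer, Integer)
countLazy   = foldl' stepLazy   (0, 0)
countStrict = foldl' stepStrict (0, 0)

main :: IO ()
main = do
  let tokens = take 1000000 (cycle [Even, Odd])
  print (countStrict tokens)
  print (countLazy tokens)
```

Both folds return the same counts; any difference shows up only in heap usage, and whether the lazy version actually leaks can depend on the optimization level.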
-- Dan Burton
On Wed, Aug 27, 2014 at 2:55 PM, Roman Cheplyaka wrote:
* Michael Snoyman [2014-08-27 23:48:06+0300]
The problem is the following Sink, which counts how many even/odd Tokens are seen:
type SinkState = (Integer, Integer)

sink2 :: (Monad m) => SinkState -> Sink Token m SinkState
sink2 state@(!evenCount, !oddCount) = do
  maybeToken <- await
  case maybeToken of
    Nothing     -> return state
    (Just Even) -> sink2 (evenCount + 1, oddCount    )
    (Just Odd ) -> sink2 (evenCount    , oddCount + 1)
Wow, talk about timing! What you've run into here is expensive monadic bindings. As it turns out, this is exactly what my blog post from last week[1] covered. You have three options to fix this:
1. Just upgrade to conduit 1.2.0, which I released a few hours ago and which uses the codensity transform to avoid the problem. (I just tested your code; you get constant memory usage under conduit 1.2.0, seemingly without any code change necessary.)
Interesting. From looking at sink2, it seems that it produces a good, right-associated bind tree. Am I missing something?
And what occupies the memory in this case?
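
To illustrate what "right-associated" means here, the sketch below desugars sink2's do-block against a toy Sink type. The two-constructor Sink, its Monad instance, and runSink are assumptions made for this example; they are not conduit's actual types or API. Each bind has a bare await on its left and the whole continuation, including the recursive call, on its right, so no bind is ever nested to the left of another.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Hypothetical stand-in for the Token type in the thread.
data Token = Even | Odd

-- Toy sink (an assumption, far simpler than conduit's real type):
-- either finished with a result, or waiting for the next input.
data Sink i r
  = Done r
  | NeedInput (Maybe i -> Sink i r)

instance Functor (Sink i) where
  fmap f (Done r)      = Done (f r)
  fmap f (NeedInput k) = NeedInput (fmap f . k)

instance Applicative (Sink i) where
  pure = Done
  sf <*> sx = sf >>= \f -> fmap f sx

instance Monad (Sink i) where
  Done r      >>= f = f r
  NeedInput k >>= f = NeedInput (\mi -> k mi >>= f)

await :: Sink i (Maybe i)
await = NeedInput Done

-- sink2 with the do-block desugared by hand: the shape is always
-- await >>= (\x -> ...), i.e. binds nest only to the right.
sink2 :: (Integer, Integer) -> Sink Token (Integer, Integer)
sink2 state@(!evenCount, !oddCount) =
  await >>= \maybeToken ->
    case maybeToken of
      Nothing   -> return state
      Just Even -> sink2 (evenCount + 1, oddCount)
      Just Odd  -> sink2 (evenCount, oddCount + 1)

-- Feed a list to the sink; Nothing signals end of input.
runSink :: [i] -> Sink i r -> r
runSink _      (Done r)      = r
runSink []     (NeedInput k) = runSink [] (k Nothing)
runSink (x:xs) (NeedInput k) = runSink xs (k (Just x))

main :: IO ()
main = print (runSink [Even, Odd, Even] (sink2 (0, 0)))
```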
Roman
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe