
On 18/06/2010 08:29, Evan Laforge wrote:
oh, that happens a lot. e.g. in f (g x), if f doesn't demand the value of its argument for a long time, then the heap closure for (g x) is lagging. If the value of (g x) is never demanded, but is nevertheless stored in some data structure for a while, then it is VOID.
Ok... so a (g x) closure is constructed, and it will keep 'x' alive when it shouldn't, but what's the actual data constructor here? If 'x' is [1, 2], will these two cons cells be counted as being in LAG stage? Or is it just the closure, which will show up as a sat_something?
I think it depends whether the [1,2] has been "used" or not. If it has been used, then the closures are in the USE state until the last use, otherwise they are in the LAG state.
The Monad instance for Writer looks like this:
instance (Monoid w) => Monad (Writer w) where return a = Writer (a, mempty) m>>= k = Writer $ case runWriter m of (a, w) -> case runWriter (k a) of (b, w') -> (b, w `mappend` w')
I expect sat_s0a1 is the closure for (w `mappend` w'). If that is causing your space leak, then maybe you need a Control.Monad.Writer.Stricter in which the written value is forced strictly by>>=.
So I copied Writer/Strict.hs and made a new one with a slightly stricter>>=, note the ! on w':
m>>= k = WriterT $ do (a, w)<- runWriterT m (b, !w')<- runWriterT (k a) let ws = w `mappend` w' return (b, ws)
Initially this had no effect. I put a ! on the first 'w', and now the StricterWriter.sat_* closure is replaced by the #2 consumer, which is PAP. What *is* PAP? Memory usage is exactly the same, I just have more PAP now. So I tried switching from DList and DList.append to [] and (++) and now... the lag is gone! Only the ! on w' is needed.
The quadratic behaviour of (++) is not a practical worry in my case, since I log rarely so most of the appends are with []s on one side or both, but this is a bit troubling. Writer + DList was specifically recommended as an efficient way to log things, but if a giant space leak is the result, then it's not such a great idea after all. I'm guessing it has something to do with how DList is implemented in terms of function compositions, but since I can't see inside PAP I have to guess.
Right, I wouldn't use DList for this. Here's an alternative I use: data AList a = ANil | ASing a | Append (AList a) (AList a) lenA :: AList a -> Int lenA ANil = 0 lenA (ASing _) = 1 lenA (Append l r) = lenA l + lenA r appendA ANil r = r appendA l ANil = l appendA l r = Append l r Note how appendA is strict(ish) and eliminates ANils, so in a writer monad it shouldn't build up a space leak. I'm sure you can write toList (don't use (++) though). I'd put the bang on ws: m>>= k = WriterT $ do (a, w) <- runWriterT m (b, w') <- runWriterT (k a) let !ws = w `mappend` w' return (b, ws) The problem with this monad is that >>= isn't tail-recursive, so it will cause stack overflows on recursive monadic functions. I suspect that a better alternative to the strict writer monad is the strict state monad in most cases, because its bind is tail-recursive.
To try to get a better picture of what was going on, I created a simpler profile with a simplified monad stack: just logging and state and doing some stylized actions... and now the results are the reverse! The stricter version on lists produces an enormous amount of lag, the stock Writer.Strict is much better, but still produces quite a bit of garbage (perhaps from the State, though I can't see what I'm doing wrong there). The dlist + normal strictness version is so fast the gc can't even measure it! But if I bump the 'depth' from 9 to 10 (which means factorially more calls), suddenly I'm getting stack overflows... but initially only in hb mode.
So the more I look at it, the less I understand. Clearly something is different from my production profile and the test profile, but it seems baffling that it makes such a difference as to reverse the results. I will try to reduce the production version element by element until the results switch around.
I put the simple version at http://hpaste.org/fastcgi/hpaste.fcgi/view?id=26329#a26329
This one displays much better performance with DList + Writer.Strict than List + StrictWriter so I guess it's not too surprising. However, *something* is still generating a lot of lag and -hb causes a stack trace, so I'm still doing it wrong somehow.
I also have a situation where -hb shows about 4mb of drag at the most, but when I run with '-hc -hbdrag', I only see 10k peaks. Shouldn't filtering the graph by drag add up to as much drag as when the graph isn't filtered?
That sounds suspicious. If you can make a self-contained example that demonstrates it and create a ticket, that would be a great help. Cheers, Simon
You see "*" when the type of a closure is polymorphic (as it would be in the case of w `mappend` w'). stg_ap_2_upd_info is a generic thunk implementation in the RTS - perhaps when profiling we should avoid using these to give you more information.
Well, it's eating a lot of space... so it must be something. I guess if it's impossible to tell what it is, then it's just being needlessly confusing there.
And thanks so much for the responses... space usage gets more confusing the more I look at it, but I hope it will come clear eventually given enough time.