profiling and strictness (was: Re: [Haskell-cafe] An interesting paper from Google)

Any time you see something "inexplicable" like lots of time being attributed to something simple like "get", it means that something isn't strict enough and "get" is having to force a bunch of lazy evaluations to do its job. Since you're using State.Strict but lift-ing to get there, I'd first look at the strictness of the monad you're lift-ing from. (I'm assuming State.Strict does what the label says, but it's possible that it's not strict in the way you need; strictness is kinda tricky.)
Ahh, thanks, this makes sense to me. The monads involved are an ErrorT and one other strict StateT.

When the monads say they are strict, they seem to mean that >>= will force the evaluation of the monadic mechanics of the first argument when the second is demanded. By "monadic mechanics" I mean the work done by >>= itself, so for StateT it binds the result of the first argument with either 'let' or '~(a, s') <-' instead of 'case' or '(a, s') <-' (see the sketch below). So in my understanding, a lazy State will build thunks on every >>=, and they won't ever be forced until you pull out the final value and pass it to a strict function in IO, like print or an FFI call; then they are forced recursively, at which point there is a stack overflow unless it's a short sequence.

Talking about strictness seems complicated to me because it's not really about what is actually forced, it's about who is forced by whom (i.e., when 'a' is forced then 'b' will be) and whether the forcing is sequential or nested. So a strict State is also not forced until the strict IO function, but it turns into a sequential series of thunks rather than a nested one. If this is accurate, why would anyone want to use the lazy State?

Anyway, State doesn't provide the really critical bit of strictness, which is in the actual state being modified. So 'State.modify' will inevitably lead to a stack overflow if you don't intersperse 'get's in there. What is needed is both $! on State.put (or define your own strict modify that puts $! on put), and strictness annotations on the fields of the record being modified (provided it's a record and not a scalar type); there's a sketch of both below.

In any case, ErrorT has a strict >>= by necessity, both States have a strict >>=, and all modifys and puts (I think) are strict, as are all the record fields. And a simple 'get' should only have to force the state to whnf, so it shouldn't matter whether the fields are huge thunks or not; all that should really matter is whether the state is already in whnf or is a huge nest of thunks. Which a strict >>= should prevent, right? This 'get' is right next to a 'lookup' function which is called twice as many times and should be doing more work, since it actually looks in a Map, but is credited with less cpu and alloc.

A 'lift . get' should turn into something like

    ErrorT $ do
        a <- StateT (\s -> return (s, s)) -- 'get' of the outer StateT
        return (a, s)                     -- >>= belongs to inner StateT
        return (Right a)

Anyway, it's sort of a mishmash of semi-inlined functions, since this is especially nested and hard to understand with transformers, but it looks like a lot of applications of (\a -> (a, s))-type functions, so (,) constructors and a 'Right' constructor, inside of 'case's to make sure these are evaluations and not allocations (and I remember from the STG paper that a constructor inside a case need not even be constructed). So I don't totally understand where the forcing is happening that credits the 'get' with so much work.

BTW, I have a theory about what the '0' might mean... perhaps if a function is inlined away, it still appears in the profile list, but with 0 entries.
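
For concreteness, here is roughly the difference between the two StateT binds I mean -- a paraphrase of the transformers definitions, not the verbatim library code:

    -- Lazy StateT: the lazy pattern defers matching the pair, so each
    -- (>>=) leaves a thunk behind.
    m >>= k = StateT $ \s -> do
        ~(a, s') <- runStateT m s
        runStateT (k a) s'

    -- Strict StateT: the plain pattern (a 'case' after desugaring) forces
    -- the pair to whnf at each (>>=).
    m >>= k = StateT $ \s -> do
        (a, s') <- runStateT m s
        runStateT (k a) s'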
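
And here is a sketch of the strict modify plus strict record fields I mean (the names 'Counts' and 'modifyStrict' are made up for illustration):

    import Control.Monad.Trans.State.Strict (StateT, get, put)

    -- Bang annotations keep the fields themselves from accumulating thunks.
    data Counts = Counts
        { hits   :: !Int
        , misses :: !Int
        }

    -- Plain 'modify f' is just 'get >>= put . f', which puts back an
    -- unevaluated 'f s'; the $! here forces the new state to whnf instead.
    modifyStrict :: Monad m => (s -> s) -> StateT s m ()
    modifyStrict f = do
        s <- get
        put $! f s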
Moral of the story: time is accounted to the function that forces evaluation of lazy thunks, not to the thunks themselves or the function that created the lazy thunks. (I think the latter is impossible without passing around a lot of expensive baggage, and in any case doesn't tell you anything useful; unexpected functions taking a lot of time, on the other hand, tells you right away that there's excessive laziness in the invocation somewhere and gives you a starting point to track it down.)
Indeed, this is a good point that I hadn't realized in quite that way before, but tracking it down is still the tricky part.

If this is accurate, why would anyone want to use the lazy State?
To answer my own question: you'd use lazy State if you want a monad stack to produce lazy output. E.g., if you want to lazily produce data but also have exceptions and state: ErrorT e (LazyWriterT w (LazyStateT s Identity)). AFAIK this is the only way to do it. The monads must all be lazy, and the order of the monads is essential. Then the output can be read from 'w', and only when it is exhausted may you look at whether there was an exception or at the final state, since both of those are strict in the monad. I'll be doing some experiments to see whether the laziness introduces excessive leakiness and, if so, how that can be fixed. Or perhaps more laziness will cure leakiness...
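
A sketch of what I mean, using ExceptT in place of the since-deprecated ErrorT and the lazy Writer/State from transformers ('Stream' and 'produce' are made-up names, and I haven't verified the space behavior):

    import Control.Monad.Trans.Class (lift)
    import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
    import Control.Monad.Trans.State.Lazy (StateT, runStateT, get, put)
    import Control.Monad.Trans.Writer.Lazy (WriterT, runWriterT, tell)
    import Data.Functor.Identity (Identity, runIdentity)

    type Stream e w s = ExceptT e (WriterT w (StateT s Identity))

    -- Emit the numbers from the current state up to a limit, then fail.
    produce :: Stream String [Int] Int ()
    produce = do
        n <- lift (lift get)
        if n > 1000000
            then throwE "ran off the end"
            else do
                lift (tell [n])
                lift (lift (put (n + 1)))
                produce

    main :: IO ()
    main = do
        let ((err, out), final) =
                runIdentity (runStateT (runWriterT (runExceptT produce)) 0)
        print (take 5 out) -- forces only the first five steps
        print err          -- forces the entire computation
        print final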