
Am Montag, 2. März 2009 22:38 schrieb Phil:
Thanks again - one quick question about lazy pattern matching below!
On 01/03/2009 23:56, "Daniel Fischer"
wrote: No, it's not that strict. If it were, we wouldn't need the bang on newStockSum (but lots of applications needing some laziness would break).
The Monad instance in Control.Monad.State.Strict is
instance (Monad m) => Monad (StateT s m) where return a = StateT $ \s -> return (a, s) m >>= k = StateT $ \s -> do (a, s') <- runStateT m s runStateT (k a) s' fail str = StateT $ \_ -> fail str
(In the lazy instance, the second line of the >>= implementation is ~(a,s') <- runStateT m s)
The state will only be evaluated if "runStateT m" resp. "runStateT (k a)" require it. However, it is truly separated from the return value a, which is not the case in the lazy implementation. The state is an expression of past states in both implementations, the expression is just much more complicated for the lazy.
I think I get this - so what the lazy monad is doing is delaying the evaluation of the *pattern* (a,s') until it is absolutely required.
Yes, the lazy bind says give me anything, I'll look at it later (or not). Now it may be that 'k', the function 'm' is bound to, already inspects a, then the pair must be deconstructed, and unless it's a real pair with sufficiently defined first component, you get an error. Or it may be that 'runState(T) (k a)' inspects s', then it's analogous. But if neither really cares, it'll just remain whatever it is until it's needed or thrown away (if some point later in the chain - before anything demanded any evaluation - a 'put 1 >> return 2' appears, we don't need to look at it, so why bother?). A stupid example: ---------------------------------------------------------------------- module UhOh where import Control.Monad import Control.Monad.State.Lazy --import Control.Monad.State.Strict uhOh :: State s () uhOh = State $ \_ -> undefined uhOhT :: Monad m => StateT s m () uhOhT = StateT $ \_ -> return undefined uhOhT2 :: Monad m => StateT s m () uhOhT2 = StateT $ \_ -> undefined oy :: State s () oy = State $ \_ -> ((),undefined) oyT :: Monad m => StateT s m () oyT = StateT $ \_ -> return ((),undefined) hum :: State Int Int hum = do k <- get w <- uhOh put (k+2) return w return (k+1) humT :: Monad m => StateT Int m Int humT = do k <- get w <- uhOhT put (k+2) return w return (k+1) humT2 :: Monad m => StateT Int m Int humT2 = do k <- get w <- uhOhT2 put (k+2) return w return (k+1) whoa n = runState (replicateM_ n hum >> hum) 1 whoaT n = runStateT (replicateM_ n humT >> humT) 1 whoaT2 n = runStateT (replicateM_ n humT2 >> humT2) 1 yum :: State Int Int yum = do k <- get w <- oy put (k+2) return w return (k+1) yumT :: Monad m => StateT Int m Int yumT = do k <- get w <- oyT put (k+2) return w return (k+1) hoha n = runState (replicateM_ n yum >> yum) 1 hohaT n = runStateT (replicateM_ n yumT >> yumT) 1 oops m = runState m 1 ---------------------------------------------------------------------- What happens with whoa 10 hoha 10 oops (whoaT 10) oops (whoaT2 10) oops (hohaT 10) respectively when the Lazy or Strict library is imported? Answer first, then test whether you were right.
This means that each new (value,state) is just passed around as a thunk and not even evaluated to the point where a pair is constructed - it's just a blob, and could be anything as far as haskell is concerned.
Not quite anything, it must have the correct type, but whether it's _|_, (_|_,_|_), (a,_|_), (_|_,s) or (a,s) (where a and s denote non-_|_ elements of the respective types), the (>>=) doesn't care. Whether any evaluation occurs is up to (>>=)'s arguments.
It follows that each new state cannot evaluated even if we make newStockSum strict as (by adding a bang) because the state tuple newStockSum is wrapped in is completely unevaluated - so even if newStockSum is evaluated INSIDE this blob, haskell will still keep the whole chain.
Well, even with the bang, newStockSum will only be evaluated if somebody looks at what mc delivers. In the Strict case, (>>=) does, so newStockSum is evaluated at each step. In the Lazy case, (>>=) doesn't, replicateM_ doesn't, so newStockSum won't be evaluated inside the blob, if it were, it would force the evaluation of the previous pair and almost everything else, then there would have been no problem. What the bang does in the lazy case is to keep the thunk for the evaluation of the states a little smaller and simpler, so the evaluation is a bit faster and uses less memory, but not much (further strictness elsewhere helps, too, as you've investigated).
Only when we actually print the result is each state required and then each pair is constructed and incremented as described by my transformer. This means that every tuple is held as a blob in memory right until the end of the full simulation. Now with the strict version each time a new state tuple is created, to check that the result of running the state is at least of the form (thunk,thunk). It won't actually see much improvement just doing this because even though you're constructing pairs on-the-fly we are still treating each state in a lazy fashion. Thus right at the end we still have huge memory bloat, and although we will not do all our pair construction in one go we will still value each state after ALL states have been created - performance improvement is therefore marginal, and I'd expect memory usage to be more or less the same as (thunk,thunk) and thunk must take up the same memory.
Yes.
So, we stick a bang on the state. This forces each state to evaluated at simulation time. This allows the garbage collector to throw away previous states as the present state is no longer a composite of previous states AND each state has been constructed inside it's pair - giving it Normal form.
Weak head normal form, actually, but since the state is a Double, the two coincide.
Assuming that is corrected, I think I've cracked it.
One last question if we bang a variable i.e. !x = blah blah, can we assume that x will then ALWAYS be in Normal form or does it only evaluate to a given depth, giving us a stricter WHNF variable, but not necessarily absolutely valued?
Bang patterns and seq, as well as case x of { CTOR y -> ... } evaluate xto WHNF, if you want normal form, you must take sterner measures (e.g., (`using` rnf) from Control.Parallel.Strategies).