Stacking State on State.....

Hi, I¹ve been teaching myself Monads recently with some success, but I¹ve hit a snag when I tried to look at transformers. I¹m not sure if the problem is my understanding of transformer use at a theoretical level or if I¹m just getting the syntax wrong. If I stick a (hopefully) fairly simple example below, I was wondering if people could comment: Bit of background I have no problem using the State Monad and find it very useful for holding the state or say a homemade random number generator. I written several little bits of code like this: VanDerCorput :: Int-> ( Double, Int ) VanDerCorput state = ( output, state+1 ) where output = reflect state getVanDC :: State Int Double getVanFC = State VanDerCorput This is just holding an incremental state and each time it is evaluated the next number in the Van Der Corput sequence in generated this is just a quasi random sequence the implementation details of the sequence are irrelevant safe to say that the sequence is generated from N=1,2,3,4.....Infinity. The above example is so trivial it can of course be implemented easily in purely functional code by threading the state as a parameter to a pure function and then mapping that function to an ³iterate (+1) 1² statement, but that is not the point of the exercise! The next step of the program is to wrap the above example up in another function which takes the Van Der Corput sequence as input and provides output depending on*it¹s own* state. I¹m passing it to a Box-Muller transform. Very quickly the Box-Muller transform requires TWO inputs from VanDerCourput to produce TWO outputs itself, however we only ever need one at a time, so the Box-Muller transform itself must hold state saying weather it has already saved the 2nd output from a previous call, and thus doesn¹t need to call Van Der Corput this time to produce output it just returns its own state. (P.S. If you know what I¹m doing from a maths point of view please ignore the fact using a 1D Van Der Corput with Box Muller is a very bad idea I know this; I¹m keeping the example simple and Haskell orientated!) In C-like imperative code you¹d could do something like the below not that this is massively elegant, but it shows the case-in-point: Boxmuller() { // The ³outer² states of Box Muller static bool myState = False; static double storedNormal = 0.; // Local copy of current state bool currentState = myState; // State always flips each run myState = not myState; // If we don¹t have a stored value from a previous run if currentState == False { // Generate two new Van Der Corputs this would increment a state in getNextVanDerCorput() twice producing different output each time double rand1 = getNextVanDerCorput(); double rand2 = getNextVanDerCorput(); // Store one result for the NEXT run of Boxmuller() storedNormal = SOME_TRANSFORM(rand2); // Return the other return SOME_TRANSFORM(rand1); } // We have a leftover value from a previous run get a local copy double currentNormal = storedNormal; // Reset the stored value to zero storedNormal = 0.; // Return the local copy of the stored value return currentNormal; } getNextVanDerCorput() { // Starting state state int n = 1; int currentState = n; // Incremented each time we call the function ++n; // Value computed on the internal state of this function return SOME_OTHER_TRANSFORM(currentState); } Right, hopefully that explains explicitly what I¹m trying to do apologies for dropping into C, it¹s easier to explain in code than in words. It struck me that this could be done using a plain and simple State Monad in Haskell carrying ALL states for both functions around in a tuple. This is pretty ugly tho, and I figure both BoxMuller and VanDerCorput should have their own internal states so they can be used as building blocks in other functionality too so one big ugly Monad is bad code design, even if it would work for this specific example. Let¹s not go there. The VanDerCorput building block is just the Monad at the start of this post. The problem is HOW DO I WRAP ANOTHER INDEPENDENT STATE AROUND THIS? After some googling it looked like the answer may be Monad Transformers. Specifically we could add a StateT transform for our Box Muller state to our VanDerCorput State Monad. Google didn¹t yield a direct answer here so I¹m not even sure if my thinking is correct, people describe the process of using a transform as wrapping one monad in another¹ or threading one monad into another¹. What we want to do is have some internal state controlled by an independent outer state - this sounds about right to me? So I started playing around with the code, and got the below to compile. test :: StateT (Bool,Double) (State Int) Double test = do (isStored,normal) <- get let (retNorm,storeNorm) = if isStored then (normal,0) else (n1,n2) where n1 = 2 n2 = 3 put (not isStored, storeNorm) return retNorm Now this is incomplete and may be even wrong! I¹ll Explain my thinking: (Bool,Double) is equivalent to myState and storedNormal in the C example The last Double is the return value of the BoxMuller Monad The (State Int) is supposed to represent the VanDerCorput monad but the compiler (GHC 6.10) will only let me specify one parameter with it so I¹ve put the state and left the return type to the gods!!.... As I said this isn¹t quite right any ideas how to specify the type? The next few lines get and test the BoxMuller state, this seems to work OK to me, the problem is when I try to look at the STATE OF THE INTERNAL monad. n1 and n2 should evaluate and increment the state of VanDerCorput monad but I can¹t get anything to compile here. 2 and 3 are just dummy values to make the thing compile so I could debug. My last gripe is how to actually call this from a pure function do I need to use both evalStateT and evalState I can¹t see how to initialize both the inner and outer state ? OK I think that¹s more than enough typing, apologies for the war&peace sized post. Any help muchly muchly appreciated, Many Thanks, Phil.

Am Samstag, 28. Februar 2009 13:23 schrieb Phil:
Hi,
The problem is HOW DO I WRAP ANOTHER INDEPENDENT STATE AROUND THIS?
After some googling it looked like the answer may be Monad Transformers. Specifically we could add a StateT transform for our Box Muller state to our VanDerCorput State Monad. Google didn¹t yield a direct answer here so I¹m not even sure if my thinking is correct, people describe the process of using a transform as wrapping one monad in another¹ or threading one monad into another¹. What we want to do is have some internal state controlled by an independent outer state - this sounds about right to me?
If you absolutely don't want to have a state describing both, yes.
So I started playing around with the code, and got the below to compile.
test :: StateT (Bool,Double) (State Int) Double test = do (isStored,normal) <- get let (retNorm,storeNorm) = if isStored then (normal,0) else (n1,n2) where n1 = 2 n2 = 3 put (not isStored, storeNorm) return retNorm
Now this is incomplete and may be even wrong! I¹ll Explain my thinking:
(Bool,Double) is equivalent to myState and storedNormal in the C example The last Double is the return value of the BoxMuller Monad The (State Int) is supposed to represent the VanDerCorput monad but the compiler (GHC 6.10) will only let me specify one parameter with it so I¹ve put the state and left the return type to the gods!!.... As I said this isn¹t quite right any ideas how to specify the type?
You can't, the second argument to StateT must be a Monad, hence a type constructor you can pass an arbitrary type which then produces a new type from that. Fortunately, you don't need to. Say you have type VDCMonad = State Int nextVDC :: VDCMonad Double nextVDC = do n <- get put $! (n+1) return $ calculateVDCFromInt n Then you could have boxMullerVDC :: StateT (Maybe Double) VDCMonad Double boxMullerVDC = StateT $ \s -> case s of Just d -> return (d,Nothing) Nothing -> do d1 <- nextVDC d2 <- nextVDC let (b1,b2) = boxMullerTransform d1 d2 return (b1,Just b2) (I find a state of Maybe a more natural to indicate that *maybe* I have one a in store to use directly, than using (Bool,a)). However, I suspect that you would get better code if you abstracted over the sequence of pseudorandom Doubles and had simply calculation :: Sate [Double] whatever calculation = ??? result = evalState calculation bmVDC bmVDC = boxMuller $ map vanDerCorput [1 .. ] where boxMuller (k:n:more) = u:v:boxMuller more where (u,v) = bmTransform k n
The next few lines get and test the BoxMuller state, this seems to work OK to me, the problem is when I try to look at the STATE OF THE INTERNAL monad. n1 and n2 should evaluate and increment the state of VanDerCorput monad but I can¹t get anything to compile here. 2 and 3 are just dummy values to make the thing compile so I could debug.
My last gripe is how to actually call this from a pure function do I need to use both evalStateT and evalState I can¹t see how to initialize both the inner and outer state ?
result = evalState (evalStateT calculation Nothing) 1
OK I think that¹s more than enough typing, apologies for the war&peace sized post.
Any help muchly muchly appreciated,
Many Thanks,
Phil.
HTH, Daniel

Ok, so this question of stacking state on top of state has come up several
times lately. So I decided to whip up a small example. So here's a goofy
little example of an abstract representation of a computer that can compute
a value of type 'a'. The two states here are a value of type 'a', and a
stack of functions of type (a->a) which can be applied to that value.
Disclaimer: this code is only type-checked, not tested!
import Control.Monad.State
-- first, we'll rename the type, for convenience
type Programmable a = StateT [a->a] (State a)
-- add a function to the stack of functions that can be applied
-- notice that we just use the normal State functions when dealing
-- with the first type of state
add :: (a -> a) -> Programmable a ()
add f = modify (f:)
-- add a bunch of functions to the stack
-- this time, notice that Programmable a is just a monad
addAll :: [a -> a] -> Programmable a ()
addAll = mapM_ add
-- this applies a function directly to the stored state, bypassing the
function stack
-- notice that, to use State functions on the second type of state, we must
use
-- lift to get to that layer
modify' :: (a -> a) -> Programmable a ()
modify' f = lift (modify f)
-- pop one function off the stack and apply it
-- notice again the difference between modify' and modify. we use modify' to
modify the value
-- and modify to modify the function stack. This is again because of the
order in which we wrapped
-- the two states. If we were dealing with StateT a (State [a->a]), it would
be the opposite.
step :: Programmable a ()
step = do
fs <- get
let f = if (null fs) then id else (head fs)
modify' f
modify $ if (null fs) then id else (const (tail fs))
-- run the whole 'program'
runAll :: Programmable a ()
runAll = do
fs <- get
if (null fs) then (return ()) else (step >> runAll)
On Sat, Feb 28, 2009 at 8:31 AM, Daniel Fischer
Am Samstag, 28. Februar 2009 13:23 schrieb Phil:
Hi,
The problem is HOW DO I WRAP ANOTHER INDEPENDENT STATE AROUND THIS?
After some googling it looked like the answer may be Monad Transformers. Specifically we could add a StateT transform for our Box Muller state to our VanDerCorput State Monad. Google didn¹t yield a direct answer here so I¹m not even sure if my thinking is correct, people describe the process of using a transform as Œwrapping one monad in another¹ or Œthreading one monad into another¹. What we want to do is have some internal state controlled by an independent outer state - this sounds about right to me?
If you absolutely don't want to have a state describing both, yes.
So I started playing around with the code, and got the below to compile.
test :: StateT (Bool,Double) (State Int) Double test = do (isStored,normal) <- get let (retNorm,storeNorm) = if isStored then (normal,0) else (n1,n2) where n1 = 2 n2 = 3 put (not isStored, storeNorm) return retNorm
Now this is incomplete and may be even wrong! I¹ll Explain my thinking:
(Bool,Double) is equivalent to myState and storedNormal in the C example The last Double is the return value of the BoxMuller Monad The (State Int) is supposed to represent the VanDerCorput monad but the compiler (GHC 6.10) will only let me specify one parameter with it so I¹ve put the state and left the return type to the gods!!.... As I said this isn¹t quite right any ideas how to specify the type?
You can't, the second argument to StateT must be a Monad, hence a type constructor you can pass an arbitrary type which then produces a new type from that. Fortunately, you don't need to.
Say you have
type VDCMonad = State Int
nextVDC :: VDCMonad Double nextVDC = do n <- get put $! (n+1) return $ calculateVDCFromInt n
Then you could have
boxMullerVDC :: StateT (Maybe Double) VDCMonad Double boxMullerVDC = StateT $ \s -> case s of Just d -> return (d,Nothing) Nothing -> do d1 <- nextVDC d2 <- nextVDC let (b1,b2) = boxMullerTransform d1 d2 return (b1,Just b2)
(I find a state of Maybe a more natural to indicate that *maybe* I have one a in store to use directly, than using (Bool,a)).
However, I suspect that you would get better code if you abstracted over the sequence of pseudorandom Doubles and had simply
calculation :: Sate [Double] whatever calculation = ???
result = evalState calculation bmVDC
bmVDC = boxMuller $ map vanDerCorput [1 .. ] where boxMuller (k:n:more) = u:v:boxMuller more where (u,v) = bmTransform k n
The next few lines get and test the BoxMuller state, this seems to work
OK
to me, the problem is when I try to look at the STATE OF THE INTERNAL monad. n1 and n2 should evaluate and increment the state of VanDerCorput monad but I can¹t get anything to compile here. 2 and 3 are just dummy values to make the thing compile so I could debug.
My last gripe is how to actually call this from a pure function do I need to use both evalStateT and evalState I can¹t see how to initialize both the inner and outer state ?
result = evalState (evalStateT calculation Nothing) 1
OK I think that¹s more than enough typing, apologies for the war&peace sized post.
Any help muchly muchly appreciated,
Many Thanks,
Phil.
HTH, Daniel
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am Samstag, 28. Februar 2009 15:36 schrieb Andrew Wagner:
Ok, so this question of stacking state on top of state has come up several times lately. So I decided to whip up a small example. So here's a goofy little example of an abstract representation of a computer that can compute a value of type 'a'. The two states here are a value of type 'a', and a stack of functions of type (a->a) which can be applied to that value.
Nice.
Disclaimer: this code is only type-checked, not tested!
import Control.Monad.State
import Control.Moand (unless)
-- first, we'll rename the type, for convenience type Programmable a = StateT [a->a] (State a)
-- add a function to the stack of functions that can be applied -- notice that we just use the normal State functions when dealing -- with the first type of state add :: (a -> a) -> Programmable a () add f = modify (f:)
-- add a bunch of functions to the stack -- this time, notice that Programmable a is just a monad addAll :: [a -> a] -> Programmable a () addAll = mapM_ add
Be aware that this adds the functions in reverse order, an alternative is addAll = modify . (++) (addAll fs = modify (fs ++))
-- this applies a function directly to the stored state, bypassing the function stack -- notice that, to use State functions on the second type of state, we must use -- lift to get to that layer modify' :: (a -> a) -> Programmable a () modify' f = lift (modify f)
-- pop one function off the stack and apply it -- notice again the difference between modify' and modify. we use modify' to modify the value -- and modify to modify the function stack. This is again because of the order in which we wrapped -- the two states. If we were dealing with StateT a (State [a->a]), it would be the opposite. step :: Programmable a () step = do fs <- get let f = if (null fs) then id else (head fs) modify' f modify $ if (null fs) then id else (const (tail fs))
Last line could be modify (drop 1)
-- run the whole 'program' runAll :: Programmable a () runAll = do fs <- get if (null fs) then (return ()) else (step >> runAll)
runAll = do stop <- gets null unless stop (step >> runAll)

Thanks for helping clean up my dirty little hacking. This could
actually be made nicer by defining the following, and rewriting the
original code in terms of it:
type State2 a b = StateT a (State b)
type Programmable a = State2 a (a->a)
I'll leave the rewrite as an exercise for the reader, since I'm
standing in the store writing this on my iPhone :)
On Feb 28, 2009, at 10:08 AM, Daniel Fischer
Am Samstag, 28. Februar 2009 15:36 schrieb Andrew Wagner:
Ok, so this question of stacking state on top of state has come up several times lately. So I decided to whip up a small example. So here's a goofy little example of an abstract representation of a computer that can compute a value of type 'a'. The two states here are a value of type 'a', and a stack of functions of type (a->a) which can be applied to that value.
Nice.
Disclaimer: this code is only type-checked, not tested!
import Control.Monad.State
import Control.Moand (unless)
-- first, we'll rename the type, for convenience type Programmable a = StateT [a->a] (State a)
-- add a function to the stack of functions that can be applied -- notice that we just use the normal State functions when dealing -- with the first type of state add :: (a -> a) -> Programmable a () add f = modify (f:)
-- add a bunch of functions to the stack -- this time, notice that Programmable a is just a monad addAll :: [a -> a] -> Programmable a () addAll = mapM_ add
Be aware that this adds the functions in reverse order, an alternative is
addAll = modify . (++)
(addAll fs = modify (fs ++))
-- this applies a function directly to the stored state, bypassing the function stack -- notice that, to use State functions on the second type of state, we must use -- lift to get to that layer modify' :: (a -> a) -> Programmable a () modify' f = lift (modify f)
-- pop one function off the stack and apply it -- notice again the difference between modify' and modify. we use modify' to modify the value -- and modify to modify the function stack. This is again because of the order in which we wrapped -- the two states. If we were dealing with StateT a (State [a->a]), it would be the opposite. step :: Programmable a () step = do fs <- get let f = if (null fs) then id else (head fs) modify' f modify $ if (null fs) then id else (const (tail fs))
Last line could be
modify (drop 1)
-- run the whole 'program' runAll :: Programmable a () runAll = do fs <- get if (null fs) then (return ()) else (step >> runAll)
runAll = do stop <- gets null unless stop (step >> runAll)

Heh. Actually, there is no more rewrite. But I muffed the second definition:
it should, of course be:
type Programmable a = State2 [a->a] a
On Sat, Feb 28, 2009 at 10:35 AM, Andrew Wagner
Thanks for helping clean up my dirty little hacking. This could actually be made nicer by defining the following, and rewriting the original code in terms of it:
type State2 a b = StateT a (State b) type Programmable a = State2 a (a->a)
I'll leave the rewrite as an exercise for the reader, since I'm standing in the store writing this on my iPhone :)
On Feb 28, 2009, at 10:08 AM, Daniel Fischer
wrote: Am Samstag, 28. Februar 2009 15:36 schrieb Andrew Wagner:
Ok, so this question of stacking state on top of state has come up several times lately. So I decided to whip up a small example. So here's a goofy little example of an abstract representation of a computer that can compute a value of type 'a'. The two states here are a value of type 'a', and a stack of functions of type (a->a) which can be applied to that value.
Nice.
Disclaimer: this code is only type-checked, not tested!
import Control.Monad.State
import Control.Moand (unless)
-- first, we'll rename the type, for convenience type Programmable a = StateT [a->a] (State a)
-- add a function to the stack of functions that can be applied -- notice that we just use the normal State functions when dealing -- with the first type of state add :: (a -> a) -> Programmable a () add f = modify (f:)
-- add a bunch of functions to the stack -- this time, notice that Programmable a is just a monad addAll :: [a -> a] -> Programmable a () addAll = mapM_ add
Be aware that this adds the functions in reverse order, an alternative is
addAll = modify . (++)
(addAll fs = modify (fs ++))
-- this applies a function directly to the stored state, bypassing the function stack -- notice that, to use State functions on the second type of state, we must use -- lift to get to that layer modify' :: (a -> a) -> Programmable a () modify' f = lift (modify f)
-- pop one function off the stack and apply it -- notice again the difference between modify' and modify. we use modify' to modify the value -- and modify to modify the function stack. This is again because of the order in which we wrapped -- the two states. If we were dealing with StateT a (State [a->a]), it would be the opposite. step :: Programmable a () step = do fs <- get let f = if (null fs) then id else (head fs) modify' f modify $ if (null fs) then id else (const (tail fs))
Last line could be
modify (drop 1)
-- run the whole 'program' runAll :: Programmable a () runAll = do fs <- get if (null fs) then (return ()) else (step >> runAll)
runAll = do stop <- gets null unless stop (step >> runAll)

Hi,
Thanks for the replies - I haven't had a chance to try out everything
suggested yet - but your explanations of transformers nailed it for me.
However, in terms of performance when stacking, I've come across something
I'm struggling to explain - I was wondering if anyone could offer up and
explanation.
I've rewritten my code twice - one with 3 stacked monads, and one with 2
stacked monads and a load of maps. Heuristically I would have thought the 3
stacked monads would have performed as well, or even better than the 2
stacked solution, but the 2 stacked solution is MUCH faster and MUCH less
memory is used. They are both using 90% of the same code and both chain
together the same number of computations using replicateM. From profiling I
can see that the pure function 'reflect' takes up most of the umph in both
cases - which I'd expect. But in the triple stacked version the garbage
collector is using up >90% of the time.
I've tried using BangPatterns to reduce memory usage in the Triple Stack
version - doing this I can half the time it takes, but it is still running
at over twice the time of the two stack version. The BangPatterns were also
put in Common Code in the reflect function - so I'd expect both solutions to
need them?
Even though both pieces of code are a bit untidy, the triple stacked monad
'feels' nicer to me - everything is encapsulated away and one evaluation in
main yields the result. From purely a design perspective I prefer it - but
obviously not if it runs like a dog!
Any ideas why the triple stack runs so slow?
Thanks again!
Phil
***************** Triple Stack Specific Impl:
type MonteCarloStateT = StateT Double
mc :: MonteCarloStateT BoxMullerQuasiState ()
mc = StateT $ \s -> do nextNormal <- generateNormal
let stochastic = 0.2*1*nextNormal
let drift = 0.05 - (0.5*(0.2*0.2))*1
let newStockSum = payOff 100 ( 100 * exp ( drift +
stochastic ) ) + s
return ((),newStockSum)
iterations = 1000000
main :: IO()
main = do let sumOfPayOffs = evalState ( evalStateT ( execStateT (do
replicateM_ iterations mc) $ 0 ) $ (Nothing,nextHalton) ) $ (1,[3,5])
let averagePO = sumOfPayOffs / fromIntegral iterations
let discountPO = averagePO * exp (-0.05)
print discountPO
***************** Double Stack and Map Specific Impl:
iterations = 1000000
main :: IO()
main = do let normals = evalState ( evalStateT (do replicateM iterations
generateNormal) $ (Nothing,nextHalton) ) $ (1,[3,5])
let stochastic = map (0.2*1*) normals
let sde = map ((( 0.05 - (0.5*(0.2*0.2)) )*1)+) stochastic
let expiryMult = map exp sde
let expiry = map (100*) expiryMult
let payoff = map (payOff 100) expiry
let averagePO = (foldr (+) 0 payoff) / fromIntegral iterations
let discountPO = averagePO * exp (-0.05)
print discountPO
***************** Common Code Used By Both Methods:
import Control.Monad.State
import Debug.Trace
-- State Monad for QRNGs - stores current iteration and list of
-- bases to compute
type QuasiRandomState = State (Int,[Int])
nextHalton :: QuasiRandomState [Double]
nextHalton = do (n,bases) <- get
let !nextN = n+1
put (nextN,bases)
return $ map (reflect (n,1,0)) bases
type ReflectionThreadState = (Int,Double,Double)
reflect :: ReflectionThreadState -> Int -> Double
reflect (k,f,h) base
| k <= 0 = h
| otherwise = reflect (newK,newF,newH) base
where
newK = k `div` base
newF = f / fromIntegral base
newH = h + fromIntegral(k `mod` base) * newF
-- So we are defining a state transform which has state of 'maybe double'
and an
-- operating function for the inner monad of type QuasiRandomMonad returning
a [Double]
-- We then say that it wraps an QuasiRandomMonad (State Monad) - it must of
course
-- if we pass it a function that operates on these Monads we must wrap the
same
-- type of Monad. And finally it returns a double
type BoxMullerStateT = StateT (Maybe Double, QuasiRandomState [Double])
type BoxMullerQuasiState = BoxMullerStateT QuasiRandomState
generateNormal :: BoxMullerQuasiState Double
generateNormal = StateT $ \s -> case s of
(Just d,qrnFunc) -> return (d,(Nothing,qrnFunc))
(Nothing,qrnFunc) -> do qrnBaseList <- qrnFunc
let (norm1,norm2) = boxMuller (head
qrnBaseList) (head $ tail qrnBaseList)
return (norm1,(Just norm2,qrnFunc))
boxMuller :: Double -> Double -> (Double,Double)
-- boxMuller rn1 rn2 | trace ( "rn1 " ++ show rn1 ++ " rn2 " ++ show rn2 )
False=undefined
boxMuller rn1 rn2 = (normal1,normal2)
where
r = sqrt ( (-2)*log rn1)
twoPiRn2 = 2 * pi * rn2
normal1 = r * cos ( twoPiRn2 )
normal2 = r * sin ( twoPiRn2 )
payOff :: Double -> Double -> Double
payOff strike stock | (stock - strike) > 0 = stock - strike
| otherwise = 0
On 28/02/2009 13:31, "Daniel Fischer"
Am Samstag, 28. Februar 2009 13:23 schrieb Phil:
Hi,
The problem is HOW DO I WRAP ANOTHER INDEPENDENT STATE AROUND THIS?
After some googling it looked like the answer may be Monad Transformers. Specifically we could add a StateT transform for our Box Muller state to our VanDerCorput State Monad. Google didn¹t yield a direct answer here so I¹m not even sure if my thinking is correct, people describe the process of using a transform as wrapping one monad in another¹ or threading one monad into another¹. What we want to do is have some internal state controlled by an independent outer state - this sounds about right to me?
If you absolutely don't want to have a state describing both, yes.
<SNIP>

I know very little about profiling, but your comment about spending a lot of
time garbage collecting rang a bell with me - the example on RWH talks about
that exact issue. Thus, I would recommend walking through the profiling
techniques described on
http://book.realworldhaskell.org/read/profiling-and-optimization.html .
On Sun, Mar 1, 2009 at 3:03 PM, Phil
Hi,
Thanks for the replies - I haven't had a chance to try out everything suggested yet - but your explanations of transformers nailed it for me.
However, in terms of performance when stacking, I've come across something I'm struggling to explain - I was wondering if anyone could offer up and explanation. I've rewritten my code twice - one with 3 stacked monads, and one with 2 stacked monads and a load of maps. Heuristically I would have thought the 3 stacked monads would have performed as well, or even better than the 2 stacked solution, but the 2 stacked solution is MUCH faster and MUCH less memory is used. They are both using 90% of the same code and both chain together the same number of computations using replicateM. From profiling I can see that the pure function 'reflect' takes up most of the umph in both cases - which I'd expect. But in the triple stacked version the garbage collector is using up >90% of the time.
I've tried using BangPatterns to reduce memory usage in the Triple Stack version - doing this I can half the time it takes, but it is still running at over twice the time of the two stack version. The BangPatterns were also put in Common Code in the reflect function - so I'd expect both solutions to need them?
Even though both pieces of code are a bit untidy, the triple stacked monad 'feels' nicer to me - everything is encapsulated away and one evaluation in main yields the result. From purely a design perspective I prefer it - but obviously not if it runs like a dog!
Any ideas why the triple stack runs so slow?
Thanks again!
Phil
***************** Triple Stack Specific Impl:
type MonteCarloStateT = StateT Double
mc :: MonteCarloStateT BoxMullerQuasiState () mc = StateT $ \s -> do nextNormal <- generateNormal let stochastic = 0.2*1*nextNormal let drift = 0.05 - (0.5*(0.2*0.2))*1 let newStockSum = payOff 100 ( 100 * exp ( drift + stochastic ) ) + s return ((),newStockSum)
iterations = 1000000 main :: IO() main = do let sumOfPayOffs = evalState ( evalStateT ( execStateT (do replicateM_ iterations mc) $ 0 ) $ (Nothing,nextHalton) ) $ (1,[3,5]) let averagePO = sumOfPayOffs / fromIntegral iterations let discountPO = averagePO * exp (-0.05) print discountPO
***************** Double Stack and Map Specific Impl:
iterations = 1000000 main :: IO() main = do let normals = evalState ( evalStateT (do replicateM iterations generateNormal) $ (Nothing,nextHalton) ) $ (1,[3,5]) let stochastic = map (0.2*1*) normals let sde = map ((( 0.05 - (0.5*(0.2*0.2)) )*1)+) stochastic let expiryMult = map exp sde let expiry = map (100*) expiryMult let payoff = map (payOff 100) expiry let averagePO = (foldr (+) 0 payoff) / fromIntegral iterations let discountPO = averagePO * exp (-0.05) print discountPO
***************** Common Code Used By Both Methods:
import Control.Monad.State import Debug.Trace
-- State Monad for QRNGs - stores current iteration and list of -- bases to compute type QuasiRandomState = State (Int,[Int])
nextHalton :: QuasiRandomState [Double] nextHalton = do (n,bases) <- get let !nextN = n+1 put (nextN,bases) return $ map (reflect (n,1,0)) bases
type ReflectionThreadState = (Int,Double,Double)
reflect :: ReflectionThreadState -> Int -> Double reflect (k,f,h) base | k <= 0 = h | otherwise = reflect (newK,newF,newH) base where newK = k `div` base newF = f / fromIntegral base newH = h + fromIntegral(k `mod` base) * newF
-- So we are defining a state transform which has state of 'maybe double' and an -- operating function for the inner monad of type QuasiRandomMonad returning a [Double] -- We then say that it wraps an QuasiRandomMonad (State Monad) - it must of course -- if we pass it a function that operates on these Monads we must wrap the same -- type of Monad. And finally it returns a double
type BoxMullerStateT = StateT (Maybe Double, QuasiRandomState [Double]) type BoxMullerQuasiState = BoxMullerStateT QuasiRandomState
generateNormal :: BoxMullerQuasiState Double generateNormal = StateT $ \s -> case s of (Just d,qrnFunc) -> return (d,(Nothing,qrnFunc)) (Nothing,qrnFunc) -> do qrnBaseList <- qrnFunc let (norm1,norm2) = boxMuller (head qrnBaseList) (head $ tail qrnBaseList) return (norm1,(Just norm2,qrnFunc))
boxMuller :: Double -> Double -> (Double,Double) -- boxMuller rn1 rn2 | trace ( "rn1 " ++ show rn1 ++ " rn2 " ++ show rn2 ) False=undefined boxMuller rn1 rn2 = (normal1,normal2) where r = sqrt ( (-2)*log rn1) twoPiRn2 = 2 * pi * rn2 normal1 = r * cos ( twoPiRn2 ) normal2 = r * sin ( twoPiRn2 )
payOff :: Double -> Double -> Double payOff strike stock | (stock - strike) > 0 = stock - strike | otherwise = 0
On 28/02/2009 13:31, "Daniel Fischer"
wrote: Am Samstag, 28. Februar 2009 13:23 schrieb Phil:
Hi,
The problem is HOW DO I WRAP ANOTHER INDEPENDENT STATE AROUND THIS?
After some googling it looked like the answer may be Monad Transformers. Specifically we could add a StateT transform for our Box Muller state to our VanDerCorput State Monad. Google didn¹t yield a direct answer here so I¹m not even sure if my thinking is correct, people describe the process of using a transform as Œwrapping one monad in another¹ or Œthreading one monad into another¹. What we want to do is have some internal state controlled by an independent outer state - this sounds about right to me?
If you absolutely don't want to have a state describing both, yes.
<SNIP>

On 01/03/2009 20:16, "Andrew Wagner"

Am Sonntag, 1. März 2009 21:03 schrieb Phil:
Hi,
Thanks for the replies - I haven't had a chance to try out everything suggested yet - but your explanations of transformers nailed it for me.
However, in terms of performance when stacking, I've come across something I'm struggling to explain - I was wondering if anyone could offer up and explanation. I've rewritten my code twice - one with 3 stacked monads, and one with 2 stacked monads and a load of maps. Heuristically I would have thought the 3 stacked monads would have performed as well, or even better than the 2 stacked solution, but the 2 stacked solution is MUCH faster and MUCH less memory is used. They are both using 90% of the same code and both chain together the same number of computations using replicateM.
Not quite, the triple stack uses replicateM_, whihc should be a little cheaper.
From profiling I can see that the pure function 'reflect' takes up most of the umph in both cases - which I'd expect. But in the triple stacked version the garbage collector is using up >90% of the time.
I've tried using BangPatterns to reduce memory usage in the Triple Stack version - doing this I can half the time it takes, but it is still running at over twice the time of the two stack version. The BangPatterns were also put in Common Code in the reflect function - so I'd expect both solutions to need them?
One thing that helps much is to use import Control.Monad.State.Strict Using the default lazy State monad, you build enormous thunks in the states, which harms the triple stack even more than the double stack. With the strict State monad (and a strict left fold instead of foldr in the double stack), I get ./v6tripleStrict +RTS -sstderr -K16M 10.450674088955589 444,069,720 bytes allocated in the heap 234,808,472 bytes copied during GC (scavenged) 30,504,688 bytes copied during GC (not scavenged) 41,074,688 bytes maximum residency (9 sample(s)) 786 collections in generation 0 ( 21.03s) 9 collections in generation 1 ( 2.54s) 106 Mb total memory in use INIT time 0.00s ( 0.00s elapsed) MUT time 4.05s ( 4.21s elapsed) GC time 23.57s ( 24.18s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 27.62s ( 28.40s elapsed) %GC time 85.3% (85.2% elapsed) Alloc rate 109,646,844 bytes per MUT second Productivity 14.7% of total user, 14.3% of total elapsed ./v6doubleStrict +RTS -sstderr 10.450674088955592 388,795,972 bytes allocated in the heap 177,748,228 bytes copied during GC (scavenged) 23,953,900 bytes copied during GC (not scavenged) 44,560,384 bytes maximum residency (9 sample(s)) 710 collections in generation 0 ( 11.62s) 9 collections in generation 1 ( 3.03s) 94 Mb total memory in use INIT time 0.01s ( 0.00s elapsed) MUT time 13.54s ( 13.91s elapsed) GC time 14.65s ( 15.02s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 28.20s ( 28.93s elapsed) %GC time 52.0% (51.9% elapsed) Alloc rate 28,693,429 bytes per MUT second Productivity 48.0% of total user, 46.8% of total elapsed So, yes, the triple stack uses more memory, but not terribly much more. However, it spends much more time gc'ing, but as its MUT time is much less, the total times are not much different. Now,if we give them enough heap space to begin with: ./v6tripleStrict +RTS -sstderr -K16M -H192M 10.450674088955589 444,077,972 bytes allocated in the heap 95,828,976 bytes copied during GC (scavenged) 15,441,936 bytes copied during GC (not scavenged) 36,147,200 bytes maximum residency (2 sample(s)) 5 collections in generation 0 ( 2.16s) 2 collections in generation 1 ( 0.43s) 185 Mb total memory in use INIT time 0.00s ( 0.00s elapsed) MUT time 4.20s ( 4.55s elapsed) GC time 2.59s ( 2.74s elapsed) EXIT time 0.00s ( 0.95s elapsed) Total time 6.79s ( 7.29s elapsed) %GC time 38.1% (37.6% elapsed) Alloc rate 105,732,850 bytes per MUT second Productivity 61.9% of total user, 57.6% of total elapsed ./v6doubleStrict +RTS -sstderr -K16M -H192M 10.450674088955592 388,806,408 bytes allocated in the heap 46,446,680 bytes copied during GC (scavenged) 77,852 bytes copied during GC (not scavenged) 159,744 bytes maximum residency (2 sample(s)) 4 collections in generation 0 ( 1.36s) 2 collections in generation 1 ( 0.03s) 182 Mb total memory in use INIT time 0.00s ( 0.00s elapsed) MUT time 4.53s ( 5.11s elapsed) GC time 1.39s ( 1.44s elapsed) EXIT time 0.00s ( 0.02s elapsed) Total time 5.92s ( 6.55s elapsed) %GC time 23.5% (21.9% elapsed) Alloc rate 85,829,229 bytes per MUT second Productivity 76.5% of total user, 69.2% of total elapsed MUCH better. I have no idea why the MUT time for the double stack decreases so much, though.
Even though both pieces of code are a bit untidy, the triple stacked monad 'feels' nicer to me - everything is encapsulated away and one evaluation in main yields the result. From purely a design perspective I prefer it - but obviously not if it runs like a dog!
Any ideas why the triple stack runs so slow?
It suffers horribly from laziness. One thing is the lazy State monad, another is the implementation of mc.
Thanks again!
Phil
***************** Triple Stack Specific Impl:
type MonteCarloStateT = StateT Double
mc :: MonteCarloStateT BoxMullerQuasiState () mc = StateT $ \s -> do nextNormal <- generateNormal let stochastic = 0.2*1*nextNormal let drift = 0.05 - (0.5*(0.2*0.2))*1 let newStockSum = payOff 100 ( 100 * exp ( drift + stochastic ) ) + s return ((),newStockSum)
Don't use a new let on each line, have it all in one let-block. And, please, force the evaluation of newStockSum: mc :: MonteCarloStateT BoxMullerQuasiState () mc = StateT $ \s -> do nextNormal <- generateNormal let stochastic = 0.2*1*nextNormal drift = 0.05 - (0.5*(0.2*0.2))*1 !newStockSum = payOff 100 ( 100 * exp ( drift + stochastic ) ) + s return ((),newStockSum) Now: ./v8tripleStrict +RTS -sstderr 10.450674088955589 396,391,172 bytes allocated in the heap 65,252 bytes copied during GC (scavenged) 22,272 bytes copied during GC (not scavenged) 40,960 bytes maximum residency (1 sample(s)) 757 collections in generation 0 ( 0.02s) 1 collections in generation 1 ( 0.00s) 1 Mb total memory in use INIT time 0.00s ( 0.00s elapsed) MUT time 3.36s ( 3.49s elapsed) GC time 0.02s ( 0.05s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 3.38s ( 3.54s elapsed) %GC time 0.6% (1.3% elapsed) Alloc rate 117,973,563 bytes per MUT second Productivity 99.4% of total user, 94.9% of total elapsed w00t!
iterations = 1000000 main :: IO() main = do let sumOfPayOffs = evalState ( evalStateT ( execStateT (do replicateM_ iterations mc) $ 0 ) $ (Nothing,nextHalton) ) $ (1,[3,5]) let averagePO = sumOfPayOffs / fromIntegral iterations let discountPO = averagePO * exp (-0.05) print discountPO
Again, don't needlessly multiply the lets.
***************** Double Stack and Map Specific Impl:
iterations = 1000000 main :: IO() main = do let normals = evalState ( evalStateT (do replicateM iterations generateNormal) $ (Nothing,nextHalton) ) $ (1,[3,5]) let stochastic = map (0.2*1*) normals let sde = map ((( 0.05 - (0.5*(0.2*0.2)) )*1)+) stochastic let expiryMult = map exp sde let expiry = map (100*) expiryMult let payoff = map (payOff 100) expiry let averagePO = (foldr (+) 0 payoff) / fromIntegral iterations let discountPO = averagePO * exp (-0.05) print discountPO
Same here, but important for performance is to replace the foldr with foldl'.
***************** Common Code Used By Both Methods:
import Control.Monad.State import Debug.Trace
-- State Monad for QRNGs - stores current iteration and list of -- bases to compute type QuasiRandomState = State (Int,[Int])
nextHalton :: QuasiRandomState [Double] nextHalton = do (n,bases) <- get let !nextN = n+1 put (nextN,bases) return $ map (reflect (n,1,0)) bases
type ReflectionThreadState = (Int,Double,Double)
reflect :: ReflectionThreadState -> Int -> Double reflect (k,f,h) base
| k <= 0 = h | otherwise = reflect (newK,newF,newH) base
where newK = k `div` base newF = f / fromIntegral base newH = h + fromIntegral(k `mod` base) * newF
-- So we are defining a state transform which has state of 'maybe double' and an -- operating function for the inner monad of type QuasiRandomMonad returning a [Double] -- We then say that it wraps an QuasiRandomMonad (State Monad) - it must of course -- if we pass it a function that operates on these Monads we must wrap the same -- type of Monad. And finally it returns a double
type BoxMullerStateT = StateT (Maybe Double, QuasiRandomState [Double]) type BoxMullerQuasiState = BoxMullerStateT QuasiRandomState
generateNormal :: BoxMullerQuasiState Double generateNormal = StateT $ \s -> case s of (Just d,qrnFunc) -> return (d,(Nothing,qrnFunc)) (Nothing,qrnFunc) -> do qrnBaseList <- qrnFunc let (norm1,norm2) = boxMuller (head qrnBaseList) (head $ tail qrnBaseList) return (norm1,(Just norm2,qrnFunc))
boxMuller :: Double -> Double -> (Double,Double) -- boxMuller rn1 rn2 | trace ( "rn1 " ++ show rn1 ++ " rn2 " ++ show rn2 ) False=undefined boxMuller rn1 rn2 = (normal1,normal2) where r = sqrt ( (-2)*log rn1) twoPiRn2 = 2 * pi * rn2 normal1 = r * cos ( twoPiRn2 ) normal2 = r * sin ( twoPiRn2 )
payOff :: Double -> Double -> Double payOff strike stock | (stock - strike) > 0 = stock - strike
| otherwise = 0
Cheers, Daniel

Thanks very much for your patient explanations - this has really helped
again!
A few final questions in-line.....
On 01/03/2009 21:46, "Daniel Fischer"
One thing that helps much is to use
import Control.Monad.State.Strict
Using the default lazy State monad, you build enormous thunks in the states, which harms the triple stack even more than the double stack. With the strict State monad (and a strict left fold instead of foldr in the double stack), I get
Ahhh, I see. Just to make sure I understand this the Strict version will evaluate each state as an atomic number. The standard lazy version will create each state as an expression of past states... Consequentially these will grow and grow as state is incremented?
type MonteCarloStateT = StateT Double
mc :: MonteCarloStateT BoxMullerQuasiState () mc = StateT $ \s -> do nextNormal <- generateNormal let stochastic = 0.2*1*nextNormal let drift = 0.05 - (0.5*(0.2*0.2))*1 let newStockSum = payOff 100 ( 100 * exp ( drift + stochastic ) ) + s return ((),newStockSum)
Don't use a new let on each line, have it all in one let-block. And, please, force the evaluation of newStockSum:
I had looked at making this strict (along with the values in the reflect function too), it was making a little bit of difference, but not much. I reckon this is because the improvement was being masked by the lazy state monad. Now that this is corrected, I can see it makes a big difference. One question here tho - if we have made our State strict, will this not result in newStockSum being atomically evaluated when we set the new state? Also on the use of multiple 'let' statements - this has obviously completely passed me by so far! I'm assuming that under one let we only actually create the newStockSum, but with 3 let statements, each is created as a separate entity?
w00t!
You're not joking - this is a textbook example of performance enhancement! It's clearly something I have to keep more in mind.
***************** Double Stack and Map Specific Impl:
iterations = 1000000 main :: IO() main = do let normals = evalState ( evalStateT (do replicateM iterations generateNormal) $ (Nothing,nextHalton) ) $ (1,[3,5]) let stochastic = map (0.2*1*) normals let sde = map ((( 0.05 - (0.5*(0.2*0.2)) )*1)+) stochastic let expiryMult = map exp sde let expiry = map (100*) expiryMult let payoff = map (payOff 100) expiry let averagePO = (foldr (+) 0 payoff) / fromIntegral iterations let discountPO = averagePO * exp (-0.05) print discountPO
Same here, but important for performance is to replace the foldr with foldl'.
Again I understand that foldl' is the strict version of foldl, and as we are summing elements we can use either foldl or foldr. I'm assuming this is another thunk optimisation. Does foldl not actually calculate the sum, but moreover it creates an expression of the form a+b+c+d+e+.... Where foldl' will actually evaluate the expression to an atomic number?
Cheers, Daniel

Am Sonntag, 1. März 2009 23:18 schrieb Phil:
Thanks very much for your patient explanations - this has really helped again!
A few final questions in-line.....
On 01/03/2009 21:46, "Daniel Fischer"
wrote: One thing that helps much is to use
import Control.Monad.State.Strict
Using the default lazy State monad, you build enormous thunks in the states, which harms the triple stack even more than the double stack. With the strict State monad (and a strict left fold instead of foldr in the double stack), I get
Ahhh, I see. Just to make sure I understand this the Strict version will evaluate each state as an atomic number. The standard lazy version will create each state as an expression of past states... Consequentially these will grow and grow as state is incremented?
No, it's not that strict. If it were, we wouldn't need the bang on newStockSum (but lots of applications needing some laziness would break). The Monad instance in Control.Monad.State.Strict is instance (Monad m) => Monad (StateT s m) where return a = StateT $ \s -> return (a, s) m >>= k = StateT $ \s -> do (a, s') <- runStateT m s runStateT (k a) s' fail str = StateT $ \_ -> fail str (In the lazy instance, the second line of the >>= implementation is ~(a,s') <- runStateT m s) The state will only be evaluated if "runStateT m" resp. "runStateT (k a)" require it. However, it is truly separated from the return value a, which is not the case in the lazy implementation. The state is an expression of past states in both implementations, the expression is just much more complicated for the lazy.
type MonteCarloStateT = StateT Double
mc :: MonteCarloStateT BoxMullerQuasiState () mc = StateT $ \s -> do nextNormal <- generateNormal let stochastic = 0.2*1*nextNormal let drift = 0.05 - (0.5*(0.2*0.2))*1 let newStockSum = payOff 100 ( 100 * exp ( drift + stochastic ) ) + s return ((),newStockSum)
Don't use a new let on each line, have it all in one let-block. And, please, force the evaluation of newStockSum:
I had looked at making this strict (along with the values in the reflect function too), it was making a little bit of difference, but not much. I reckon this is because the improvement was being masked by the lazy state monad. Now that this is corrected, I can see it makes a big difference.
Yes, the bang doesn't do anything until the state is inspected. In the lazy state monad, the lazy (~) patterns delay that until the very end, when it has to be evaluated anyway.
One question here tho - if we have made our State strict, will this not result in newStockSum being atomically evaluated when we set the new state?
No, see above, it's not that strict. But as state and return value are now properly separated, we can effectively say "evaluate now".
Also on the use of multiple 'let' statements - this has obviously completely passed me by so far! I'm assuming that under one let we only actually create the newStockSum, but with 3 let statements, each is created as a separate entity?
I think both forms are equivalent, I just find it easier to parse with one 'let'.
w00t!
You're not joking - this is a textbook example of performance enhancement! It's clearly something I have to keep more in mind.
***************** Double Stack and Map Specific Impl:
iterations = 1000000 main :: IO() main = do let normals = evalState ( evalStateT (do replicateM iterations generateNormal) $ (Nothing,nextHalton) ) $ (1,[3,5]) let stochastic = map (0.2*1*) normals let sde = map ((( 0.05 - (0.5*(0.2*0.2)) )*1)+) stochastic let expiryMult = map exp sde let expiry = map (100*) expiryMult let payoff = map (payOff 100) expiry let averagePO = (foldr (+) 0 payoff) / fromIntegral iterations let discountPO = averagePO * exp (-0.05) print discountPO
Same here, but important for performance is to replace the foldr with foldl'.
Again I understand that foldl' is the strict version of foldl, and as we are summing elements we can use either foldl or foldr.
Since addition of floating point numbers is neither associative nor commutative, they can lead to different results, so it might also matter for the result and not only the performance, which you use. But using foldr with a strict combination function on a long list always gives poor performance, you build a thunk of the form a1 + (a2 + (a3 + ........ (an + b) ........)) and before any evaluation can be done, the whole list has to be traversed, requiring O(n) space (beware of stack overflows). If you use foldl, you build a thunk of the form ((...(b + a1) + ...) + an), again requiring O(n) space, unless the compiler sees the value is needed and transforms it into foldl' itself.
I'm assuming this is another thunk optimisation. Does foldl not actually calculate the sum, but moreover it creates an expression of the form a+b+c+d+e+.... Where foldl' will actually evaluate the expression to an atomic number?
What foldl does depends on what the compiler sees. It may build a thunk or it may evaluate it at each step (when summing Ints and compiling with optimisations, chances are good). If you use foldl', at each step the accumulator is evaluated to weak head normal form, for types like Int or Double, that is complete evaluation, but for lists, evaluation goes only so far to determine whether it's [] or (_:_). To get complete evaluation at each step, import Control.Parallel.Strategies result = foldl' f' z xs where f' y x = (f y x) `using` rnf comes in handy.
Cheers, Daniel

Thanks again - one quick question about lazy pattern matching below!
On 01/03/2009 23:56, "Daniel Fischer"
No, it's not that strict. If it were, we wouldn't need the bang on newStockSum (but lots of applications needing some laziness would break).
The Monad instance in Control.Monad.State.Strict is
instance (Monad m) => Monad (StateT s m) where return a = StateT $ \s -> return (a, s) m >>= k = StateT $ \s -> do (a, s') <- runStateT m s runStateT (k a) s' fail str = StateT $ \_ -> fail str
(In the lazy instance, the second line of the >>= implementation is ~(a,s') <- runStateT m s)
The state will only be evaluated if "runStateT m" resp. "runStateT (k a)" require it. However, it is truly separated from the return value a, which is not the case in the lazy implementation. The state is an expression of past states in both implementations, the expression is just much more complicated for the lazy.
I think I get this - so what the lazy monad is doing is delaying the evaluation of the *pattern* (a,s') until it is absolutely required. This means that each new (value,state) is just passed around as a thunk and not even evaluated to the point where a pair is constructed - it's just a blob, and could be anything as far as haskell is concerned. It follows that each new state cannot evaluated even if we make newStockSum strict as (by adding a bang) because the state tuple newStockSum is wrapped in is completely unevaluated - so even if newStockSum is evaluated INSIDE this blob, haskell will still keep the whole chain. Only when we actually print the result is each state required and then each pair is constructed and incremented as described by my transformer. This means that every tuple is held as a blob in memory right until the end of the full simulation. Now with the strict version each time a new state tuple is created, to check that the result of running the state is at least of the form (thunk,thunk). It won't actually see much improvement just doing this because even though you're constructing pairs on-the-fly we are still treating each state in a lazy fashion. Thus right at the end we still have huge memory bloat, and although we will not do all our pair construction in one go we will still value each state after ALL states have been created - performance improvement is therefore marginal, and I'd expect memory usage to be more or less the same as (thunk,thunk) and thunk must take up the same memory. So, we stick a bang on the state. This forces each state to evaluated at simulation time. This allows the garbage collector to throw away previous states as the present state is no longer a composite of previous states AND each state has been constructed inside it's pair - giving it Normal form. Assuming that is corrected, I think I've cracked it. One last question if we bang a variable i.e. !x = blah blah, can we assume that x will then ALWAYS be in Normal form or does it only evaluate to a given depth, giving us a stricter WHNF variable, but not necessarily absolutely valued?

Am Montag, 2. März 2009 22:38 schrieb Phil:
Thanks again - one quick question about lazy pattern matching below!
On 01/03/2009 23:56, "Daniel Fischer"
wrote: No, it's not that strict. If it were, we wouldn't need the bang on newStockSum (but lots of applications needing some laziness would break).
The Monad instance in Control.Monad.State.Strict is
instance (Monad m) => Monad (StateT s m) where return a = StateT $ \s -> return (a, s) m >>= k = StateT $ \s -> do (a, s') <- runStateT m s runStateT (k a) s' fail str = StateT $ \_ -> fail str
(In the lazy instance, the second line of the >>= implementation is ~(a,s') <- runStateT m s)
The state will only be evaluated if "runStateT m" resp. "runStateT (k a)" require it. However, it is truly separated from the return value a, which is not the case in the lazy implementation. The state is an expression of past states in both implementations, the expression is just much more complicated for the lazy.
I think I get this - so what the lazy monad is doing is delaying the evaluation of the *pattern* (a,s') until it is absolutely required.
Yes, the lazy bind says give me anything, I'll look at it later (or not). Now it may be that 'k', the function 'm' is bound to, already inspects a, then the pair must be deconstructed, and unless it's a real pair with sufficiently defined first component, you get an error. Or it may be that 'runState(T) (k a)' inspects s', then it's analogous. But if neither really cares, it'll just remain whatever it is until it's needed or thrown away (if some point later in the chain - before anything demanded any evaluation - a 'put 1 >> return 2' appears, we don't need to look at it, so why bother?). A stupid example: ---------------------------------------------------------------------- module UhOh where import Control.Monad import Control.Monad.State.Lazy --import Control.Monad.State.Strict uhOh :: State s () uhOh = State $ \_ -> undefined uhOhT :: Monad m => StateT s m () uhOhT = StateT $ \_ -> return undefined uhOhT2 :: Monad m => StateT s m () uhOhT2 = StateT $ \_ -> undefined oy :: State s () oy = State $ \_ -> ((),undefined) oyT :: Monad m => StateT s m () oyT = StateT $ \_ -> return ((),undefined) hum :: State Int Int hum = do k <- get w <- uhOh put (k+2) return w return (k+1) humT :: Monad m => StateT Int m Int humT = do k <- get w <- uhOhT put (k+2) return w return (k+1) humT2 :: Monad m => StateT Int m Int humT2 = do k <- get w <- uhOhT2 put (k+2) return w return (k+1) whoa n = runState (replicateM_ n hum >> hum) 1 whoaT n = runStateT (replicateM_ n humT >> humT) 1 whoaT2 n = runStateT (replicateM_ n humT2 >> humT2) 1 yum :: State Int Int yum = do k <- get w <- oy put (k+2) return w return (k+1) yumT :: Monad m => StateT Int m Int yumT = do k <- get w <- oyT put (k+2) return w return (k+1) hoha n = runState (replicateM_ n yum >> yum) 1 hohaT n = runStateT (replicateM_ n yumT >> yumT) 1 oops m = runState m 1 ---------------------------------------------------------------------- What happens with whoa 10 hoha 10 oops (whoaT 10) oops (whoaT2 10) oops (hohaT 10) respectively when the Lazy or Strict library is imported? Answer first, then test whether you were right.
This means that each new (value,state) is just passed around as a thunk and not even evaluated to the point where a pair is constructed - it's just a blob, and could be anything as far as haskell is concerned.
Not quite anything, it must have the correct type, but whether it's _|_, (_|_,_|_), (a,_|_), (_|_,s) or (a,s) (where a and s denote non-_|_ elements of the respective types), the (>>=) doesn't care. Whether any evaluation occurs is up to (>>=)'s arguments.
It follows that each new state cannot evaluated even if we make newStockSum strict as (by adding a bang) because the state tuple newStockSum is wrapped in is completely unevaluated - so even if newStockSum is evaluated INSIDE this blob, haskell will still keep the whole chain.
Well, even with the bang, newStockSum will only be evaluated if somebody looks at what mc delivers. In the Strict case, (>>=) does, so newStockSum is evaluated at each step. In the Lazy case, (>>=) doesn't, replicateM_ doesn't, so newStockSum won't be evaluated inside the blob, if it were, it would force the evaluation of the previous pair and almost everything else, then there would have been no problem. What the bang does in the lazy case is to keep the thunk for the evaluation of the states a little smaller and simpler, so the evaluation is a bit faster and uses less memory, but not much (further strictness elsewhere helps, too, as you've investigated).
Only when we actually print the result is each state required and then each pair is constructed and incremented as described by my transformer. This means that every tuple is held as a blob in memory right until the end of the full simulation. Now with the strict version each time a new state tuple is created, to check that the result of running the state is at least of the form (thunk,thunk). It won't actually see much improvement just doing this because even though you're constructing pairs on-the-fly we are still treating each state in a lazy fashion. Thus right at the end we still have huge memory bloat, and although we will not do all our pair construction in one go we will still value each state after ALL states have been created - performance improvement is therefore marginal, and I'd expect memory usage to be more or less the same as (thunk,thunk) and thunk must take up the same memory.
Yes.
So, we stick a bang on the state. This forces each state to evaluated at simulation time. This allows the garbage collector to throw away previous states as the present state is no longer a composite of previous states AND each state has been constructed inside it's pair - giving it Normal form.
Weak head normal form, actually, but since the state is a Double, the two coincide.
Assuming that is corrected, I think I've cracked it.
One last question if we bang a variable i.e. !x = blah blah, can we assume that x will then ALWAYS be in Normal form or does it only evaluate to a given depth, giving us a stricter WHNF variable, but not necessarily absolutely valued?
Bang patterns and seq, as well as case x of { CTOR y -> ... } evaluate xto WHNF, if you want normal form, you must take sterner measures (e.g., (`using` rnf) from Control.Parallel.Strategies).

I've had a look at your example - it's raised yet more questions in my mind!
On 02/03/2009 23:36, "Daniel Fischer"
A stupid example: ---------------------------------------------------------------------- module UhOh where
import Control.Monad import Control.Monad.State.Lazy --import Control.Monad.State.Strict
uhOh :: State s () uhOh = State $ \_ -> undefined
uhOhT :: Monad m => StateT s m () uhOhT = StateT $ \_ -> return undefined
uhOhT2 :: Monad m => StateT s m () uhOhT2 = StateT $ \_ -> undefined
oy :: State s () oy = State $ \_ -> ((),undefined)
oyT :: Monad m => StateT s m () oyT = StateT $ \_ -> return ((),undefined)
hum :: State Int Int hum = do k <- get w <- uhOh put (k+2) return w return (k+1)
humT :: Monad m => StateT Int m Int humT = do k <- get w <- uhOhT put (k+2) return w return (k+1)
humT2 :: Monad m => StateT Int m Int humT2 = do k <- get w <- uhOhT2 put (k+2) return w return (k+1)
whoa n = runState (replicateM_ n hum >> hum) 1
whoaT n = runStateT (replicateM_ n humT >> humT) 1
whoaT2 n = runStateT (replicateM_ n humT2 >> humT2) 1
yum :: State Int Int yum = do k <- get w <- oy put (k+2) return w return (k+1)
yumT :: Monad m => StateT Int m Int yumT = do k <- get w <- oyT put (k+2) return w return (k+1)
hoha n = runState (replicateM_ n yum >> yum) 1
hohaT n = runStateT (replicateM_ n yumT >> yumT) 1
oops m = runState m 1 ----------------------------------------------------------------------
What happens with
whoa 10 hoha 10 oops (whoaT 10) oops (whoaT2 10) oops (hohaT 10)
respectively when the Lazy or Strict library is imported? Answer first, then test whether you were right.
OK, I had a think about this - I'm not 100% clear but: UhOh - OK for lazy, Bad for Strict. "undefined" 'could' be of the form (a,s) so the lazy accepts it, but the strict version tries to produce (a,s) out of undefined and fails. Oy - Both are OK here. The pair form is retained and neither will go as far as to analyse the contents of either element of the pair, as neither is used. UhOhT - OK for lazy, Bad for Strict. Same as Oh UhOh, but as we have transformer we return inside a Monad. UhOhT2 - Bad for both - transformers should return a Monad. OyT - Same as Oy, but returned inside a monad. The thing which confuses me is why we care about these functions at all hum, yum, etc. Although these inspect the State Monads above they stick the values in to 'w' which is never used (I think), because the first return statement just produces "M w" which is not returned because of the return (k+1) afterwards?? Because lazy and strict are only separated by the laziness on the bind between contiguous hum and yum states, I would have thought that laziness on w would have been the same on both. Hmmm. But I suppose each call to hum and yum is increment stating in it's corresponding UhOh and Oy function. Thus causing these to be strictly evaluated one level deeper.... In which case I do understand. We have: hum >> hum >> hum ..... And At each stage we are also doing UhOh >> UhOh >> UhOh inside the hums? Is this right, I'm not so sure? I'm in danger of going a bit cross-eyed here!
This means that each new (value,state) is just passed around as a thunk and not even evaluated to the point where a pair is constructed - it's just a blob, and could be anything as far as haskell is concerned.
Not quite anything, it must have the correct type, but whether it's _|_, (_|_,_|_), (a,_|_), (_|_,s) or (a,s) (where a and s denote non-_|_ elements of the respective types), the (>>=) doesn't care. Whether any evaluation occurs is up to (>>=)'s arguments.
By correct type you mean that it must *feasibly* be a pair... But the lazy pattern matching doesn't verify that it *is* a pair. Thus if we returned something that could never be a pair, it will fail to compile, but if it is of the form X or (X,X) it won't check any further than that, but if it was say [X] that wouldn't work even for lazy - haskell doesn't trust us that much!?
It follows that each new state cannot evaluated even if we make newStockSum strict as (by adding a bang) because the state tuple newStockSum is wrapped in is completely unevaluated - so even if newStockSum is evaluated INSIDE this blob, haskell will still keep the whole chain.
Well, even with the bang, newStockSum will only be evaluated if somebody looks at what mc delivers. In the Strict case, (>>=) does, so newStockSum is evaluated at each step.
When you say 'looks' at it do you mean it is the final print state on the result that ultimately causes the newStockSum to be evaluated in the lazy version? Thus we are saying we evaluate it only because we know it is needed. However in the strict case, the fact that newStockSum is used to evaluate the NEXT newStockSum in the subsequent state (called via the bind) is enough to force evaluation, even if the result of the subsequent state is not used?
In the Lazy case, (>>=) doesn't, replicateM_ doesn't, so newStockSum won't be evaluated inside the blob, if it were, it would force the evaluation of the previous pair and almost everything else, then there would have been no problem. What the bang does in the lazy case is to keep the thunk for the evaluation of the states a little smaller and simpler, so the evaluation is a bit faster and uses less memory, but not much (further strictness elsewhere helps, too, as you've investigated).
So in the lazy state the bang will evaluate things that are local to THIS state calculation, but it won't force evaluation of previous states. Thus expression remaining could be simplified as far as possible without requiring the previous MonteCarlo state or the previous BoxMuller state.

Am Dienstag, 3. März 2009 23:28 schrieb Phil:
I've had a look at your example - it's raised yet more questions in my mind!
On 02/03/2009 23:36, "Daniel Fischer"
wrote: A stupid example: ---------------------------------------------------------------------- module UhOh where
import Control.Monad import Control.Monad.State.Lazy --import Control.Monad.State.Strict
uhOh :: State s () uhOh = State $ \_ -> undefined
uhOhT :: Monad m => StateT s m () uhOhT = StateT $ \_ -> return undefined
uhOhT2 :: Monad m => StateT s m () uhOhT2 = StateT $ \_ -> undefined
oy :: State s () oy = State $ \_ -> ((),undefined)
oyT :: Monad m => StateT s m () oyT = StateT $ \_ -> return ((),undefined)
hum :: State Int Int hum = do k <- get w <- uhOh put (k+2) return w return (k+1)
humT :: Monad m => StateT Int m Int humT = do k <- get w <- uhOhT put (k+2) return w return (k+1)
humT2 :: Monad m => StateT Int m Int humT2 = do k <- get w <- uhOhT2 put (k+2) return w return (k+1)
whoa n = runState (replicateM_ n hum >> hum) 1
whoaT n = runStateT (replicateM_ n humT >> humT) 1
whoaT2 n = runStateT (replicateM_ n humT2 >> humT2) 1
yum :: State Int Int yum = do k <- get w <- oy put (k+2) return w return (k+1)
yumT :: Monad m => StateT Int m Int yumT = do k <- get w <- oyT put (k+2) return w return (k+1)
hoha n = runState (replicateM_ n yum >> yum) 1
hohaT n = runStateT (replicateM_ n yumT >> yumT) 1
oops m = runState m 1 ----------------------------------------------------------------------
What happens with
whoa 10 hoha 10 oops (whoaT 10) oops (whoaT2 10) oops (hohaT 10)
respectively when the Lazy or Strict library is imported? Answer first, then test whether you were right.
OK, I had a think about this - I'm not 100% clear but:
UhOh - OK for lazy, Bad for Strict. "undefined" 'could' be of the form (a,s) so the lazy accepts it, but the strict version tries to produce (a,s) out of undefined and fails.
Correct.
Oy - Both are OK here. The pair form is retained and neither will go as far as to analyse the contents of either element of the pair, as neither is used.
Correct.
UhOhT - OK for lazy, Bad for Strict. Same as Oh UhOh, but as we have transformer we return inside a Monad.
Correct.
UhOhT2 - Bad for both - transformers should return a Monad.
Mostly correct, but, using Lazy: *UhOh> oops (whoaT2 10) ((22,23),*** Exception: Prelude.undefined *UhOh> evalState (runStateT humT2 4) 0 (5,6) *UhOh> let putI :: Int -> State Int (); putI = put *UhOh> let see = do { k <- get; w <- uhOhT2; put (k+2); lift (putI 4); return w; return (k+1) } *UhOh> oops $ runStateT see 3 ((4,5),4) So if the inner monad is lazy enough, it can happily pass over the undefined. Let's look at what happens if we bind uhOhT2 to f :: () -> StateT s m b (where m is some monad). uhOhT2 >>= f = StateT $ \so -> do ~(a,so') <- runStateT uhOhT2 so runStateT (f a) so' = StateT $ \so -> do ~(ao,so') <- undefined runStateT (f a) so' = StateT $ \so -> undefined >>= \x -> let (a,so') = x in runStateT (f a) so' Now if f doesn't inspect a and runStateT (f a) doesn't inspect so', the only one left to raise an objection is the (>>=) of the inner monad m (let bindings are lazy, so the let (a,so') = x won't fail on undefined, only if f needs to analyse a or runStateT (f a) needs to analyse so' will failure occur). If m is Maybe or [], the inner bind fails and so everything fails. But if m is a lazy State monad, we see that undefined >>= g = State $ \si -> let (b, si') = runState undefined si in runState (g b) si' and only if g needs to analyse b or runState (g b) needs to analyse si' we will fail. When g is \x -> let (a,so') = x in runStateT (f a) so', it only needs to analyse b if f needs to analyse a or runStateT (f a) needs to analyse so'. Since I've avoided that, the put repairs the outer state (and in 'see', the 'lift $ putI 4' also repairs the inner state) and no harm's done.
OyT - Same as Oy, but returned inside a monad.
Sure.
The thing which confuses me is why we care about these functions at all hum, yum, etc. Although these inspect the State Monads above they stick the values in to 'w' which is never used (I think), because the first return statement just produces "M w" which is not returned because of the return (k+1) afterwards??
Yes, the w is never really used, makes no difference at all.
Because lazy and strict are only separated by the laziness on the bind between contiguous hum and yum states, I would have thought that laziness on w would have been the same on both.
Yes.
Hmmm. But I suppose each call to hum and yum is increment stating in it's corresponding UhOh and Oy function. Thus causing these to be strictly evaluated one level deeper.... In which case I do understand.
I'm sorry, I don't understand the above :( In oy(T), we do something moderately bad, we inject an undefined into the state. But we fix it before anybody had a chance to see it in yum(T) by immediately following it by a put. It's a bit like map (const 1) $ map (const undefined) list , the intermediate result would be harmful, but it doesn't persist. In uhOh(T), we do something bad, we let the complete (value,state) pair be undefined. If nobody looks whether it's a real pair before we fix it by the put, again no harm is done. The strict state monad checks if it's a real pair (whatever, whatever else), finds that it isn't and bombs out. The lazy state monad says "whatever, I'll assume it's okay until I need to inspect it". It never needs to inspect it, so it's ignored. In uhOhT2, we do something even worse, we let the inner-monadic value be undefined. The strict state transformer monad says "give me a pair", undefined can't, bomb out. The lazy state transformer monad says "if the inner monad's bind asks you for a pair, hand it over, please" and passes the undefined to the inner monad's bind without inspecting it.
We have:
hum >> hum >> hum .....
And At each stage we are also doing UhOh >> UhOh >> UhOh inside the hums?
Is this right, I'm not so sure? I'm in danger of going a bit cross-eyed here!
We're doing an uhOh sandwiched between a get and a put in each hum, so it's rather get >> uhOh >> put x >> get >> uhOh >> ... Now the point is, in the lazy monad we have (uhOh >> put x) === put x, but not in the strict monad.
This means that each new (value,state) is just passed around as a thunk and not even evaluated to the point where a pair is constructed - it's just a blob, and could be anything as far as haskell is concerned.
Not quite anything, it must have the correct type, but whether it's _|_, (_|_,_|_), (a,_|_), (_|_,s) or (a,s) (where a and s denote non-_|_ elements of the respective types), the (>>=) doesn't care. Whether any evaluation occurs is up to (>>=)'s arguments.
By correct type you mean that it must *feasibly* be a pair... But the lazy pattern matching doesn't verify that it *is* a pair. Thus if we returned something that could never be a pair, it will fail to compile,
Yes, it may have type forall a. a, or it may have type forall a b. (a,b), or it may have a more restricted pair type. If the bound function k has type a -> State s b , the type of the thing must be unifyable with (a,s). If it's not, the code won't compile. If it is, the lazy pattern matching will not even verify that the thing exists (cf. uhOhT2).
but if it is of the form X or (X,X) it won't check any further than that, but if it was say [X] that wouldn't work even for lazy - haskell doesn't trust us that much!?
It follows that each new state cannot evaluated even if we make newStockSum strict as (by adding a bang) because the state tuple newStockSum is wrapped in is completely unevaluated - so even if newStockSum is evaluated INSIDE this blob, haskell will still keep the whole chain.
Well, even with the bang, newStockSum will only be evaluated if somebody looks at what mc delivers. In the Strict case, (>>=) does, so newStockSum is evaluated at each step.
When you say 'looks' at it do you mean it is the final print state on the result that ultimately causes the newStockSum to be evaluated in the lazy version?
In this case, yes. In principle, you could have something that forces the evaluation before, e.g. if you replace replicateM_ with a stricter version, replicateM'_ :: (Monad m) => Int -> m a -> m () replicateM'_ k a = sequence'_ (replicate k a) sequence'_ :: (Monad m) => [m a] -> m () sequence'_ (x:xs) = do !a <- x sequence'_ xs sequence'_ [] = return () , the sequence'_ inspects the result () of mc. With the bang on newStockSum, this also forces the evaluation of that, even with Control.Monad.State.Lazy. In the few test I ran, the combination State.Lazy and replicateM'_ was about 6% slower and allocated ~8% more than State.Strict and replicateM_. If you leave off the bang on newStockSum, replicateM'_ doesn't help State.Lazy (perhaps a tiny little bit).
Thus we are saying we evaluate it only because we know it is needed.
That's the point of lazy evaluation. And by using strictness annotations in the right places, we help the compiler because we may know that something will be needed although the compiler can't ascertain it alone.
However in the strict case, the fact that newStockSum is used to evaluate the NEXT newStockSum in the subsequent state (called via the bind) is enough to force evaluation, even if the result of the subsequent state is not used?
With the bang, yes.
In the Lazy case, (>>=) doesn't, replicateM_ doesn't, so newStockSum won't be evaluated inside the blob, if it were, it would force the evaluation of the previous pair and almost everything else, then there would have been no problem. What the bang does in the lazy case is to keep the thunk for the evaluation of the states a little smaller and simpler, so the evaluation is a bit faster and uses less memory, but not much (further strictness elsewhere helps, too, as you've investigated).
So in the lazy state the bang will evaluate things that are local to THIS state calculation, but it won't force evaluation of previous states. Thus expression remaining could be simplified as far as possible without requiring the previous MonteCarlo state or the previous BoxMuller state.
In the lazy state, the bang will cause evaluation of newStockSum when somebody says "hand me a pair, not a blob". Then mc says "Okay, a pair. Now what do I put in the pair? Let me see. Ah, the first component is (), no sweat. And the second component is newStockSum - oh, that's strict, so I have to evaluate it before I can put it into the pair." And thus everything else is forced and happiness ensues :) Cheers, Daniel

Phil wrote:
Again I understand that foldl' is the strict version of foldl, and as we are summing elements we can use either foldl or foldr. I'm assuming this is another thunk optimisation. Does foldl not actually calculate the sum, but moreover it creates an expression of the form a+b+c+d+e+.... Where foldl' will actually evaluate the expression to an atomic number?
It's two things. The first thing is that foldl/foldl' is tail-recursive which can be optimized into a loop in the assembly, and ensures that the fold won't stack overflow (though that says nothing about what the fold builds). The second is that foldl' is strict. That is, rather than building up a large thunk at all, at each step of the recursion we force the accumulator. This is essential to prevent stack overflows when evaluating the accumulator for atomic types like Int or Double. The same number of evaluations happen, but they're ordered differently so they can be amortized across the recursion, rather than being demanded all at once (which can exceed the resources of our non-infinite computers). For non-atomic types the added strictness is less essential. The pyramid shape of http://www.beadling.co.uk/mc2_3stacks.pdf is a canonical image of not letting go of memory incrementally, which is often the result of being too lazy. You see the same shape with, say, building up a very long list, but holding onto the first element (which prevents the GC from cleaning up behind you as you traverse the list). Which is the same as building up a very large computation, but holding off on evaluating it. The image of http://www.beadling.co.uk/mc2_2stacks.pdf looks jittery, but if we zoomed out a bit we'd see that it's a flat line, which means we're freeing memory as quickly as we're allocating it, rather than building up anything large in memory. Ideally, all programs should look like this. -- Live well, ~wren

Phil wrote: | After some googling it looked like the answer may be Monad Transformers. | Specifically we could add a StateT transform for our Box Muller state to our | VanDerCorput State Monad. | Google didn¹t yield a direct answer here so I¹m not even sure if my | thinking is correct, Ignoring Daniel Fischer's astute observation that you can generalize the idea to directly describe the stream ;) The sample code you're looking for is:
{-# LANGUAGE GeneralizedNewtypeDeriving #-} import Control.Monad.State
newtype VanDerCorput a = VDC (State Int a) deriving Monad
runVDC :: VanDerCorput a -> a runVDC (VDC sa) = evalState sa 1
getVDC :: VanDerCorput Int getVDC = VDC $ do x <- get put (f x) return x where f = (1+)
newtype BoxMuller a = BM (StateT (Maybe Int) VanDerCorput a) deriving Monad
runBM :: BoxMuller a -> a runBM (BM vsa) = runVDC (evalStateT vsa Nothing)
getBM :: BoxMuller Int getBM = BM $ do saved <- get case saved of Just x -> put Nothing >> return x Nothing -> do a <- lift getVDC b <- lift getVDC put (Just (f a b)) return (g a b) where -- or whatever... f = const g = const id
-- Live well, ~wren
participants (4)
-
Andrew Wagner
-
Daniel Fischer
-
Phil
-
wren ng thornton