state updates and efficiency

I am writing some code with complex nested state. I have a question about performance with respect to the State monad and the Reader monad. This is somewhat long, so a quick summary of the question up front: Can the compiler optimize out state updates that dont change the state? The question inline with the code:
module Test where import Control.Monad.State import Control.Monad.Reader
I have some nested state. This is a simplified example where we have one record inside another. In my real-world example there is more nesting and there are lists and maps involved as well.
data T1 = T1 { f1 :: Int, f2 :: T2 } deriving(Show) data T2 = T2 { f3 :: Int, f4 :: Int } deriving(Show)
I want to build generic modifiers and reuse them often. A good example is modifying a numeric value:
adjNum :: (Num a) => a -> State a () adjNum n = modify (+ n)
I'm going to be writing state code for my T1 structure which is my master structure. If I'm going to be able to reuse adjNum I am going to have to run a nested state action inside an enclosing state monad. I can build a lifter that does this as long as I know how to extract the nested state and set it back in the enclosing state:
withInnerM :: (o -> i) -> (i -> o -> o) -> State i a -> State o a withInnerM gettor settor act = do outer <- get let inner = gettor outer (ret, inner') = runState act inner outer' = settor inner' outer put outer' return ret
Now we can make lifters for each of the fields:
withF1M = withInnerM f1 (\f r -> r {f1=f}) withF2M = withInnerM f2 (\f r -> r {f2=f}) withF3M = withInnerM f3 (\f r -> r {f3=f}) withF4M = withInnerM f4 (\f r -> r {f4=f})
which lets us write some state code for T1 using building blocks like adjNum. For example, the following code will add a value to f1, add another value to f2's f3 and finally return the value of f2's f4:
tweakT1 :: Int -> Int -> State T1 Int tweakT1 v1 v3 = do withF1M $ adjNum v1 withF2M $ withF3M $ adjNum v3 withF2M $ withF4M $ get
My question here has to do with efficiency. In order to update f2's f3 a new T2 had to be constructed and used to construct a new T1. There's no way around this (I assume). But when doing a mere get of f4, there's no reason why we should have to build a new T2 and then a new T1 since nothign changed. But that's how the code is written. We could write a different lifter that does not perform a state put on the return path. If we did this with the state monad then someone could accidentally use that non-modify version of the lifter and lose an update. So instead I chose to use the Reader monad and let the type system enforce the difference between lifters that modify (M) and those that just read (R). This involved two kinds of lifters. The first lifter runs a Reader action inside of a State monad:
withRead :: Reader s a -> State s a withRead act = do s <- get return $ runReader act s
The second lifter runs a nested Reader action inside of an enclosing Reader monad. It is similar in spirit to withInnerM:
withInnerR :: (o -> i) -> Reader i a -> Reader o a withInnerR gettor act = do i <- asks gettor return $ runReader act i
again we can generate lifters for each field:
withF1R = withInnerR f1 withF2R = withInnerR f2 withF3R = withInnerR f3 withF4R = withInnerR f4
And finally we can use a reader monad to eliminate the extra state updates from our previous tweakT1 implementation:
tweakT1' :: Int -> Int -> State T1 Int tweakT1' v1 v3 = do withF1M $ adjNum v1 withF2M $ withF3M $ adjNum v3 withRead $ withF2R $ withF4R $ ask
main = do let x = T1 1 (T2 3 4) print $ runState (tweakT1 5 6) x print $ runState (tweakT1' 5 6) x
My question here is: is it worth it? If I run
v1 = T2 1 1 v2 = v1 {f3 = 1}
is the compiler smart enough to notice that the record update doesn't result in a change, and avoid constructing an entirely new T2? If so, then I think the original implementation TweakT1 would be about as efficient as the more complicated TweakT1'. Otherwise, I think the latter would be a lot more efficient when the state is large and complex. Tim Newsham http://www.thenewsh.com/~newsham/

I'll make a random comment... Tim Newsham wrote:
I am writing some code with complex nested state. I have a question about performance with respect to the State monad and the Reader monad.
A side comment is this code:
instance MonadReader s (State s) where ask = get
local f m = fmap (fst . runState m . f) get
or
local f m = do s <- get put (f s) a <- m put s return a
With an instance like this you can mix in generic MonadReader code. Feel free to replace State with your own instance of MonadState.
This is somewhat long, so a quick summary of the question up front: Can the compiler optimize out state updates that dont change the state?
I am not the compiler expert, but I think this is unlikely. For GHC you should compile with optimizations and dump the Core code to see if it does what you want. (-ddump-simp option I think).
The question inline with the code:
module Test where import Control.Monad.State import Control.Monad.Reader
I have some nested state. This is a simplified example where we have one record inside another. In my real-world example there is more nesting and there are lists and maps involved as well.
data T1 = T1 { f1 :: Int, f2 :: T2 } deriving(Show) data T2 = T2 { f3 :: Int, f4 :: Int } deriving(Show)
I want to build generic modifiers and reuse them often. A good example is modifying a numeric value:
adjNum :: (Num a) => a -> State a () adjNum n = modify (+ n)
This is not much shorthand. But by way of example it is fine.
I'm going to be writing state code for my T1 structure which is my master structure. If I'm going to be able to reuse adjNum I am
^^^^^^^^^^^^^ This is a slightly odd engineering goal.
going to have to run a nested state action inside an enclosing ^^^^^^^^^^^^^^^^^^^^^^^^^ You want performance but this pushes more work to the compiler. If this were all functional instead of Monadic it might be simpler to start with. state monad. I can build a lifter that does this as long as I know how to extract the nested state and set it back in the enclosing state:
withInnerM :: (o -> i) -> (i -> o -> o) -> State i a -> State o a withInnerM gettor settor act = do outer <- get let inner = gettor outer (ret, inner') = runState act inner outer' = settor inner' outer put outer' return ret
Now we can make lifters for each of the fields:
withF1M = withInnerM f1 (\f r -> r {f1=f}) withF2M = withInnerM f2 (\f r -> r {f2=f}) withF3M = withInnerM f3 (\f r -> r {f3=f}) withF4M = withInnerM f4 (\f r -> r {f4=f})
Main *main comment* is to separate manipulating the complex data structure from the State commands. And to make it more abstract: get1,get3,get4 :: T1 -> Int get2 :: T1 -> T2 get1 = f1 get2 = f2 get3 = f3 . get2 get4 = f4 . get2 put1,put3,put4 :: Int -> T1 -> T1 put2 :: T2 -> T1 -> T1 put1 x o = o {f1=x} put2 x o = o {f2=x} put3 x o = mod2 (\o2 -> o2 {f3=x}) o put4 x o = mod2 (\o2 -> o2 {f4=x}) o mod1,mod3,mod4 :: (Int->Int) -> T1 -> T1 mod2 :: (T2->T2) -> T1 -> T1 mod1 f o = put1 (f (get1 o)) o mod2 f o = put2 (f (get2 o)) o mod3 f o = put3 (f (get3 o)) o mod4 f o = put4 (f (get4 o)) o And note the different but important design choice: You don't need to know how the data is nested to access a field. If you insert a T1'and'a'half data field "between" T1 and T2 then you just need to update get2/put2 to fix (get|put|mod)(3|4) and this also fixes all the code that uses any of these functions.
which lets us write some state code for T1 using building blocks like adjNum. For example, the following code will add a value to f1, add another value to f2's f3 and finally return the value of f2's f4:
tweakT1 :: Int -> Int -> State T1 Int tweakT1 v1 v3 = do withF1M $ adjNum v1 withF2M $ withF3M $ adjNum v3 withF2M $ withF4M $ get
The above would break if you added T1'and'a'half since it needs to know the structure of the data. This is why withF3M is not a good abstraction. Now for my version of tweakT1: -- My choice is to use a strict modify that also returns the new value modify' :: (MonadState a m) => (a -> a) -> m a modify' f = do x <- liftM f get put $! x return x -- Now tweakT1 can be a one-liner or longer tweakT1,tweakT1'long :: (MonadState T1 f) => Int -> Int -> f Int tweakT1 v1 v3 = liftM get4 (modify' (mod1 (+ v1) . mod3 (+ v3))) tweakT1'long v1 v3 = do modify (mod1 (+ v1)) modify (mod3 (+ v3)) liftM get4 get If you want something like adjNum, how about adjNum' :: (Num a) => a -> ((a->a) -> s->s) -> State s a adjNum' x mod = modify' (mod (+ x)) tweakT1'adj v1 v3 = do adjNum' v1 mod1 adjNum' v2 mod3 liftM get4 get The compiler may or may not optimize the mod1 and mod3 into one T1 construction instead of two. If you modify parts 1 and 3 of the state in tandem a lot then mod13 g1 g3 o@(T1 {f1=v1,f2=v2@(T2 {f3=v3})}) = o {f1=g1 v1,f2=v2 {f3=g3 v3}} will certainly avoid reconstructing T1 twice.
My question here has to do with efficiency. In order to update f2's f3 a new T2 had to be constructed and used to construct a new T1. There's no way around this (I assume).
True. The number of T1's and T2's constructed is the issue. Reading the -dump-simpl Core text is the best way to check. Using {-# INLINE ... #-} could help.
But when doing a mere get of f4, there's no reason why we should have to build a new T2 and then a new T1 since nothign changed. But that's how the code is written.
The "run a nested state action" decision combined with always building on withInnerM with calls put is the culprit.
We could write a different lifter that does not perform a state put on the return path. If we did this with the state monad then someone could accidentally use that non-modify version of the lifter and lose an update. So instead I chose to use the Reader monad and let the type system enforce the difference between lifters that modify (M) and those that just read (R). This involved two kinds of lifters.
The first lifter runs a Reader action inside of a State monad:
withRead :: Reader s a -> State s a withRead act = do s <- get return $ runReader act s
That type signature is withReader :: Reader T1 a -> State T1 a withReader act = liftM (runReader act) ask -- or withReader act = liftM (runReader act) get but this is not directly helpful. Running nested monads is not the best course.
The second lifter runs a nested Reader action inside of an enclosing Reader monad. It is similar in spirit to withInnerM:
withInnerR :: (o -> i) -> Reader i a -> Reader o a withInnerR gettor act = do i <- asks gettor return $ runReader act i
again we can generate lifters for each field:
withF1R = withInnerR f1 withF2R = withInnerR f2 withF3R = withInnerR f3 withF4R = withInnerR f4
And finally we can use a reader monad to eliminate the extra state updates from our previous tweakT1 implementation:
tweakT1' :: Int -> Int -> State T1 Int tweakT1' v1 v3 = do withF1M $ adjNum v1 withF2M $ withF3M $ adjNum v3 withRead $ withF2R $ withF4R $ ask
And that was a very roundabout way to derive liftM get4 == withRead . withF2R . withF4R
main = do let x = T1 1 (T2 3 4) print $ runState (tweakT1 5 6) x print $ runState (tweakT1' 5 6) x
My question here is: is it worth it?
If I run
v1 = T2 1 1 v2 = v1 {f3 = 1}
is the compiler smart enough to notice that the record update doesn't result in a change, and avoid constructing an entirely new T2? If so, then I think the original implementation TweakT1 would be about as efficient as the more complicated TweakT1'. Otherwise, I think the latter would be a lot more efficient when the state is large and complex.
Tim Newsham http://www.thenewsh.com/~newsham/ _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

I'll make a random comment...
Thanks for the comments!
instance MonadReader s (State s) where ask = get local f m = fmap (fst . runState m . f) get
ahh, hadn't thought of doing that.
You want performance but this pushes more work to the compiler. If this were all functional instead of Monadic it might be simpler to start with.
My code has a lot of values in each record, maps of records, records with lists in them, etc. I actually wrote some of the code without using the state monad to start. It was a little complicated. Rewriting it with the state monad made it more modular and shorter (if you dont count all of the lifters I had to write).
withF1M = withInnerM f1 (\f r -> r {f1=f})
Main *main comment* is to separate manipulating the complex data structure from the State commands. And to make it more abstract:
get1,get3,get4 :: T1 -> Int get2 :: T1 -> T2 get1 = f1
And note the different but important design choice: You don't need to know how the data is nested to access a field.
My code doesnt only operate at the top-most layer. There are some complex state transformations that logically occur on one structure thats nested down several layers. So those operations are most naturally written with a State monad on that object. Other operations are written at a higher layer but may still need access to some lower layer state... So, if I were to follow your suggestion, I would have to make several variations of the accessors/modifiers depending on what level I were to access them from. One reason I went with the lifters was to avoid any blow-up in the number of accessors I had to write (and in hopes that later I can auto-generate them).
If you insert a T1'and'a'half data field "between" T1 and T2 then you just need to update get2/put2 to fix (get|put|mod)(3|4) and this also fixes all the code that uses any of these functions.
Yes. This is a downside. I do have some abstractions that try to hide this occasionally...
-- My choice is to use a strict modify that also returns the new value modify' :: (MonadState a m) => (a -> a) -> m a modify' f = do x <- liftM f get put $! x return x
-- Now tweakT1 can be a one-liner or longer tweakT1,tweakT1'long :: (MonadState T1 f) => Int -> Int -> f Int tweakT1 v1 v3 = liftM get4 (modify' (mod1 (+ v1) . mod3 (+ v3)))
I like it.
The compiler may or may not optimize the mod1 and mod3 into one T1 construction instead of two. If you modify parts 1 and 3 of the state in tandem a lot then
mod13 g1 g3 o@(T1 {f1=v1,f2=v2@(T2 {f3=v3})}) = o {f1=g1 v1,f2=v2 {f3=g3 v3}}
not the prettiest function in the bunch. I can see maybe using something like that if profiling said I really really needed it...
withRead $ withF2R $ withF4R $ ask And that was a very roundabout way to derive liftM get4 == withRead . withF2R . withF4R
true, but its composable and usable at all layers. I could have written "withRead $ withF4R $ ask" in a State T2 a monad or "withRead $ withXXX $ withYYY $ withF2R $ withF4R $ ask" at a much higher level... Thanks for your comments. I need to think over my code to figure out if it makes sense to hide the nestings (I suspect the answer is "sometimes"), but I think I can definitely do that in some cases. I'm starting to think that it might make sense for me to eliminate some of the nesting using cross-references (something simple like an integer ID with maps mapping IDs to objects at the top level). Then perhaps most of my state manipulations will occur at the top level (among having other benefits)... Starting to feel more DB-like. Tim Newsham http://www.thenewsh.com/~newsham/
participants (2)
-
Chris Kuklewicz
-
Tim Newsham