
Hello Andrea, Sunday, October 22, 2006, 6:06:24 PM, you wrote:
f a b = let x = a*b y = a+b in x `seq` y `seq` (x,y)
this f definition will not evaluate x and y automatically. BUT its returned value is not (x,y). its returned value is x `seq` y `seq` (x,y) and when further computation try to use it in any way, it can't put hands on the pair before it will evaluate x and y values. are you understand?
Yes, I do understand. But, as far as I know, "seq" will just evaluate x and y enough to see if they are not bottom. So, if x and y are a deep data structure, they won't be evaluated entirely, right?
x (and y) should be not considered here as data value! it's an expression which should be evaluated to bottom or non-bottom, i.e. some data constructor. but what if expression of x (i.e. thunk) by itself contains 'seq' call? say let x = 1/0 `seq` 1 y = x `seq` 2 print y try it ;) it will require to evaluate y value that requires to evaluate x that requires to eval 1/0! so when we construct some value we may ensure that all its parts will be evaluated once some part will be requested: x `seq` y `seq` T x y when you construct data values with seq recursively, this will lead to that whole deep datastructure will be evaluated once any part of it will be requested i don't propose you to do exactly this, just explain how this works and why works
That is to say:
to get real advantage, you need to build your value sequentially in monad and force evaluation of each step results:
main = do let x = f 1 return $! x let y = f 2 return $! y let z = f 3 return $! z let a = T x y z ..
...
setState ns = modify (\s -> s {mystate = ns})
here you modify state, but don't ensure that string list is evaluated on both levels. well, it will be ok if you ensure evaluation at _each_ call to this function. alternatively, you can force evaluation before assignment by:
setState ns = do return $! map length ns modify (\s -> s {mystate = ns})
this is the crucial point. You are forcing evaluation with $! map length, I'm doing it with writeFile. I do not see very much difference. That's an ad hoc solution. Since I need to write the state, instead of (needlessly) looking for the length of it's members, I write it...;-)
yes, it's the same. _right_ way is to use speciall deepSeq function that is like 'seq' but does full evaluation. unfortunately, haskell/ghc don't give us ready-to-use implementation. equality testing and other tricks are just the ways to make full evaluation of this datastructure and they are unnecessary if you really consume it all
By the way, the problem is not ns, but s, the old state. The new state has been evaluated by the code below (when we display the number of lines). So you need to change your code with:
do s <- getState modify (\s -> s {mystate = ns})
otherwise you are going to have the same space leak as my original code had. In other word, it is possible to have a user who keeps on loading files without looking at any line (in this case getState is never called and so there is no "return $! mystate s"!).
this '$!' only suppress one small thunk that contains mystate call on MyState: mystate (MyState s) => s. it don't evaluates anything else, so adding s<-getState will not change anything. if you want to do full evaluatin, you should add the same 'return $! map length state' to the getState, but instead you can add this just to setState :))) so, you don't need to use 'map length' in getState. you don't need to call getState in setState. and you may omit 'map length' call in setState if you will ensure that all calls to it fully evaluate this list. for beginning, in you non-toy code, you may add this call and see that will happen next, this line
modify (\s -> s {mystate = ns})
really produce unevaluated thunk. monad state will point to expression 's {mystate = ns}'. it's better to represent in functional style: setMyState s ns. as you can see, this thunk points to the old value of s. while you (and me) know that this value is really unused, RTS (GHC run-time system) don't know it! as a result, all old file contents are hold in memory until you evaluate this thunk :))) instead of 'modify' you should use the following sequence: s <- get put $! s {mystate = ns} it's my fault - i don't seen this memory leak until we digged into each step of evaluation :) you can run performGC at this moment to ensure that memory will never contain more than two file's contents. if you want to hold no more that one contents, use the following: setState [] performGC s <- getContents setState (lines s) although, i also suggest to add performGC after each substantial step (reading, conversion) and then try to remove one after one and see results
This produces the same as: setState ns = do s <- getState -- that does: return $ mystate s liftIO $ writeFile "/dev/null" s modify (\s -> s {mystate = ns})
as i just understood, the old 's' value is held in memory even after modify, so evaluating it really allow to reduce memory usage :) of course, using put$! should be even better
if StateT is strict monad transformer, this code don't have space leaks. you create thunks in two places, in one place you already evaluate it, and i wrote what to do in second place.
Yes, indeed. But just because we forced evaluation in a way that `seq` cannot do. Am I right?
yes, the strict evaluation order of this monad gives us possibility to evaluate thunks just at some steps of execution instead of linking evaluation to requests of data inside of some structure. strict monad gives us strict context on which we can rely
Thanks a lot for you patience. You did help me a lot. And this is not the first time. I appreciate. Really.
i like to teach :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com