
Hello Andrea, Sunday, October 22, 2006, 1:37:55 PM, you wrote:
as Udo said, it should be better to evaluate thunks just when they are created, by using proper 'seq' calls.
While I understand why you and Udo are right, it is still difficult for me to relate this discussion to my code. So I wrote a small example that reproduces my problem, in the hope that this will help me understand your point.
Sorry, I will not dig into the details of how it works. I will just explain the general rule: if you compute something, make sure that the returned value will be completely evaluated at any attempt to use it. This is rather close to the problems of doing IO in a lazy environment, described thoroughly in my "IO inside" manual.

For example, this definition:

    f a b = (a*b, a+b)

will return an unevaluated thunk. Even when it is used, it may be computed only partially:

    main = do let x = f 1 2
              print (fst x)

Here print forces evaluation of the first value in the pair, but not the second one. You can improve this by saying "if you need the value of f, you should evaluate its components first":

    f a b = let x = a*b
                y = a+b
            in x `seq` y `seq` (x,y)

This definition of f will not evaluate x and y automatically. BUT its returned value is not (x,y). Its returned value is x `seq` y `seq` (x,y), and when a further computation tries to use it in any way, it can't get its hands on the pair before x and y have been evaluated. Do you understand?

This technique may therefore be used to construct values that will be fully evaluated on any attempt to use _any_ part of such a value. Say:

    f a = let x = ..
              y = ..
              z = ..
          in x `seq` y `seq` z `seq` T x y z

    g = let x = f 1
            y = f 2
            z = f 3
        in x `seq` y `seq` z `seq` S x y z

Here, any attempt to use the value returned by g will force evaluation of two levels of the data structure. This can be repeated again and again.

But writing to a file or comparing does the same. This is the difference: when you write the value to a file, you do all the evaluations at once, but before that moment you have a large data structure full of thunks built up in memory. The seq-ing technique can help when you build your structure step by step in some sequential environment (read: a monad). So, the following:

    main = do let x = g
              return $! (field1 x)

will evaluate the whole value returned by g. But it will be equivalent to

    main = do let x = g''
              return $! x==x

where g'' is the same function without the seqs. To get a real advantage, you need to build your value sequentially in a monad and force evaluation of each step's result:

    main = do let x = f 1
              return $! x
              let y = f 2
              return $! y
              let z = f 3
              return $! z
              let a = T x y z
              ..

So, the seq-ing technique will be better than your current one only in the case when the values built are used to build other parts of the data structure. In this case the additional seqs may force evaluation of a larger part of the data structure than was actually requested.

About your example: this monad is strict, i.e. its operations are sequenced like in the IO monad itself (this monad just carries additional state information between IO operations). What you need to do is make sure that any update/request forces evaluation. Then the strictness of the monad guarantees that evaluation will be performed at each step. This will work both for IO and for your monad:

    do let x = a*b
       return $! x   -- here x is evaluated

The trick is that when execution reaches the 'return $!' point, it needs to strictly evaluate the value of the 'return $! x' expression. This expression, which is equivalent to (x `seq` return x), needs to evaluate x before it can return anything! As a result, 'return $! x' is an executable statement (like putStr, for example) that forces evaluation of x, i.e. it guarantees that x will be evaluated right at this moment, before the next statement in the 'do' block is executed.

So, by inserting 'return $!' statements, you can force evaluation of thunks at given lines of your program, and of course you want to evaluate every thunk just at the moment it is created. Now we need to look into your program and find the places where thunks are created:
data Mystate = Mystate {mystate :: [String]}
First rule of strict evaluation: use newtype instead of data for one-element constructors.

    newtype Mystate = Mystate {mystate :: [String]}

is, for our purposes, equivalent to

    data Mystate = Mystate {mystate :: ![String]}

Note the '!', which makes the field strict. If you can't use newtype, make all fields strict.

Next point: [String] is [[Char]], i.e. a lazy list of lazy lists. You should make sure that both levels of lists are strictly evaluated if you need to avoid thunks.
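To illustrate the newtype / strict field rule above, here is a small sketch. The names LazyBox, StrictBox, NewBox, survives and demo are made up for this illustration and are not part of your program; survives only reports whether forcing a value to weak head normal form raises an error call.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Lazy field: building the value does not touch the field.
data LazyBox = LazyBox { lazyField :: [String] }

-- Strict field: forcing the constructor also forces the field (to WHNF only).
data StrictBox = StrictBox { strictField :: ![String] }

-- newtype: there is no constructor at run time, so forcing the wrapper
-- is the same as forcing the wrapped value.
newtype NewBox = NewBox { newField :: [String] }

-- True if forcing x to WHNF succeeds, False if it hits an `error` call.
survives :: a -> IO Bool
survives x = either failed (const True) <$> try (evaluate x)
  where
    failed :: ErrorCall -> Bool
    failed _ = False

demo :: IO [Bool]
demo = sequence
  [ survives (LazyBox (error "boom"))    -- True: the field is never looked at
  , survives (StrictBox (error "boom"))  -- False: the strict field is forced
  , survives (NewBox (error "boom"))     -- False: the wrapper is the field
  , survives (StrictBox [error "boom"])  -- True: only the first cons cell is forced
  ]
```

The last case shows why strict fields alone are not enough for [String]: the '!' stops at the first constructor of the list and does not look at its elements.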
type SL = StateT Mystate IO
getState :: SL [String]
getState = do
  s <- get
  return $ mystate s
This is the first place where we are not strict enough. This expression will return an _unevaluated_ thunk (mystate s). But you can force its evaluation before returning:
return $! mystate s
So, your code returns something like (mystate (Mystate ["a"])), while my code will evaluate this expression, throwing away the function call, and return the result of the function evaluation: ["a"]. I should emphasize that it will not evaluate any thunks inside the list of strings. But as long as your state is fully evaluated before it is stored, this operation will return a strictly evaluated value. That is how all this works.
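The difference between the two versions of getState can be observed with a small sketch. It assumes the strict StateT from the transformers package (Control.Monad.Trans.State.Strict), which may not be the module your program imports; getStateLazy, getStateStrict and runsCleanly are names invented for the illustration. The state is deliberately filled with a failing thunk, so we can see who touches it.

```haskell
import Control.Exception (ErrorCall, try)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get)

newtype Mystate = Mystate { mystate :: [String] }
type SL = StateT Mystate IO

-- Returns the thunk (mystate s) without evaluating it.
getStateLazy :: SL [String]
getStateLazy = do
  s <- get
  return $ mystate s

-- Evaluates (mystate s) to WHNF before returning it.
getStateStrict :: SL [String]
getStateStrict = do
  s <- get
  return $! mystate s

-- Runs an action against a state whose contents fail when forced.
-- True means the action finished without touching the contents.
runsCleanly :: SL a -> IO Bool
runsCleanly action =
  either failed (const True) <$> try (evalStateT action poisoned)
  where
    poisoned = Mystate (error "state was forced")
    failed :: ErrorCall -> Bool
    failed _ = False
```

With these definitions, runsCleanly getStateLazy should give True (the thunk is handed back untouched) and runsCleanly getStateStrict should give False (the 'return $!' statement forces it on the spot).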
setState ns = modify (\s -> s {mystate = ns})
Here you modify the state, but don't ensure that the string list is evaluated on both levels. Well, it will be OK if you ensure evaluation at _each_ call to this function. Alternatively, you can force evaluation before the assignment:

    setState ns = do return $! sum (map length ns)
                     modify (\s -> s {mystate = ns})

Summing the lengths walks the outer list and the spine of every string, which a bare 'map length ns' would not do under 'return $!'.
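If you want the characters forced as well, a helper can walk both levels explicitly. This is only a sketch: forceStrings and setStateStrict are hypothetical names, and it assumes the strict StateT and modify' from the transformers package rather than whatever your program imports. The deepseq package provides the same kind of full forcing generically (force, $!!), if you prefer a library.

```haskell
import Control.Monad.Trans.State.Strict (StateT, execStateT, modify')

newtype Mystate = Mystate { mystate :: [String] }
type SL = StateT Mystate IO

-- Walk the outer list and every inner string, forcing each character.
forceStrings :: [String] -> ()
forceStrings = foldr (\s rest -> forceChars s `seq` rest) ()
  where
    forceChars = foldr seq ()

-- Fully evaluate the new list before it is stored in the state.
setStateStrict :: [String] -> SL ()
setStateStrict ns = do
  return $! forceStrings ns
  modify' (\s -> s { mystate = ns })
```

For example, execStateT (setStateStrict ["ab", "c"]) (Mystate []) stores a list with no thunks left inside it.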
getFile :: String -> SL ()
getFile p = do
  f <- liftIO $ readFile p
  let lns = lines f
  -- forces evaluation of lns
  liftIO $ putStrLn $ "Number of lines: " ++ show (length lns)
  setState lns
  promptLoop
Here you evaluate only the top level of [String] before assigning it to the state. But for this operation ('lines' on the file contents) that is enough; we don't need to evaluate the individual chars here.
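A tiny sketch of what 'length' does and does not force; spineOnly and lineCount are names made up for the illustration. As far as I can tell, the reason it is enough in getFile is that 'lines' has to inspect every character to find the newlines, so counting the lines already scans the whole input.

```haskell
-- length walks only the spine of the outer list; the elements stay
-- untouched, so the failing first element is never evaluated.
spineOnly :: Int
spineOnly = length [error "never forced", "second"]

-- lines must look at each character to split the text, so taking the
-- length of its result scans the whole input string.
lineCount :: String -> Int
lineCount = length . lines
```

Here spineOnly evaluates to 2 without raising the error, and lineCount "ab\ncd\n" is 2.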
showLine :: Int -> SL ()
showLine nr = do
  s <- getState
  liftIO $ putStrLn $ s !! nr
  promptLoop
promptStr = "lFilename [load the file Filename] - sNr [show the line Nr of Filename] - q to quit"
promptLoop :: SL ()
promptLoop = do
  liftIO $ putStrLn promptStr
  str <- liftIO getLine
  case str of
    ('l':ss) -> getFile ss
    ('s':nr) -> showLine (read nr)
    ('q':[]) -> liftIO $ return ()
you can omit liftIO here, just return () will be ok :)
    _ -> promptLoop
main = evalStateT promptLoop $ Mystate []
If StateT is a strict monad transformer, this code doesn't have space leaks. You create thunks in two places: in one of them you already evaluate the thunk, and I wrote what to do in the second. For other code, you should watch assignments to the state and ensure that the assigned values are fully evaluated.

--
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com