memory, garbage collection and other newbie issues

Hi! I'm a newbie and, as a learning experience, I'm writing a feed reader with hscurses and hxt. For the time being the feed reader just reads a Liferea cache but, as you can imagine, I'm running into the usual newbie problems of memory consumption and garbage collection, probably (I'm not sure) related to strictness/laziness.

Even though I spent a couple of hours searching the mailing list archives, I did not come up with anything I can relate to, so I'll try to explain my problem.

The feed reader, which should be compatible with Liferea, takes an opml (1.0) file that stores information on folders and subscribed feeds. It uses it as the major component of the state of a StateT monad, after adding some attributes used by the reader UI.

The UI, which uses the widget library of hscurses and is derived from the Contact Manager example, just displays this opml file, and every UI event (collapsing/expanding of folders, displaying feeds, tagging, flagging, and so on) is just an XML transformation of this opml state component.

So, when the feed reader boots, only the layout of folders and subscribed feeds is presented to the user.

When the user selects a feed to be displayed, the cached file, containing up to 100 saved posts, is read and transformed into a data type (called Feed, obviously). After that this data type is transformed into an opml (xml) tree, which is inserted as a child in the appropriate place of the opml state component.

Moreover, the parent element of the opml state component (which holds the original information on the subscribed feed) is edited to add general feed information (such as last update, the feed's attributes, and so on) retrieved by reading the file.

When the user collapses the feed, the added opml chunk is deleted from the state component (but not the information added to the parent of this chunk).

Now, I would expect that after the opml chunk is deleted, all the memory allocated for reading the cached file would be garbage collected.
This is not happening: every time you open (or reopen) a feed, the memory used by the feed reader increases, and never decreases.

After profiling I've seen that the problem occurs in the function that reads the cached file:

    loadFeed :: String -> IO [Feed]
    loadFeed id = do
        [a] <- runX $ readDocument [(a_validate, v_0)] (cachePath ++ id)
        return $ runLA toFeed a

What this function does is read the file with:

    h <- openFile ...
    hGetContents h

and apply some XML filters to get the Feed type populated with the needed information.

I tried making the function strict with $!. I tried using fps. It doesn't change this behaviour, obviously.

Now, I know that this is a typical newbie problem: could you give me some direction on how to debug this problem and possibly find a solution? Or just some hints on the kind of problem I'm facing: is it related to strictness/laziness, or is it just that I did not understand a single bit of how garbage collection works in Haskell?

Thanks for your kind attention. If I manage to solve this problem, I'll probably also be able to share the feed reader (I know that you are not going to answer after reading this....;-)

regards,
Andrea

On 10/18/06, Andrea Rossato
Hi!
I'm a newbie and, as a learning experience, I'm writing a feed reader with hscurses and hxt. For the time being the feed reader just reads a Liferea cache but, as you can imagine, I'm running into the usual newbie problems of memory consumption and garbage collection, probably (I'm not sure) related to strictness/laziness.
Even though I spent a couple of hours searching the mailing list archives, I did not come up with anything I can relate to, so I'll try to explain my problem.
The feed reader, which should be compatible with Liferea, takes an opml (1.0) file that stores information on folders and subscribed feeds. It uses it as the major component of the state of a StateT monad, after adding some attributes used by the reader UI.
The UI, that uses the widget library of hscurses and is derived from the Contact Manager example, will just display this opml file, and every UI event (collapsing/expanding of folders, displaying feeds, tagging, flagging, and so on) is just an XML transformation of this opml state component.
So, when the feed reader boots, only the layout of folders and subscribed feeds is presented to the user.
When the user selects a feed to be displayed, the cached file, containing up to 100 saved posts, is read and transformed into a data type (called Feed, obviously). After that this data type is transformed into an opml (xml) tree, which is inserted as a child in the appropriate place of the opml state component.
Moreover, the parent element of the opml state component (which holds the original information on the subscribed feed) is edited to add general feed information (such as last update, the feed's attributes, and so on) retrieved by reading the file.
When the user collapses the feed, the added opml chunk is deleted from the state component (but not the information added to the parent of this chunk).
Now, I would expect that after the opml chunk is deleted all the memory allocated for reading the cached file would be garbage collected. This is not happening, so, every time you open (or reopen) a feed, the used memory of the feed reader increases, and never decreases.
After profiling I've seen that the problem is occurring in the function that reads the cached file:
    loadFeed :: String -> IO [Feed]
    loadFeed id = do
        [a] <- runX $ readDocument [(a_validate, v_0)] (cachePath ++ id)
        return $ runLA toFeed a
What this function does is read the file with:

    h <- openFile ...
    hGetContents h

and apply some XML filters to get the Feed type populated with the needed information.
I tried making the function strict with $!. I tried using fps. It doesn't change this behaviour, obviously.
Have you tried adding strictness annotations to your data type? For example, something like this:

    data Foo a = Foo !a

If you do this for the datatypes you want to get deleted, my understanding is that it will help. I think this helps because the garbage collector will have one less excuse for not cleaning up the values, since they will be forced to exist completely when they are created instead of only existing partially (in which case, if I understand correctly, the garbage collector doesn't throw away partially constructed values).

You said you did profiling; have you done retainer profiling? I haven't used it myself, but I think it is designed to help you identify where the memory is being leaked.

I've been using Haskell on and off for over a year now but still consider myself mostly a newbie as well. I started using Haskell at work recently on a project and I've found that although Haskell makes it so that I don't spend much time debugging (thanks, referential transparency!), testing (thanks, QuickCheck!), or writing code (thanks, Haskell in general!), I do spend a lot of time profiling and optimizing for time/space. I think it's interesting that the development is mostly quick and easy but polishing and making it ready for general use can still be hard because of performance issues. Although, I'd rather spend my time optimizing something that works than debugging pointer problems.

HTH,
Jason
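Jason's strictness annotations can be seen in a small self-contained sketch (the type below is invented for illustration, not taken from Andrea's code):

```haskell
-- With plain (lazy) fields, constructing the value stores unevaluated
-- thunks; with '!' each field is forced to weak head normal form at
-- construction time, so nothing is left half-built.
data ItemLazy   = ItemLazy   String Int
data ItemStrict = ItemStrict !String !Int

mkItem :: String -> ItemStrict
mkItem title = ItemStrict title (length title)
-- 'length title' is evaluated when the ItemStrict is built,
-- not later when the field is first inspected.
```

Note that '!' forces only to weak head normal form: the String field's spine and characters are still lazy, so deeply nested structures need deeper forcing.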

Hello Andrea, Wednesday, October 18, 2006, 9:34:28 PM, you wrote:
solution? Or just some hints on the kind of problem I'm facing: is it related to strictness/laziness, or it's just that I did not understand a single bit of how garbage collection works in Haskell?
I think the second. Unfortunately, I don't know a good introduction to the actual GC implementation, although I can give you a pair of not-so-friendly references:

Generational garbage collection for Haskell
http://research.microsoft.com/~simonpj/Papers/gen-gc-for-haskell.ps.gz

GHC Commentary about GC
http://hackage.haskell.org/trac/ghc/wiki/GarbageCollectorNotes
http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage

Shortly speaking, memory allocated by GHC never shrinks. GHC by default uses a 2-stage memory allocator. First, memory for new objects is allocated in 256 KB chunks (the "nursery", whose size is controlled by the RTS -A option). When these 256 KB are exhausted, a "minor GC" occurs. A minor GC scans the nursery and moves to the global heap all objects that are still alive. Btw, it's a great idea, because a minor GC runs very fast and most objects die at this stage, so they never participate in the slow major GCs.

When all the memory currently allocated for the heap is exhausted, a major GC occurs. It scans all objects in the heap and copies the live ones to new memory blocks, which are allocated during this process. So, for example, if just before a major GC we have a 30 MB heap of which only 10 MB is alive, then during the GC we will allocate 10 MB more, copy the live objects there, and at the end free the original 30 MB. But these 30 MB are not returned to the OS! Instead, they become the free part of the heap, which is then filled with new objects. So, after this major GC we have a 40 MB heap of which 30 MB is free. The next major GC will occur when these 30 MB have been filled by objects that survived minor GCs.

Well, that's the default scenario. With the compacting GC or with -H, -F, -G the scenario changes slightly. You can look at the behaviour of the memory manager using the +RTS -Sfile option. Here is an example:

    Alloc    Collect      Live     GC    GC     TOT    TOT   Page Flts
    bytes    bytes        bytes    user  elap   user   elap
    ...
    262116     266240  119095740  0.00  0.00  23.72  47.33    0    0  (Gen:  0)
    262116     266240  119269488  0.01  0.01  23.73  47.34    0    0  (Gen:  0)
    262116     266240  119443236  0.00  0.00  23.73  47.34    0    0  (Gen:  0)
    262116     266240  119616988  0.01  0.01  23.74  47.35    0    0  (Gen:  0)
    262116  119365632   81473776  1.54  1.66  25.29  49.01    0    0  (Gen:  1)
    262116     266240   81647512  0.00  0.00  25.29  49.01    0    0  (Gen:  0)
    262116     266240   81821260  0.00  0.00  25.29  49.01    0    0  (Gen:  0)
    ...

Each line here reports the stats of one GC. A '1' in the last column means a major GC, the others are minor ones. The third column displays the current heap size. As you can see, after each minor GC the heap size increases by 100-200 KB. This means that of the 256 KB allocated, only 100-200 KB survived to be moved into the main heap. The heap size was 120 MB, and when the heap was filled, a major GC occurred. After it ended, only 81 MB of data had survived, the heap was extended to 120+80=200 MB, and the following minor GCs continued to fill it up. So, each major GC extends the heap (in the absence of additional RTS options) and nothing between major GCs can do it.

Because it was the last GC in this run, its stats defined the stats of the entire program run:

      989,191,980 bytes allocated in the heap
      467,301,272 bytes copied during GC
       81,473,776 bytes maximum residency (9 sample(s))

             3584 collections in generation 0 (  6.33s)
                9 collections in generation 1 (  3.23s)

              194 Mb total memory in use

Here GHC reports that the maximum amount of really used memory was 81 MB (GHC can determine real memory usage only at major GCs), while the memory allocated by the RTS was 200 MB. So the signal/noise ratio is 40% here. Sorry, but this is rather typical. There are various techniques to fight it, but in general any memory allocation scheme involves a lot of vanished memory. At least the GHC GC is efficient enough in terms of time:

    MUT   time   21.73s  ( 45.57s elapsed)
    GC    time    9.56s  (  9.94s elapsed)
    EXIT  time    0.00s  (  0.00s elapsed)
    Total time   31.31s  ( 55.51s elapsed)

    %GC time      30.6%  (17.9% elapsed)

ps: if your program uses a lot of strings, FPS will be a great help. It doesn't change the GC behaviour, it just makes everything 10 times smaller :)

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com
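Bulat's -S walk-through can be reproduced with any allocation-heavy toy program; a minimal sketch (module name and exact flags are illustrative, per the RTS options described above):

```haskell
-- GcDemo.hs: allocate a lot so that minor and major GCs become visible.
-- Compile and run with per-GC statistics printed to stderr:
--   ghc --make GcDemo.hs -o gcdemo
--   ./gcdemo +RTS -Sstderr
module Main where

main :: IO ()
main = print (length (filter even [1 .. 5000000 :: Int]))
-- In the -S output, '(Gen: 0)' lines are minor collections of the
-- nursery; '(Gen: 1)' lines are major collections of the whole heap.
```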

On Thu, Oct 19, 2006 at 06:23:35PM +0400, Bulat Ziganshin wrote:
Hello Andrea,
Wednesday, October 18, 2006, 9:34:28 PM, you wrote:
solution? Or just some hints on the kind of problem I'm facing: is it related to strictness/laziness, or it's just that I did not understand a single bit of how garbage collection works in Haskell?
I think the second. Unfortunately, I don't know a good introduction to the actual GC implementation, although I can give you a pair of not-so-friendly references: [...]
Shortly speaking, memory allocated by GHC never shrinks.
Hello Bulat,

well, you gave me a wonderfully clear introduction to Haskell GC, and now I have a better understanding of the output of the various profilings I'm doing. Thank you very much!

Still, I cannot understand my specific problem, that is to say, why the function that reads a file retains so much memory.

I did some tests and the results are puzzling:

- I tried reading the feed and directly converting it into the opml chunk to be inserted into the opml component of my StateT monad. The problem becomes far worse. Here is the output of a heap profile: http://gorgias.mine.nu/haskell/a.out.feed2opml.ps As you can see, opening one feed (397868 bytes), closing it, opening another one (410052 bytes), closing it, and reopening the first one brings memory consumption to 152 MB.

- Using the intermediate datatype (that is to say, reading the feed, transforming it into my datatype and then into the opml tree) reduces the problem: http://gorgias.mine.nu/haskell/a.out.feed2feedNotStrict.ps Only 92 MB of memory consumption for the very same operations.

- Making the intermediate datatype strict gives almost the same results: http://gorgias.mine.nu/haskell/a.out.feed2feedStrict.ps 98 MB.

Now, I've come to believe the file reading is indeed strict, and that my problem could be related to StateT laziness. Does this make sense?

I'm now going to try to implement my opml state as an IORef and use a ReaderT monad to see if something new happens.
ps: if your program uses a lot of strings, FPS will be a great help. It doesn't change the GC behaviour, it just makes everything 10 times smaller :)
Yes, but I'm using HXT, and it uses normal strings to store XML text nodes. So I could get some improvement in IO, but not that much in memory consumption, unless I totally change my implementation. Anyway, even if I could reduce the memory consumption for reading 2 feeds from 152 to 15 MB, I'd be running out of memory, on my laptop, in one day instead of 5 minutes. Anyway, I should face the fact that it is not the string implementation in Haskell that is causing the problem. The problem is probably me! Thanks for your kind attention. Regards Andrea

Hello Andrea, Friday, October 20, 2006, 2:23:16 PM, you wrote:
well, you gave me a wonderfully clear introduction to Haskell GC, and now I have a better understanding of the output of the various profiling I'm doing. Thank you very much!
I had the same problems too :)
Still, I cannot understand my specific problem, that is to say, why the function that reads a file retains so much memory.
I did some tests and the results are puzzling: I tried reading the feed and directly converting it into the opml chunk to be inserted into the opml component of my StateT monad. The problem becomes far worse. Here is the output of a heap profile: http://gorgias.mine.nu/haskell/a.out.feed2opml.ps As you can see, opening one feed (397868 bytes), closing it, opening another one (410052 bytes), closing it, and reopening the first one brings memory consumption to 152 MB.
First, GC doesn't occur automatically when you close a file. You can help GHC by using performGC from System.Mem; I do it in my own program.

Second, each Char in GHC occupies 12 bytes (!), so each of your files occupies about 5 MB of memory. If you count the previous problem, 2 or 3 files can be held in memory at the same time (just because they were not yet GC'd), so memory usage may become, say, 10 MB. Multiplying this by the factor of 2.5 or even 3 which I described in my previous letter means, say, 30 MB used.

And then the only explanation I've found for why 150 MB are used is that multiple copies of the same data are held in different forms - i.e. the first copy is the original file contents as one large string, the second is the contents split into lines, the third is the internal feed format, and so on.
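Bulat's performGC suggestion in sketch form (the function and file handling below are hypothetical stand-ins, not Andrea's HXT code):

```haskell
import System.Mem (performGC)

-- Read a file, keep only a small summary of it, and ask the RTS for a
-- collection once the big intermediate data has become garbage.
loadAndDrop :: FilePath -> IO Int
loadAndDrop path = do
    contents <- readFile path
    let n = length (lines contents)  -- consumes the whole file
    n `seq` performGC                -- old data is garbage now; collect it
    return n
```

The `seq` matters: without it, `n` would still be a thunk holding on to `contents` when performGC runs, and nothing could be collected.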
Using the intermediate datatype (that is to say, reading the feed, transforming it into my datatype and then to the opml tree), reduces the problem: http://gorgias.mine.nu/haskell/a.out.feed2feedNotStrict.ps only 92 Mega of memory consumption for the very same operations.
Making the intermediate datatype strict gives almost the same results: http://gorgias.mine.nu/haskell/a.out.feed2feedStrict.ps 98 Mega.
Using the +RTS -c option and calling performGC after building the opml tree and after closing a feed should help you.
Now, I come to believe the file reading is indeed strict, and that my problem could be related to StateT laziness.
Does this make sense?
I'm not sure, but I guess that all monad transformers are strict. I hope that someone else will clear this point up.
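For what it's worth, strictness of the transformer's sequencing is separate from strictness of the state it carries: the state value itself is stored lazily. A minimal sketch (illustrative, not from Andrea's program) of how thunks pile up in a State-carried value, and one way to avoid it:

```haskell
import Control.Monad.State

-- Each modify stores an unevaluated (+ i) application; after the loop
-- the state is a chain of a million thunks until something demands it.
leaky :: State Int ()
leaky = mapM_ (\i -> modify (+ i)) [1 .. 1000000]

-- Forcing the state at each step keeps it a plain evaluated Int.
strict :: State Int ()
strict = mapM_ (\i -> do s <- get; put $! s + i) [1 .. 1000000]
```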
I'm now going to try to implement my opml state as a IORef and use a ReaderT monad to see if something new happens.
ps: if your program uses a lot of strings, FPS will be a great help. It doesn't change the GC behaviour, it just makes everything 10 times smaller :)
Yes, but I'm using HXT, and it uses normal strings to store XML text nodes. So I could get some improvement in IO, but not that much in memory consumption, unless I totally change my implementation.
ask library authors :)
Anyway, even if I could reduce the memory consumption for reading 2 feeds from 152 to 15 MB, I'd be running out of memory, on my laptop, in one day instead of 5 minutes. Anyway, I should face the fact that it is not the string implementation in Haskell that is causing the problem. The problem is probably me!
It is the world we live in :) Yesterday I had to close Windows Task Manager because it had been running for 2 weeks and its memory usage had grown to 100 MB! :)

But things are not so bad. With the +RTS -c switch your program will reach some maximum memory usage and will not grow further. Alternatively, you can use the +RTS -F2 switch - it will be faster than -c, requires more memory, and will not suffer from one GHC bug. Plus, running performGC at the right points (right when you have a lot of garbage) should substantially decrease this maximum.

Try it and please write me about the results.

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com

Hallo Bulat! On Fri, Oct 20, 2006 at 10:21:51PM +0400, Bulat Ziganshin wrote:
First, GC doesn't occur automatically when you close a file. You can help GHC by using performGC from System.Mem; I do it in my own program.
I did not get an appreciable improvement with performGC, as you can see here: http://gorgias.mine.nu/haskell/a.out.withPerformGC.ps

But I found a solution: just write the opml state component to a file! At first I decided to write it and reload it: http://gorgias.mine.nu/haskell/a.out.withFileReload.ps But I get a better result by just writing it (to /dev/null or to a real file): http://gorgias.mine.nu/haskell/a.out.withFileWrite.ps

This way everything is garbage collected and the graph is what I would expect. Now, this is a good solution, since the opml file is generally small (I have a file with 100 subscribed feeds and it is just 46 KB) and it stores information on the folder layout, so it must be saved anyway.
Second, each Char in GHC occupies 12 bytes (!), so each of your files occupies about 5 MB of memory. If you count the previous problem, 2 or 3 files can be held in memory at the same time (just because they were not yet GC'd), so memory usage may become, say, 10 MB.
Multiplying this by the factor of 2.5 or even 3 which I described in my previous letter means, say, 30 MB used.
30 MB used for reading a feed is a number that I seem to get. I also prevent the user from opening more than one feed at once. When a new feed is opened, the old feed gets removed, the opml file is written, the old stuff is garbage collected and the new stuff is added.

Now, the memory footprint of the program is given by the most memory-consuming feed. It is running at about 60-65 MB. Not that bad, compared to the previous situation. Thank you very much for your suggestions. Regards Andrea

Andrea Rossato wrote:
I did not get an appreciable improvement with performGC, as you can see from here: http://gorgias.mine.nu/haskell/a.out.withPerformGC.ps
But I found a solution: just write the opml state component to a file!
Obviously the values in question were not garbage; rather, they were unevaluated thunks. Writing the data causes it to be evaluated, so the thunks become garbage and get collected. The correct solution, however, is the application of 'seq' at the right places. To understand where these are, perform a simulation of Haskell's reduction strategy on paper.
second, each Char in GHC occupies 12 bytes (!)
Multiplying this by the factor of 2.5 or even 3 which I described in my previous letter means, say, 30 MB used.
30 MB used for reading a feed is a number that I seem to get.
Depends on what you're doing with the data. If you scan a stream of Chars exactly once, the space requirement per Char is next to irrelevant. If you're keeping lots of Strings around, using PackedStrings will help (and be sure to pack strictly). But I actually suspect you are running a backtracking parser over your input, so the whole input is read into a String and cannot be disposed of as long as the parser might backtrack. If this is Parsec, you need to remove a redundant 'try'. If it is the Read class, you need to replace it with Parsec or ReadP...

Udo.
--
<Lorien> I think I'll defragment my hard disk first, throw away all the CDs and install Löwenzahn, Teletubbies, Pokemon etc. on my computer. Then I'll run amok. That will give the psychologists PLENTY to think about.

Hallo! Thanks a lot for stopping by. On Sat, Oct 21, 2006 at 06:41:32PM +0200, Udo Stenzel wrote:
The correct solution, however, is the application of 'seq' at the right places. To understand where these are, perform a simulation of Haskell's reduction strategy on paper.
I will definitely try that.
30 MB used for reading a feed is a number that I seem to get.
Depends on what you're doing with the data. If you scan a stream of Chars exactly once, the space requirement per Char is next to irrelevant. If you're keeping lots of Strings around, using PackedStrings will help (and be sure to pack strictly). But I actually suspect, you are running a backtracking parser over your input, so the whole input is read into a String and cannot be disposed of as long as the parser might backtrack. If this is Parsec, you need to remove a redundant 'try'. If it is the Read class, you need to replace it by Parsec or ReadP...
No. I'm actually using the readDocument module of HXT for reading my input and writeDocument for writing. So, it's Parsec, on your side...;-) The application I'm writing is basically an HXT application, with an hscurses GUI. If you have some hints on how to make memory consumption go down, that would be great. Thanks for your kind attention. Andrea

On Sat, Oct 21, 2006 at 09:09:13PM +0200, Andrea Rossato wrote:
So, it's parsec, on your side...;-)
I'm sorry, I was a bit confused when I wrote that. I confused you with another person, obviously. Sorry about that. Andrea

Hello Andrea, Saturday, October 21, 2006, 3:44:31 PM, you wrote:
First, GC doesn't occur automatically when you close a file. You can help GHC by using performGC from System.Mem; I do it in my own program.
But I found a solution: just write the opml state component to a file!
And just after that you should call performGC. As Udo already said, writing to a file just forces all thunks to be evaluated. Before the write, you have hundreds of thousands of unevaluated thunks, so it's no wonder that so much memory is used. There is a faster way to do it:

    return $! x == x

As Udo said, it is better to evaluate thunks just when they are created, by using proper 'seq' calls. Another solution is to make your fields strict by using '!':

    data T = T !Int !S

but this is limited to types explicitly defined by you; you can't create lists of strict elements, for example:

    data T = T [!Int]   -- impossible!
added. Now, the memory footprint of the program is given by the most memory-consuming feed. It is running at about 60-65 MB. Not that bad, compared to the previous situation.
During the 2 years of developing my own program I've made its memory requirements 20 or 30 times lower. This included writing my own packed strings library :)))

--
Best regards,
Bulat                            mailto:Bulat.Ziganshin@gmail.com

Hello! On Sun, Oct 22, 2006 at 12:27:05AM +0400, Bulat Ziganshin wrote:
as Udo said, it should be better to evaluate thunks just when they are created, by using proper 'seq' calls.
While I understand why you and Udo are right, it is still difficult for me to relate this discussion to my code. So I wrote a small example that reproduces my problem, with the hope that this will help me understand your point.

This is my specific problem, I believe. There is a StateT monad with a list of strings as its state. The list is populated with the lines of a file entered by the user. The user may read some lines of this file or request another one:

- lFilename will load a file
- sNumber will show a line number.

The input file is evaluated at the very beginning (in my case that is forced by the XML parser, as far as I understand) and stored as the state. Now, the state will not be entirely consumed/evaluated by the user, and so it will not become garbage. Am I right? Where should I force evaluation? Is my confusion clear (sorry for this kind of nasty recursion...;-)?

Thanks for your kind attention.
Best regards,
Andrea

Here's the code:
--------------
    module Main where

    import Control.Monad.State
    import IO

    data Mystate = Mystate { mystate :: [String] }

    type SL = StateT Mystate IO

    getState :: SL [String]
    getState = do
        s <- get
        return $ mystate s

    setState ns = modify (\s -> s { mystate = ns })

    getFile :: String -> SL ()
    getFile p = do
        f <- liftIO $ readFile p
        let lns = lines f
        -- forces evaluation of lns
        liftIO $ putStrLn $ "Number of lines: " ++ show (length lns)
        setState lns
        promptLoop

    showLine :: Int -> SL ()
    showLine nr = do
        s <- getState
        liftIO $ putStrLn $ s !! nr
        promptLoop

    promptStr = "lFilename [load the file Filename] - sNr [show the line Nr of Filename] - q to quit"

    promptLoop :: SL ()
    promptLoop = do
        liftIO $ putStrLn promptStr
        str <- liftIO getLine
        case str of
          ('l':ss) -> getFile ss
          ('s':nr) -> showLine (read nr)
          ('q':[]) -> liftIO $ return ()
          _        -> promptLoop

    main = evalStateT promptLoop $ Mystate []

Hello Andrea, Sunday, October 22, 2006, 1:37:55 PM, you wrote:
as Udo said, it should be better to evaluate thunks just when they are created, by using proper 'seq' calls.
While I understand why you and Udo are right, it is still difficult for me to relate this discussion to my code. So I wrote a small example that reproduces my problem, with the hope that this will help me understand your point.
Sorry, I will not dig into the details of how it works; I will just explain the common rule: if you compute something, make sure that the returned value, at any attempt to use it, will be completely evaluated. It's rather close to the problems of doing IO in a lazy environment, thoroughly described in my "IO inside" manual.

For example, this definition:

    f a b = (a*b, a+b)

will return you an unevaluated thunk. Even when it is used, it may be computed only partially:

    main = do let x = f 1 2
              print (fst x)

Here print will force evaluation of the first value in the pair, but not the second one. But you can improve it by saying 'if you need the value of f, you should evaluate its components first':

    f a b = let x = a*b
                y = a+b
            in x `seq` y `seq` (x,y)

This definition of f will not evaluate x and y automatically. BUT its returned value is not (x,y); its returned value is x `seq` y `seq` (x,y), and when a further computation tries to use it in any way, it can't put its hands on the pair before it has evaluated the x and y values. Do you understand?

This technique can therefore be used to construct values that will be fully evaluated on any attempt to use _any_ part of such a value. Say:

    f a = let x = ..
              y = ..
              z = ..
          in x `seq` y `seq` z `seq` T x y z

    g = let x = f 1
            y = f 2
            z = f 3
        in x `seq` y `seq` z `seq` S x y z

Here, any attempt to use the value returned by g will force evaluation of two levels of the data structure. This can be repeated again and again.

But writing to a file or comparing does the same. The difference is this: when you write a value to a file, you do all the evaluations at once, but before that moment you have a large data structure full of thunks built up in memory. The seq-ing technique can help when you build your structure step by step in some sequential environment (read: monad).

So, the following:

    main = do let x = g
              return $! field1 x

will evaluate the whole value returned by g. But it will be equivalent to

    main = do let x = g''
              return $! x == x

where g'' is the same function without the seqs. To get a real advantage, you need to build your value sequentially in a monad and force evaluation of each step's results:

    main = do let x = f 1
              return $! x
              let y = f 2
              return $! y
              let z = f 3
              return $! z
              let a = T x y z
              ..

So, the seq-ing technique will be better than your current one only in the case when the values built are used to build other parts of the data structure - in this case additional seqs may force evaluation of more of the data structure than was actually requested.

About your example - this monad is strict, i.e. its operations are sequenced like in the IO monad itself (this monad just carries additional state information between the IO operations). What you need to do is ensure that any updates/requests force evaluation; then the strictness of the monad will guarantee that evaluation is performed at each step. This will work both for IO and for your monad:

    do let x = a*b
       return $! x   -- here x is evaluated

The trick here is that when execution reaches the 'return $!' point, it needs to strictly evaluate the value of the 'return $! x' expression. This expression, which is equivalent to (x `seq` return x), needs to evaluate x before it can return anything! As a result, 'return $! x' is an executable statement (like putStr, for example) that forces evaluation of x, i.e. it guarantees that x will be evaluated at just this moment, before the next statement in the 'do' block executes.

So, by inserting 'return $!' statements, you may force evaluation of thunks at given lines of your program, and of course you want to evaluate every thunk just at the moment it's created. Now we need to look into your program and just find the places where thunks are created:
data Mystate = Mystate {mystate :: [String]}
The first rule of strict evaluation: use newtype instead of data for one-field constructors.

    newtype Mystate = Mystate { mystate :: [String] }

is equivalent to

    data Mystate = Mystate { mystate :: ![String] }

Note the '!' making the field strict. If you can't use a newtype, make all the fields strict.

Next point: [String] is [[Char]], i.e. it's a lazy list of lazy lists. You should ensure that both levels of lists are strictly evaluated if you need to avoid thunks.
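One way to force both levels of a [String], in the spirit of Bulat's advice, is a hand-rolled deep-forcing class (a sketch; the class and instances below are invented for illustration, not a standard library):

```haskell
-- 'force' evaluates a value all the way down, not just to weak head
-- normal form as 'seq' does.
class Force a where
    force :: a -> ()

instance Force Char where
    force c = c `seq` ()

instance Force a => Force [a] where
    force []     = ()
    force (x:xs) = force x `seq` force xs

-- Forcing a [String] on both levels before storing it:
--   force lns `seq` setState lns
```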
type SL = StateT Mystate IO
    getState :: SL [String]
    getState = do
        s <- get
        return $ mystate s
This is the first place where we are not strict enough. This expression will return an _unevaluated_ thunk (mystate s). But you can force its evaluation before returning:
return $! mystate s
So, your code returns something like (mystate (Mystate ["a"])), while my code will evaluate this expression, throwing out the function call, and return the result of the function's evaluation: ["a"]. I should emphasize that it will not evaluate any thunks inside the list of strings. But as long as your state is fully evaluated before storing, this operator will be able to return a strictly evaluated value. That is how all this works.
setState ns = modify (\s -> s {mystate = ns})
Here you modify the state, but don't ensure that the string list is evaluated on both levels. Well, it will be ok if you ensure evaluation at _each_ call to this function. Alternatively, you can force evaluation before the assignment; summing the lengths walks both levels of the list (forcing just `map length ns` to weak head normal form would not be enough):

    setState ns = do
        return $! sum (map length ns)
        modify (\s -> s {mystate = ns})
    getFile :: String -> SL ()
    getFile p = do
        f <- liftIO $ readFile p
        let lns = lines f
        -- forces evaluation of lns
        liftIO $ putStrLn $ "Number of lines: " ++ show (length lns)
        setState lns
        promptLoop
here you evaluate only the top level of the [String] before assigning it to the state. but for this operation ('lines' on the file contents) that's enough; we don't need to evaluate the individual chars here
    showLine :: Int -> SL ()
    showLine nr = do
        s <- getState
        liftIO $ putStrLn $ s !! nr
        promptLoop
promptStr = "lFilename [load the file Filename] - sNr [show the line Nr of Filename] - q to quit"
    promptLoop :: SL ()
    promptLoop = do
        liftIO $ putStrLn promptStr
        str <- liftIO getLine
        case str of
            ('l':ss) -> getFile ss
            ('s':nr) -> showLine (read nr)
            ('q':[]) -> liftIO $ return ()
you can omit liftIO here, just return () will be ok :)
            _ -> promptLoop
main = evalStateT promptLoop $ Mystate []
if StateT is a strict monad transformer, this code doesn't have space leaks. you create thunks in two places; in one place you already evaluate it, and i wrote what to do in the second place. for other code, you should watch assignments to the state and ensure that the assigned values are fully evaluated -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Hello Bulat, first of all, thanks for your lengthy and clear explanation. On Sun, Oct 22, 2006 at 04:08:49PM +0400, Bulat Ziganshin wrote:
    f a b = let x = a*b
                y = a+b
            in x `seq` y `seq` (x,y)
this f definition will not evaluate x and y automatically. BUT its returned value is not (x,y). its returned value is x `seq` y `seq` (x,y), and when a further computation tries to use it in any way, it can't get its hands on the pair before it evaluates the x and y values. do you understand?
Yes, I do understand. But, as far as I know, "seq" will just evaluate x and y enough to see if they are not bottom. So, if x and y are a deep data structure, they won't be evaluated entirely, right? That is to say:
to get real advantage, you need to build your value sequentially in monad and force evaluation of each step results:
    main = do let x = f 1
              return $! x
              let y = f 2
              return $! y
              let z = f 3
              return $! z
              let a = T x y z
              ..
...
setState ns = modify (\s -> s {mystate = ns})
here you modify state, but don't ensure that string list is evaluated on both levels. well, it will be ok if you ensure evaluation at _each_ call to this function. alternatively, you can force evaluation before assignment by:
    setState ns = do return $! map length ns
                     modify (\s -> s {mystate = ns})
this is the crucial point. You are forcing evaluation with $! map length; I'm doing it with writeFile. I do not see very much difference. That's an ad hoc solution. Since I need to write the state, instead of (needlessly) computing the length of its members, I write it...;-)

By the way, the problem is not ns, but s, the old state. The new state has been evaluated by the code below (when we display the number of lines). So you need to change your code to:

    do s <- getState
       modify (\s -> s {mystate = ns})

otherwise you are going to have the same space leak as my original code had. In other words, it is possible to have a user who keeps on loading files without looking at any line (in this case getState is never called and so there is no "return $! mystate s"!). This produces the same effect as:

    setState ns = do
        s <- getState  -- that does: return $ mystate s
        liftIO $ writeFile "/dev/null" s
        modify (\s -> s {mystate = ns})
if StateT is a strict monad transformer, this code doesn't have space leaks. you create thunks in two places; in one place you already evaluate it, and i wrote what to do in the second place.
Yes, indeed. But just because we forced evaluation in a way that `seq` cannot do. Am I right? Thanks a lot for your patience. You did help me a lot. And this is not the first time. I appreciate it. Really. Best regards, Andrea

Hello Andrea, Sunday, October 22, 2006, 6:06:24 PM, you wrote:
    f a b = let x = a*b
                y = a+b
            in x `seq` y `seq` (x,y)
this f definition will not evaluate x and y automatically. BUT its returned value is not (x,y). its returned value is x `seq` y `seq` (x,y), and when a further computation tries to use it in any way, it can't get its hands on the pair before it evaluates the x and y values. do you understand?
Yes, I do understand. But, as far as I know, "seq" will just evaluate x and y enough to see if they are not bottom. So, if x and y are a deep data structure, they won't be evaluated entirely, right?
x (and y) should not be considered as data values here! each is an expression which will be evaluated to bottom or non-bottom, i.e. some data constructor. but what if the expression for x (i.e. the thunk) itself contains a 'seq' call? say:

    let x = 1/0 `seq` 1
        y = x `seq` 2
    print y

try it ;) it will require evaluating the y value, which requires evaluating x, which requires evaluating 1/0! so when we construct some value we may ensure that all its parts will be evaluated once some part is requested:

    x `seq` y `seq` T x y

when you construct data values with seq recursively, this leads to the whole deep data structure being evaluated once any part of it is requested. i don't propose you do exactly this, i'm just explaining how and why this works
That is to say:
to get real advantage, you need to build your value sequentially in monad and force evaluation of each step results:
    main = do let x = f 1
              return $! x
              let y = f 2
              return $! y
              let z = f 3
              return $! z
              let a = T x y z
              ..
...
setState ns = modify (\s -> s {mystate = ns})
here you modify state, but don't ensure that string list is evaluated on both levels. well, it will be ok if you ensure evaluation at _each_ call to this function. alternatively, you can force evaluation before assignment by:
    setState ns = do return $! map length ns
                     modify (\s -> s {mystate = ns})
this is the crucial point. You are forcing evaluation with $! map length; I'm doing it with writeFile. I do not see very much difference. That's an ad hoc solution. Since I need to write the state, instead of (needlessly) computing the length of its members, I write it...;-)
yes, it's the same. the _right_ way is to use a special deepSeq function that is like 'seq' but does full evaluation. unfortunately, haskell/ghc doesn't give us a ready-to-use implementation. equality testing and other tricks are just ways to force full evaluation of the data structure, and they are unnecessary if you really consume it all
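The deepSeq described here now exists ready-made as the deepseq package (Control.DeepSeq), which was not available when this thread was written; a minimal sketch of its use:

```haskell
import Control.DeepSeq (deepseq, ($!!))

main :: IO ()
main = do
    let xs = map show [1 .. 3 :: Int]   -- a [String], built lazily
    -- deepseq forces every cons cell and every Char before continuing
    xs `deepseq` putStrLn "fully evaluated"
    -- '$!!' is the deep analogue of '$!', so this is a deep 'return $!'
    ys <- return $!! xs
    print ys
```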
By the way, the problem is not ns, but s, the old state. The new state has been evaluated by the code below (when we display the number of lines). So you need to change your code to:
    do s <- getState
       modify (\s -> s {mystate = ns})
otherwise you are going to have the same space leak as my original code had. In other words, it is possible to have a user who keeps on loading files without looking at any line (in this case getState is never called and so there is no "return $! mystate s"!).
this '$!' only collapses one small thunk containing the mystate call on Mystate: mystate (Mystate s) => s. it doesn't evaluate anything else, so adding s <- getState will not change anything. if you want full evaluation, you should add the same 'return $! map length state' to getState, but instead you can add it just to setState :))) so, you don't need 'map length' in getState, you don't need to call getState in setState, and you may omit the 'map length' call in setState if you ensure that all callers fully evaluate this list. for a start, in your non-toy code, you may add this call and see what happens. next, this line
modify (\s -> s {mystate = ns})
really produces an unevaluated thunk. the monad state will point to the expression 's {mystate = ns}'. it's better to represent this in functional style: setMyState s ns. as you can see, this thunk points to the old value of s. while you (and i) know that this value is really unused, the RTS (GHC run-time system) doesn't know it! as a result, the old file contents are held in memory until you evaluate this thunk :))) instead of 'modify' you should use the following sequence:

    s <- get
    put $! s {mystate = ns}

it's my fault - i didn't see this memory leak until we dug into each step of evaluation :) you can run performGC at this moment to ensure that memory will never contain more than two files' contents. if you want to hold no more than one contents, use the following:

    setState []
    performGC
    s <- getContents
    setState (lines s)

although, i also suggest adding performGC after each substantial step (reading, conversion) and then trying to remove them one by one and seeing the results
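Modern mtl/transformers (not available when this thread was written) package exactly this get/put $! sequence as modify'; a sketch against the thread's toy state:

```haskell
import Control.Monad.State

newtype Mystate = Mystate { mystate :: [String] }

-- modify' forces the new state to WHNF, so no 's {mystate = ns}' thunk
-- (and hence no reference to the old state) is left behind
setState :: [String] -> StateT Mystate IO ()
setState ns = modify' (\s -> s { mystate = ns })

main :: IO ()
main = do
    s <- execStateT (setState ["hello", "world"]) (Mystate [])
    print (mystate s)
```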
This produces the same as:

    setState ns = do
        s <- getState  -- that does: return $ mystate s
        liftIO $ writeFile "/dev/null" s
        modify (\s -> s {mystate = ns})
as i just understood, the old 's' value is held in memory even after modify, so evaluating it really does allow reducing memory usage :) of course, using put $! should be even better
if StateT is a strict monad transformer, this code doesn't have space leaks. you create thunks in two places; in one place you already evaluate it, and i wrote what to do in the second place.
Yes, indeed. But just because we forced evaluation in a way that `seq` cannot do. Am I right?
yes, the strict evaluation order of this monad gives us the possibility to evaluate thunks at chosen steps of execution instead of tying evaluation to requests for data inside some structure. a strict monad gives us a strict context on which we can rely
Thanks a lot for your patience. You did help me a lot. And this is not the first time. I appreciate it. Really.
i like to teach :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Andrea Rossato wrote:
Now, the state will not be entirely consumed/evaluated by the user, and so it will not become garbage. Am I right?
No. The state cannot become garbage, because there is still a reference to it. As long as runStateT has not returned, any part of the state can still be accessed, so it is not garbage. Completely evaluating the state will not reduce memory consumption in your case, because the list of lists won't be substantially smaller than the thunk to create it. In fact, evaluating this thunk will consume memory.
Where should I force evaluation?
You can't. Your state really is that large, at least in the toy example. You'd need a different data structure ((Array Int ByteString) or (Map Int ByteString) come to mind) and then make that strict. Udo. -- The Seventh Commandments for Technicians: Work thou not on energized equipment, for if thou dost, thy fellow workers will surely buy beers for thy widow and console her in other ways.

Hello Udo, Sunday, October 22, 2006, 6:41:24 PM, you wrote:
Now, the state will not be entirely consumed/evaluated by the user, and so it will not become garbage. Am I right?
No. The state cannot become garbage, because there is still a reference to it. As long as runStateT has not returned, any part of the state can still be accessed, so it is not garbage.
this monad state is not something fixed; it's completely replaced by each setState call (and the old state becomes garbage from that moment). in particular, with this toy example, after reading each next file the old file contents continue their life as garbage
Completely evaluating the state will not reduce memory consumption in your case, because the list of lists won't be substantially smaller than the thunk to create it. In fact, evaluating this thunk will consume memory.
while this is true for this particular data (i.e., getContents >>= lines), in general it is not true. f.e. a [[toUpper 'a']] list will need more memory in unevaluated form
Where should I force evaluation?
You can't. Your state really is that large, at least in the toy example. You'd need a different data structure ((Array Int ByteString) or (Map Int ByteString) come to mind) and then make that strict.
[ByteString] will be enough. Andrea, maybe you can convert the strings to bytestrings after the hxt conversion? -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
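Bulat's [ByteString] suggestion could look like this (a sketch using Data.ByteString.Char8): strict ByteStrings are fully evaluated byte arrays, so a state built from them drags no file-reading thunks along.

```haskell
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = do
    -- stand-in for 'B.readFile path'; B.lines yields strict slices
    let contents = B.pack "one\ntwo\nthree"
    print (map B.length (B.lines contents))
```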
participants (4)
- Andrea Rossato
- Bulat Ziganshin
- Jason Dagit
- Udo Stenzel