
I'll make a guess... Ian Lynagh wrote:
Hi all,
In the middle (readChunks) is the equivalent of gunzip. It repeatedly calls foo until there is no more input left.
At the top is a simple main function that calls them.
If I do
dd if=/dev/zero of=data bs=1000 count=3000" # making data around 3MB ghc --make Test1 -o Test1 -O -Wall ./Test1
then in top I see Test1 increasing memory usage to around 150MB. I think this is because the "let (ys, zs) = foo xs" means zs holds on to xs (it's hard to be sure as compiling for profiling is too happy to change the behaviour).
I don't have a comment on your guess.
I tried (Test2) changing foo to be a monad transformer over the calling monad, so the caller's remaining input was updated as we went along, but (as well as memory usage not obviously being fixed) this is giving me a stack overflow.
I will ignore Test2
Has anyone got any suggestions for making a constant space, constant stack version?
Not yet.
Thanks Ian
------------------------------------------------------------------------
module Main (main) where
import Control.Monad (liftM) import Control.Monad.State (State, runState, evalState, get, put)
main :: IO () main = do xs <- readFile "data" ys <- readFile "data" print (evalState readChunks xs == ys)
The equality should be constant space.
---
type FirstMonad = State String
readChunks :: FirstMonad String readChunks = do xs <- get if null xs then return [] else do let (ys, zs) = foo xs put zs
And zs is the final state of "runState bar" which is suspect is [] And ys is the whole input (which is now all in memory)
rest <- readChunks return (ys ++ rest)
---
type SecondMonad = State String
foo :: String -> (String, String) foo = runState bar
bar :: SecondMonad String bar = do inp <- get case inp of [] -> return [] x:xs -> do put xs liftM (x:) bar
The liftM should be equivalent to temp <- bar return ( (x:) temp ) It looks like the first call to foo will have bar consuming the entire input string. So the flow looks like main readChuncks all-input foo all-input bar (iterated over whole input length) foo returns (all-input, []) "rest <- readChunks" (recursive call, sees null xs then "return []") "return (ys ++ rest)" which is return (all-input ++ []) In essence, your bar traverses the whole string until the state is empty. This loads your whole file into memory