Question on monads and laziness

Hello I am studying Real World Haskell chapter 9. Here is a snippet of code data Info = Info { infoPath :: FilePath , infoPerms :: Maybe Permissions , infoSize :: Maybe Integer , infoModTime :: Maybe ClockTime } deriving (Eq, Ord, Show) getInfo :: FilePath -> IO Info -- file: ch09/ControlledVisit.hs traverse order path = do names <- getUsefulContents path contents <- mapM getInfo (path : map (path >) names) liftM concat $ forM (order contents) $ \info -> do if isDirectory info && infoPath info /= path then traverse order (infoPath info) else return [info] getUsefulContents :: FilePath -> IO [String] getUsefulContents path = do names <- getDirectoryContents path return (filter (`notElem` [".", ".."]) names) isDirectory :: Info -> Bool isDirectory = maybe False searchable . infoPerms When I read about IO in the previous chapter, I learnt that reading a file can be done lazily. Here my doubt is that whether the expression "order contents" would generate a list of directory contents (held in memory) for use in forM construct. Because in a subsequent para, I find this line. "If we are traversing a directory containing 100,000 files of which we care about three, we'll allocate a 100,000-element list before we have a chance to trim it down to the three we really want" Wouldn't laziness ensure that in the traverse function, when iterating over directory contents using the list generated by "order contents", it will generate just one element at a time and then free the memory for that entry immediately since we are not using the referencing list anymore in the rest of the function. Not sure whether I have understood the concept of laziness w.r.t monads correctly. Please clarify my doubts with regard to the code snippet. Thanks for your time. -- Regards Lakshmi Narasimhan T V

At 9:23 AM +0530 7/26/09, Lakshmi Narasimhan Vaikuntam wrote:
Hello I am studying Real World Haskell chapter 9. Here is a snippet of code
data Info = Info { infoPath :: FilePath , infoPerms :: Maybe Permissions
, infoSize :: Maybe Integer , infoModTime :: Maybe ClockTime } deriving (Eq, Ord, Show)
getInfo :: FilePath -> IO Info -- file: ch09/ControlledVisit.hs
traverse order path = do names <- getUsefulContents path contents <- mapM getInfo (path : map (path >) names) liftM concat $ forM (order contents) $ \info -> do if isDirectory info && infoPath info /= path
then traverse order (infoPath info) else return [info]
getUsefulContents :: FilePath -> IO [String] getUsefulContents path = do names <- getDirectoryContents path return (filter (`notElem` [".", ".."]) names)
isDirectory :: Info -> Bool isDirectory = maybe False searchable . infoPerms When I read about IO in the previous chapter, I learnt that reading a file can be done lazily. Here my doubt is that whether the expression "order contents" would generate a list of directory contents (held in memory) for use in forM construct. Because in a subsequent para, I find this line.
"If we are traversing a directory containing 100,000 files of which we care about three, we'll allocate a 100,000-element list before we have a chance to trim it down to the three we really want"
Wouldn't laziness ensure that in the traverse function, when iterating over directory contents using the list generated by "order contents", it will generate just one element at a time and then free the memory for that entry immediately since we are not using the referencing list
anymore in the rest of the function.
Not sure whether I have understood the concept of laziness w.r.t monads correctly. Please clarify my doubts with regard to the code snippet.
Thanks for your time. -- Regards Lakshmi Narasimhan T V
These issues are a bit subtle. Yes, a file can be read lazily. What this means is that there are I/O actions (readFile and getContents, for example) which produce lazy strings, for which the underlying read operations are done as needed as the successive characters of the strings are demanded. But this kind of "lazy I/O" is not at play here. I/O is a special monad. Its semantics are that I/O actions are performed in sequence, regardless of demands (or lack thereof) on the actions' results. (This is so that real-world side effects occur in determinstic order.) And the I/O actions themselves determine how much computation they do before offering their results. In the code you cited, `traverse` calls `getUsefulContents` to generate a list of names. `getUsefulContents` uses `getDirectoryContents`, which computes its result list completely before returning it. (The documentation doesn't say that explicitly. It is implied because it doesn't say that the result is returned lazily.) So both `names` and `contents` could be very large, as the authors note. Some describe these semantics by saying "the I/O monad is strict", but I don't think that's a precise statement, because strictness is a property of a function (and its arguments), not of a monad (at least as far as I understand). I hope this helps. Dean
participants (2)
-
Dean Herington
-
Lakshmi Narasimhan Vaikuntam