
I'm relatively new to Haskell.

Let's say I have a simple program like this:

    import Data.Map

    type Tree = String
    type StringMap = Map String String

    f :: StringMap -> Tree -> Tree
    f _ y = y ++ "!"

    getStringMap :: IO StringMap
    getStringMap = return (Data.Map.fromList [("k1","v1")])

    main :: IO ()
    main = do
      m <- getStringMap
      s <- getLine
      putStr $ show (f m s) ++ "\n"

Ignore the implementation of f and getStringMap for now. Assume f is a fairly complex function that uses the Map while transforming the Tree argument into the result Tree (here I defined Tree as a String just so that it would compile). getStringMap reads the whole Map into memory at once.

However, now let's say that the whole Map is too large, so I decide to store it over a network in a remote database system instead. And then I define another function like:

    type Connection = String

    getStringMapValue :: Connection -> String -> IO String

that takes a connection string and a key and returns the value.

My question is this: how do I change my program to make StringMap a lazily-loaded structure based on getStringMapValue without changing f?

Regards,
Jeff Davis

> My question is this: how do I change my program to make StringMap a lazily-loaded structure based on getStringMapValue without changing f?
You can't, because 'f' has a pure signature, but you want it to run network operations, which are not pure.

You could say it's logically pure if you guarantee that no one else is modifying the DB while your program is running, and you are willing to completely abort on a network error, in which case there's a hack called unsafeInterleaveIO.

However, I'm guessing no one is going to recommend actually doing that. The Prelude getContents function does that and it gets a lot of flak for poor error handling, not closing the handle when you want it to, etc. Talking over the network will have all those same problems.

If you want to operate over a large structure and hide the fetching part, you can look into the iteratee stuff. Basically you would define a function that opens a socket, passes chunks of data to a passed in pure iteratee function, and closes the socket afterwards.

If you have a random access Map then I can't think of anything more elegant than the standard imperative "lookup :: Key -> RemoteMap -> IO Val" possibly with caching. Put the reading strategy in an IO function, and the rest of the processing in passed-in pure functions.
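As a rough illustration of the last suggestion, here is a minimal sketch of an IO lookup with caching, with the pure processing kept in a separate function. The names RemoteMap, openRemoteMap, lookupRemote, decorate and transform are hypothetical, and getStringMapValue is a stub standing in for the real network call. Note that this does change the type of the top-level transformation to IO, which is the point of the reply above.

```haskell
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import qualified Data.Map as Map

type Connection = String

-- Hypothetical stub: a real version would talk to the remote database.
getStringMapValue :: Connection -> String -> IO String
getStringMapValue conn key = return (conn ++ "/" ++ key)

-- A handle on the remote map: the connection plus a local cache
-- of the values fetched so far.
data RemoteMap = RemoteMap
  { rmConn  :: Connection
  , rmCache :: IORef (Map.Map String String)
  }

openRemoteMap :: Connection -> IO RemoteMap
openRemoteMap conn = RemoteMap conn <$> newIORef Map.empty

-- The "lookup :: Key -> RemoteMap -> IO Val" shape, with caching.
-- Named lookupRemote to avoid clashing with Prelude's lookup.
lookupRemote :: String -> RemoteMap -> IO String
lookupRemote key rm = do
  cache <- readIORef (rmCache rm)
  case Map.lookup key cache of
    Just v  -> return v                    -- already fetched
    Nothing -> do
      v <- getStringMapValue (rmConn rm) key
      modifyIORef' (rmCache rm) (Map.insert key v)
      return v

-- The pure part of the processing: no IO in its type.
decorate :: String -> String -> String
decorate tree v = tree ++ "!" ++ v

-- The IO shell: decide what to fetch, then hand the values
-- to the pure function.
transform :: RemoteMap -> String -> IO String
transform rm tree = do
  v <- lookupRemote "k1" rm
  return (decorate tree v)
```

With the stub above, opening a map with `openRemoteMap "db"` and calling `transform` on it twice should hit the stubbed "network" only once for the key "k1"; the second call is served from the cache.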

On Fri, 2010-01-01 at 15:07 -0800, Evan Laforge wrote:
> You could say it's logically pure if you guarantee that no one else is modifying the DB while your program is running, and you are willing to completely abort on a network error, in which case there's a hack called unsafeInterleaveIO.
What do you mean by "completely abort"? Couldn't it simply raise an exception that could be handled in main?
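For illustration only, here is a small sketch of what the unsafeInterleaveIO hack could look like, and of where a network error would surface. It assumes the set of keys is known up front; lazyStringMap and runF are hypothetical names, and getStringMapValue is a stub that fails for the connection string "down" to imitate a network error. The exception can be caught, but only at the point where the pure result is forced, not where the map is built.

```haskell
import Control.Exception (SomeException, evaluate, try)
import qualified Data.Map as Map
import System.IO.Unsafe (unsafeInterleaveIO)

type Connection = String
type Tree = String
type StringMap = Map.Map String String

-- Hypothetical stub: fails for the connection "down" to imitate
-- a network error, otherwise returns a made-up value.
getStringMapValue :: Connection -> String -> IO String
getStringMapValue "down" _ = ioError (userError "network down")
getStringMapValue conn key = return (conn ++ "/" ++ key)

-- Build a Map whose values are fetched only when first demanded.
-- Data.Map is lazy in its values, so each value stays an unevaluated
-- thunk that performs the fetch when forced.
lazyStringMap :: Connection -> [String] -> IO StringMap
lazyStringMap conn keys = do
  pairs <- mapM deferred keys
  return (Map.fromList pairs)
  where
    deferred k = do
      v <- unsafeInterleaveIO (getStringMapValue conn k)
      return (k, v)

-- f keeps its pure signature.
f :: StringMap -> Tree -> Tree
f m y = y ++ "!" ++ Map.findWithDefault "?" "k1" m

-- The fetch (and any failure) happens here, when the result of the
-- pure function is fully evaluated.
runF :: StringMap -> Tree -> IO (Either SomeException Tree)
runF m s = try (evaluate (length r `seq` r))
  where r = f m s
```

In this sketch, `lazyStringMap "down" ["k1"]` itself succeeds, and the error only appears as a `Left` from `runF`. If the result were forced somewhere else, for example inside putStr, the exception would be raised there instead, which is the kind of error-handling trouble mentioned for getContents.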
> However, I'm guessing no one is going to recommend actually doing that. The Prelude getContents function does that and it gets a lot of flak for poor error handling, not closing the handle when you want it to, etc. Talking over the network will have all those same problems.
Yes, when I saw the definition of getContents, I suspected problems were possible.
> If you want to operate over a large structure and hide the fetching part, you can look into the iteratee stuff. Basically you would define a function that opens a socket, passes chunks of data to a passed in pure iteratee function, and closes the socket afterwards. If you have a random access Map then I can't think of anything more elegant than the standard imperative "lookup :: Key -> RemoteMap -> IO Val" possibly with caching. Put the reading strategy in an IO function, and the rest of the processing in passed-in pure functions.
Interesting. I wonder if it might be useful to approach it like a virtual memory system. Searching for a value that hasn't been loaded into the structure would raise an exception, which would then be caught, handled by using normal (safe) IO, and then the computation could be retried.

Is that idea crazy, or does it have some merit?

Regards,
Jeff Davis

On Fri, Jan 1, 2010 at 6:24 PM, Jeff Davis wrote:
> I wonder if it might be useful to approach it like a virtual memory system. Searching for a value that hasn't been loaded into the structure would raise an exception, which would then be caught, handled by using normal (safe) IO, and then the computation could be retried.
>
> Is that idea crazy, or does it have some merit?
No, pretty much crazy. Exceptions are just as impure as any other effect.

To capture network computations, abstract them out, e.g. into a type "Network a" which represents values that depend on network data. It would naturally be monadic, but you can restrict the operations so that Network can only do specific things, e.g. reading, not writing, not overwriting your hard drive, not throwing exceptions (or throwing them if you want it to), etc.

This is the typical method of abstracting over IO in Haskell. As a sort of dual to the OO way, where you start with simple operations and add features, we start with "anything" (the IO monad) and take away features until we have something we can reason about.

Luke
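A minimal sketch of what such a restricted "Network a" type might look like, using only base. The names fetch and runNetwork are hypothetical, getStringMapValue is a stub for the real network call, and f is rewritten here to live in Network rather than kept pure. In a real module the Network constructor would be hidden, so that fetch is the only effect client code can perform.

```haskell
type Connection = String

-- Hypothetical stub: a real version would talk to the remote database.
getStringMapValue :: Connection -> String -> IO String
getStringMapValue conn key = return (conn ++ "/" ++ key)

-- A computation that may read from the network. If the constructor
-- is not exported, users cannot run arbitrary IO inside it.
newtype Network a = Network { unNetwork :: Connection -> IO a }

instance Functor Network where
  fmap g (Network run) = Network (\c -> fmap g (run c))

instance Applicative Network where
  pure x = Network (\_ -> pure x)
  Network rf <*> Network rx = Network (\c -> rf c <*> rx c)

instance Monad Network where
  Network run >>= k = Network (\c -> run c >>= \x -> unNetwork (k x) c)

-- The only primitive offered: read one value by key.
fetch :: String -> Network String
fetch key = Network (\c -> getStringMapValue c key)

-- Interpret a Network computation against a concrete connection.
runNetwork :: Connection -> Network a -> IO a
runNetwork c (Network run) = run c

-- f expressed against the restricted interface: its type says it may
-- read from the network, and nothing else.
f :: String -> Network String
f y = do
  v <- fetch "k1"
  return (y ++ "!" ++ v)
```

With the stub above, `runNetwork "db" (f "t")` can be called from main like any other IO action. Caching or error handling could later be added inside fetch and runNetwork without touching f.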
Participants (3): Evan Laforge, Jeff Davis, Luke Palmer