Dealing with configuration data

Evening,

I'm trying to write a utility that reads in some user preferences from a pre-determined file, does some work, and exits. Sounds simple enough.

The problem I'm having is with the preferences: How do I make it available throughout the entire program? (FWIW, most of the work is effectively done inside the IO monad.) I could explicitly pass the record around everywhere, but that seems a trifle inelegant.

My current solution is to use a global ('scuse my terminology, I'm not sure that's the right word to use here) variable of type IORef Config obtained through unsafePerformIO. It works, but strikes me as a rather barbaric solution to a seemingly tame enough problem...

Intuition tells me I should be able to `embed', if you will, the config record somehow within or alongside the IO state, and retrieve it at will. (Is this what MonadState is for?) However it also tells me that this will /probably/ involve lots of needless lifting and rewriting of the existing code, which makes it even less enticing than passing everything around explicitly.

Any opinions or suggestions?

Cheers,
/Liyang
--
.--{ Liyang HU }--{ http://nerv.cx/ }--{ Caius@Cam }--{ ICQ: 39391385 }--.
| :: zettai unmei mokusiroku :::: absolute destined apocalypse ::::::::: |

AFAIK, the global variable (so-called), passing around, and lifting the IO monad are your only options. I almost always use the global variable method since I know that in this case the unsafePerformIO is actually safe, since writing to the variable will always occur before the call to upIO and that it will only be written once. I don't feel bad about doing this because GHC does this itself for its own configuration :).

-- Hal Daume III
"Computer science is no more about computers    | hdaume@isi.edu
 than astronomy is about telescopes." -Dijkstra | www.isi.edu/~hdaume

Sorry, I should also mention implicit parameters, if you're willing to use that extension. I don't like them, though, and my impression from SPJ is that it's very unclear whether they will get into Haskell 2 or not...

-- Hal Daume III

On Wed, 25 Sep 2002 16:06:29 -0700 (PDT), Hal Daume III wrote:
> I don't feel bad about doing this because GHC does this itself for its own configuration :).

I am going to show you that using unsafePerformIO where there really are side effects leads to unpredictable results, and is generally wrong in a lazy language. Don't hate me for this :)

Consider this example (supposing that a Config is represented by an Int):

    storeConfig :: Int -> ()
    readConfig :: Int

They are both obtained through the use of "unsafePerformIO".

Now, say I have this code:

    (storeConfig 0, storeConfig 1, readConfig, storeConfig 0, readConfig)

What is this 5-tuple supposed to evaluate to?

First of all, this depends on the order of evaluation. We can't say that all the elements of the tuple will be evaluated, so we can't tell whether the fifth element, readConfig, will evaluate to 0 or 1 (if the third storeConfig is never evaluated, readConfig will evaluate to 1, else to 0). This is one of the reasons for using monads: ensuring the correct order of evaluation.

Second, suppose we were able to force the order of evaluation (which shouldn't be allowed in a lazy language). We still can't say what the last "readConfig" would evaluate to, since we don't know if the compiler is substituting equals for equals (I am expecting a lazy functional language to do this).

If the compiler does, the last readConfig is equal to the first (in fact, by the use of unsafePerformIO, you have told the compiler that both storeConfig and readConfig are pure, which is not true) and will evaluate to 1, else it will evaluate to 0. And, besides, the compiler should also substitute the second "storeConfig 0" with the result of the first occurrence, so it would not evaluate the second "storeConfig 0" at all.

This is another example of the need for monads: allowing program transformations, first of all substituting equals for equals.

This is why, relying only on the semantics of a lazy language, we cannot have functions with side effects (even if, with enough knowledge of the implementation, we could).

If it wasn't so, they would not have invented monads, believe me.

I apologize, as always, for my terrible English, and hope I have been clear.

Vincenzo Ciancia

This is how I usually do it: http://www.mail-archive.com/haskell@haskell.org/msg10565.html (ignore the last part of the post...) J.A.

I don't mean to troll, but this isn't what I meant. Suppose we have:

    data Configuration = ... -- config data

    globalConfig :: IORef Configuration
    globalConfig = unsafePerformIO (newIORef undefined)

Now, we define an unsafe function to read the configuration:

    getConfig :: Configuration
    getConfig = unsafePerformIO $ readIORef globalConfig

Okay, this is "bad" but I claim it's okay, iff it is used as in:

    main = do
      ...read configuration from file...no calls to getConfig...
      writeIORef globalConfig configuration
      doStuff
      return ()

Now, we have doStuff :: IO a. doStuff is allowed (even in its pure methods) to use getConfig. I claim that this is safe. I could be wrong; this is only a hand-waving argument. Why?

The first reference in the program to globalConfig is through a writeIORef. This means that at this point globalConfig gets evaluated and thus a ref is created. Immediately we put a value in it.

Now, when doStuff runs, since it is an action run *after* the call to writeIORef, provided that it doesn't also write to 'globalConfig' (which I mentioned in my original message), any call to getConfig is deterministic.

I could be wrong...please correct me if I am.

-- Hal Daume III

G'day all.

On Thu, Sep 26, 2002 at 12:06:36AM +0100, Liyang Hu wrote:
> The problem I'm having is with the preferences: How do I make it available throughout the entire program? (FWIW, most of the work is effectively done inside the IO monad.) I could explicitly pass the record around everywhere, but that seems a trifle inelegant.
>
> My current solution is to use a global ('scuse my terminology, I'm not sure that's the right word to use here) variable of type IORef Config obtained through unsafePerformIO. It works, but strikes me as a rather barbaric solution to a seemingly tame enough problem...

One solution is to do precisely as you suggested, using a state monad to wrap the IORef. For example:

    import Control.Monad.Reader
    import Data.IORef

    type MyIO a = ReaderT (IORef Config) IO a

    main = do
        config <- readConfigurationStuff
        configref <- newIORef config
        runReaderT main' configref

    getConfig :: MyIO Config
    getConfig = do
        configref <- ask
        liftIO (readIORef configref)

    -- Same as above, but you can supply a projection function.
    getsConfig :: (Config -> a) -> MyIO a
    getsConfig f = do
        config <- getConfig
        return (f config)

    -- ...and this is where the code REALLY starts.
    main' :: MyIO ()
    main' = do
        config <- getConfig
        liftIO (putStrLn (show config))
        -- etc

You can wrap whole slabs of existing code in liftIO if it uses IO but does not need to read the configuration.

There's also a much uglier solution which I occasionally use if I need an "ad hoc" global variable. Rather than using IORefs, I use Strings as keys. The code is here:

    http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hfl/hfl/ioext/

Example of use:

    import IOGlobal

    main :: IO ()
    main = do
        writeIOGlobalM "foo" "Foo data"
        writeIOGlobalM "bar" ("Bar", ["data"])
        foo <- readIOGlobalM "foo"
        putStrLn foo
        bar <- readIOGlobalM "bar"
        putStrLn (show (bar :: (String, [String])))

Cheers,
Andrew Bromage
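For a configuration that is read once and never modified, the IORef in the example above may not be needed at all: the reader can carry the record itself. The following is only a minimal sketch, assuming the transformers package that ships with GHC. The Config record, its fields, and the names App, greeting, report, and runApp are hypothetical and chosen for illustration.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, asks, runReaderT)

-- Hypothetical configuration record, for illustration only.
data Config = Config
  { userName :: String
  , verbose  :: Bool
  } deriving (Show, Eq)

-- IO actions that can also read the immutable configuration.
type App a = ReaderT Config IO a

-- Build a greeting from one projected field of the configuration.
greeting :: App String
greeting = do
  name <- asks userName
  return ("Hello, " ++ name)

-- Existing IO code is wrapped in liftIO; the configuration comes from ask.
report :: App ()
report = do
  config <- ask
  msg    <- greeting
  liftIO (putStrLn msg)
  liftIO (print config)

-- Supply the configuration once, at the top of the program.
runApp :: Config -> App a -> IO a
runApp config app = runReaderT app config
```

A program would call something like `runApp config report` from main after parsing the preferences file. The cost is the one Liyang anticipated: code that needs the configuration has to live in App rather than plain IO.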

Hal Daume III suggested:

 | data Configuration = ... -- config data
 |
 | globalConfig :: IORef Configuration
 | globalConfig = unsafePerformIO (newIORef undefined)
 :
 | getConfig :: Configuration
 | getConfig = unsafePerformIO $ readIORef globalConfig
 :
 | main = do
 |   ...read configuration from file...no calls to getConfig...
 |   writeIORef globalConfig configuration
 |   doStuff
 |   return ()

In this case, there is no need to use unsafePerformIO more than once, nor does one need IORefs. Here is how:

    data Configuration = ... -- config data

    getConfig :: Configuration
    getConfig = unsafePerformIO $
      do ...read configuration from file...
         return configuration

    main =
      do doStuff

We know getConfig will only be evaluated once (because of sharing) (*)

Don't use the dirty stuff when you do not have to! :-)

I think GHC even supports a function getArgs which is not in the IO monad since the arguments to a program do not change during a program. If getConfig only depends on the arguments, no unsafePerformIO is necessary at all.

Gofer, and even early versions of Hugs, had a function openFile :: FilePath -> String. The rationale was (I guess) that the contents of a file would not change during the evaluation of a Gofer program.

Regards,
/Koen.

(*) Actually, a Haskell compiler is free to inline these kind of expressions, so really one has to give a NOINLINE pragma to the compiler as well.

--
Koen Claessen
http://www.cs.chalmers.se/~koen
Chalmers University, Gothenburg, Sweden.

Dear all,

At the moment, a discussion on haskell-cafe is going on about how to neatly program the fact that an entire program depends on a number of parameters that are read in once at the beginning of a program.

The suggestion that many people came up with was using unsafePerformIO in the beginning to read the file. Here is my version of that:

 | data Configuration = ... -- config data
 |
 | getConfig :: Configuration
 | getConfig = unsafePerformIO $
 |   do ...read configuration from file...
 |      return configuration
 |
 | main =
 |   do doStuff

It is quite disturbing that there is no other easy way to do this than using unsafePerformIO (except for using implicit parameters perhaps, but there are other reasons for not using those).

I have been thinking a little bit more about this and here is what I found. Remember the Gofer days, when Gofer had a "function":

    openFile :: FilePath -> String

This was of course a cheap and dirty way of implementing things like the getConfig above, but it is impure.

However, one could imagine a functional version of this function:

    readFileOnce :: FilePath -> Maybe String

This function will read the contents of the file (and return Nothing if something went wrong), but it is memoized, so that the second time you use this function you get the same result.

So, it is a pure function. (Admittedly, it is somewhat unpredictable, but you will always get the same result for the same arguments.) It is no more strange than GHC's pure version of the getArgs function (I forgot what it was/is called).

How about space behavior, you say? Reading a file, and memoizing the result means storing the whole contents of the file in memory!

The point is that the use of this function will typically happen at the beginning of a program, when reading the configuration file(s). When all this has happened, the function readFileOnce, and its memo table, will be garbage collected.

(Of course there is no guarantee that all calls to readFileOnce will be evaluated at the beginning of a program, and it is not required, but when you do, there are no space problems.)

There could of course be pure "-Once" versions of other IO operations. Here is a list of possibilities:

  - reading a file
  - getting arguments
  - getting environment variables
  - downloading a webpage
  - ...

What do you think?

Regards,
/Koen.

--
Koen Claessen
http://www.cs.chalmers.se/~koen
Chalmers University, Gothenburg, Sweden.

I just wrote a long and clear answer, but my e-mail client has crashed. I am going to change it (or to rewrite one in Haskell, grrr) but the answer will be shorter, I apologize.

On Wed, 25 Sep 2002 16:34:02 -0700 (PDT), Hal Daume III wrote:
> I don't mean to troll, but this isn't what I meant.

You aren't. I misunderstood you, of course.

> now, we have doStuff :: IO a. doStuff is allowed (even in its pure methods) to use getConfig. I claim that this is safe. I could be wrong; this is only a hand-waving argument. Why?
>
> The first reference in the program to globalConfig is through a writeIORef. This means that at this point globalConfig gets evaluated and thus a ref is created. Immediately we put a value in it.
>
> Now, when doStuff runs, since it is an action run *after* the call to writeIORef, provided that it doesn't also write to 'globalConfig' (which I mentioned in my original message), any call to getConfig is deterministic.

Even if this appears correct, and I feel that it's a technique widely used in the Haskell community, nothing prevents a global optimizer from evaluating all pure functions which do not depend on a value obtained by IO *before the "main" function*. This is also stated in the GHC documentation, about "unsafePerformIO":

"If the I/O computation wrapped in unsafePerformIO performs side effects, then the relative order in which those side effects take place (relative to the main I/O trunk, or other calls to unsafePerformIO) is indeterminate."

If getConfig is evaluated before the main function (nothing prevents it) and if all equal pure expressions are evaluated only once, readConfig will always lead to "undefined" (yes, it will always be evaluated after "globalConfig").

The first idea from Koen Claessen (getConfig operates directly on the file with unsafePerformIO) appears to work, but this time we are *relying* on the fact that the function will be evaluated once, since the file could change and multiple evaluations of readConfig wouldn't lead to the same result. This is not good anyway.

Vincenzo

Nick Name wrote:

 | The first idea from Koen Claessen (getConfig operates
 | directly on the file with unsafePerformIO) appears to
 | work, but this time we are *relying* on the fact that
 | the function will be evaluated once, since the file
 | could change and multiple evaluations of readConfig
 | wouldn't lead to the same result. This is not good
 | anyway.

But Hal Daume III's suggestion has that same problem:

 | data Configuration = ... -- config data
 |
 | globalConfig :: IORef Configuration
 | globalConfig = unsafePerformIO (newIORef undefined)
 :
 | getConfig :: Configuration
 | getConfig = unsafePerformIO $ readIORef globalConfig
 :
 | main = do
 |   ...read configuration from file...no calls to getConfig...
 |   writeIORef globalConfig configuration
 |   doStuff
 |   return ()

Imagine "globalConfig" being evaluated twice! This means there will be *two* IORefs in your program; one might be initialized, and not the other. This is even more disastrous. (To see what could happen: just inline the definition of globalConfig into the two places where it is used.)

This is why one has to be EXTREMELY careful when using these kinds of constructs. Really, only use unsafePerformIO when you know what you are doing, and otherwise, leave it to someone else who can wrap it up into a nice, pure library.

In general, when using unsafePerformIO in this way, one wants to tell the compiler that it is not allowed to inline the expression. This can be done in most compilers by giving compiler pragmas.

/Koen.

--
Koen Claessen
http://www.cs.chalmers.se/~koen
Chalmers University, Gothenburg, Sweden.
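In GHC the pragma in question is NOINLINE. What follows is a minimal sketch of the global-IORef pattern with the pragma in place, assuming GHC. The Configuration record, its field, and the helper names setConfig and getConfigIO are hypothetical, and the reference starts from a default value rather than undefined. The pragma addresses only the duplication problem described above; the concerns about evaluation order raised earlier in the thread still apply to any pure getConfig built on top of it.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical configuration record, for illustration only.
newtype Configuration = Configuration { searchDepth :: Int }
  deriving (Show, Eq)

-- NOINLINE keeps GHC from duplicating this definition at its use sites,
-- so the program has exactly one IORef.
{-# NOINLINE globalConfig #-}
globalConfig :: IORef Configuration
globalConfig = unsafePerformIO (newIORef (Configuration 0))

-- Write the configuration once, early in main, before anything reads it.
setConfig :: Configuration -> IO ()
setConfig = writeIORef globalConfig

-- Reading stays in IO here, so its order relative to setConfig is explicit.
getConfigIO :: IO Configuration
getConfigIO = readIORef globalConfig
```

Keeping the read in IO gives up the convenience of a pure getConfig, but it means the only unsafe step is the creation of the reference itself.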

On Thu, 26 Sep 2002 16:02:01 +0200 (MET DST), Koen Claessen wrote:
> In general, when using unsafePerformIO in this way, one wants to tell the compiler that it is not allowed to inline the expression. This can be done in most compilers by giving compiler pragmas.

If I need a mutable configuration, I prefer to use the IO monad anyway, because one of the reasons I am studying Haskell is to stop having to bother with "inline" and/or order of evaluation.

Vincenzo

--
Faithful to the line, even when it isn't there.
When the emperor is ill, when he dies, or is doubtful, or is perplexed.
Faithful to the line; the line isn't there. [CCCP]

Koen,

> getConfig :: Configuration
> getConfig = unsafePerformIO $
>   do ...read configuration from file...
>      return configuration
>
> (*) Actually, a Haskell compiler is free to inline these kind of expressions, so really one has to give a NOINLINE pragma to the compiler as well.

I'd always avoided this type of thing precisely because of the inline issue, which I don't think my version is in danger of. That's just me, though :).

There is another solution to the problem of configurational parameters. The main part of the solution is portable, does not depend on any pragmas, does not use unsafe operations, does not use implicit parameters, and does not require any modifications to the user code. I must warn that it is also potentially vomit-inducing.

It seems that the problem at hand naturally splits into two phases: building the configuration environment, and executing some code in that environment. The phases are executed sequentially. The facts suggest the use of a SuperMonad. SuperMonad is very well known and often used, even by people who never heard of simpler monads.

The following code is an illustration. Suppose file '/tmp/a.hs' contains the following user code, which is to run within the configuration environment provided by the module Config. For simplicity, our configuration is made of one Int datum, config_item:

    -- File "/tmp/a.hs"
    import Config (config_item)

    foo = "foo shows: " ++ (show config_item)

    bar = "bar shows: " ++ (show config_item)

    main = do
           print foo
           print bar
           print foo

We specifically illustrate the reading of the config item several times.

The following code runs the first phase: it reads the configuration, builds the SuperMonad and runs the SuperMonad.

    import System (system, ExitCode(ExitSuccess))

    myconfig_file = "/tmp/config"

    phaseII_var   = "/tmp/Config.hs"
    phaseII_const = "/tmp/a.hs"

    nl = "\n"

    writeConfig :: Int -> IO ()
    writeConfig num =
          do
          writeFile phaseII_var $
           concat
           ["module Config (config_item) where", nl,
            "config_item =", show num, nl]

    runSuperIO () = system ("echo main | hugs " ++ phaseII_const)
                    >>= \ExitSuccess -> print "Phase II done"

    main = readFile myconfig_file >>= writeConfig . read >>= runSuperIO

I did warn you, didn't I? I have a hunch this solution will work with GHC even better than it works with Hugs. Perhaps we can even play with some dynamic linking tricks (like shared object initializers, etc).

BTW, the solution above is similar in spirit to the following trick in C++:

    Config config;
    int main() { /* pure functional C++ code here -- yes, it exists */ }

The constructor for 'config' is guaranteed to run before main().

Perhaps someone will implement Staged Haskell one day?

Evening all,

Thanks for all the replies and suggestions! They've been extremely helpful.

On Wed, Sep 25, 2002 at 04:34:02PM -0700, Hal Daume III wrote:
> Okay, this is "bad" but I claim it's okay, iff it is used as in:

In the end, I opted for the global IORef through unsafePerformIO scheme; implicit parameters seemed too `bleedin' edge' for me... (that and I had in mind to also retro-fit the idea on top of WASH-CGI[0]; I don't want to make more changes than I absolutely have to.) (I haven't yet read John Hughes' paper on the topic though; that should be quite fun.)

One thing to watch out for though: hGetContents is *not* your friend. At least it isn't if you're going to let it defer its work until you're inside modifyIORef... see the attached prefs.hs, line 28 onwards. Just for reference, I'm using GHC 5.02.2.

I must have gone almost insane trying to figure out why I was still getting the default settings, when every single line of the code seemed to state the contrary; I had already written up prefs.hs to post to the list to ask for help, when I noticed that if I put in the line ``mapM_ putStrLn kvs'' before I update the IORef, I get my new settings. If I leave it out, I get the defaults. Then it hit me: hGetContents was lazily deferring its reads. A quickie alternative hGetLines seems to have fixed the problem.

But I'm still not sure *why* that was a problem in the first place... Can anyone explain? (Looking at the library source didn't help.) Surely modifyIORef would have reduced kvs to head normal form (correct terminology?) just like putStrLn did?

*sigh* This can't be good for my mental health. <g>

Thanks everyone,
/Liyang

BTW: I'm on both the main and -cafe lists; you don't have to CC me. ^_-

[0] http://www.informatik.uni-freiburg.de/~thiemann/WASH/

--
.--{ Liyang HU }--{ http://nerv.cx/ }--{ Caius@Cam }--{ ICQ: 39391385 }--.
| ``Computer games don't affect kids, I mean if Pac Man affected us as   |
| kids, we'd all be running around in darkened rooms, munching pills and |
| listening to repetitive music.''                                       |
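One likely explanation, offered as an assumption since prefs.hs is not shown here: modifyIORef is lazy. It stores an unevaluated application in the IORef and does not force kvs the way putStrLn does, so if the handle is closed before that thunk is demanded, the deferred read has nothing left to return. The sketch below demonstrates only that laziness, contrasting modifyIORef with the strict modifyIORef' found in current versions of Data.IORef (which may not exist in GHC 5.02.2). The two function names are hypothetical.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Data.IORef (newIORef, modifyIORef, modifyIORef', readIORef)

-- Returns True if the error only fires when the stored value is forced,
-- i.e. modifyIORef itself did not evaluate anything.
lazyUpdateDefers :: IO Bool
lazyUpdateDefers = do
  ref <- newIORef (0 :: Int)
  modifyIORef ref (const (error "forced"))   -- succeeds: only a thunk is stored
  value  <- readIORef ref                    -- still unevaluated at this point
  result <- try (evaluate value) :: IO (Either ErrorCall Int)
  return (either (const True) (const False) result)

-- Returns True if the strict variant raises the error during the update itself.
strictUpdateForces :: IO Bool
strictUpdateForces = do
  ref    <- newIORef (0 :: Int)
  result <- try (modifyIORef' ref (const (error "forced"))) :: IO (Either ErrorCall ())
  return (either (const True) (const False) result)
```

Under this reading, forcing the parsed settings before closing the handle (as the mapM_ putStrLn line happened to do) or using a strict update would both avoid the surprise.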

Andrew J Bromage wrote:
> There's also a much uglier solution which I occasionally use if I need an "ad hoc" global variable. Rather than using IORefs, I use Strings as keys. The code is here:
>
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hfl/hfl/ioext/

I'm not sure why you consider the code you refer to above so ugly. In any case, I have a question and a comment on it.

Question: Why do you use `seq` on `globalTableRef`?

Comment: You use `addToFM` to replace entries in your table. Without additional logic to increase strictness, I think you unnecessarily risk stack overflow. Consider the case where `writeIOGlobal` is used many times on a global between uses of `readIOGlobal` on that global.

The above issue raises a general question: How should strictness best be achieved with `addToFM`?

-- Dean

G'day all.

On Fri, Sep 27, 2002 at 12:56:38PM -0400, Dean Herington wrote:
> I'm not sure why you consider the code you refer to above so ugly.

Anything which relies on unsafePerformIO (or seq, for that matter) is ugly. Personal opinion, of course. :-)

> Question: Why do you use `seq` on `globalTableRef`?

Good question. It's actually a form of documentation. I wasn't 100% sure how concurrency and CAFs interact at the time (and I'm still not), so I left that in as a sort of note to myself to check this out. Admittedly a comment would have been clearer. :-)

> You use `addToFM` to replace entries in your table. Without additional logic to increase strictness, I think you unnecessarily risk stack overflow.

That's true, although the case of many writes followed by a single read I would expect to be rare in practice. Besides, IOGlobal is not designed for performance. It's designed for quick hacks.

Cheers,
Andrew Bromage
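Dean's strictness question can be sketched with the modern containers API. This is not the IOGlobal code: it assumes Data.Map.Strict in place of the FiniteMap module that provides addToFM, and the Table type and function names are hypothetical.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical global-style table keyed by String, holding Int values.
type Table = IORef (Map.Map String Int)

newTable :: IO Table
newTable = newIORef Map.empty

-- modifyIORef' forces the new map on every write, and Data.Map.Strict.insert
-- forces the inserted value, so no chain of pending updates builds up.
writeEntry :: Table -> String -> Int -> IO ()
writeEntry table key value = modifyIORef' table (Map.insert key value)

-- Look up a key in the current table.
readEntry :: Table -> String -> IO (Maybe Int)
readEntry table key = fmap (Map.lookup key) (readIORef table)
```

With a lazy update, many writes between reads would leave a long chain of suspended insertions to unwind at the first read, which is the stack-overflow risk Dean describes.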

This message seems to have been lost and I'd like to try to breathe some life into it.

First, a question: could such "readFilePure" functions be implemented on TOP of the current IO module (perhaps in IO.Pure or something)? Of course, one could do something like:

    readFileOnce :: FilePath -> Maybe String
    readFileOnce = unsafePerformIO ....
    {-# NOINLINE readFileOnce #-}

but this is the sort of thing we're trying to get away from anyway. There doesn't (to me, at least) seem to be an obvious way to do this. It seems to be the sort of thing that requires compiler support. In this case, do any of the compiler implementers have a heart to tackle such a thing?

On the other hand, if there's a way to do it on top of what already exists, I would be more than happy to implement it if someone were to point me in the right direction...

> The point is that the use of this function will typically happen at the beginning of a program, when reading the configuration file(s). When all this has happened, the function readFileOnce, and its memo table, will be garbage collected.

I like this, and it works for configuration files, but I have another problem I would like to solve with this whole ...Once business which does not fit into this model.

I have a large database-like file which essentially contains an index at the beginning. When you want to look up something, you binary search for the term in the index, find the position of the entity you want, seek to that location and then read a specified amount.

The way I have this currently set up is that everything in my program is embedded in the IO monad because

  1) the database is huge and I cannot store it all in memory
  2) usually only about 100 out of 250000 entries are queried per run, but which entries these are changes from run to run

Unfortunately, this means all my functions are monadic. However, there's no reason for them to be (in a sense): they are perfectly pure. In fact, I don't even have write access to the database :), but no one would ever change it anyway.

So while I like the 'readFileOnce' and variants, I think that if someone is serious about this '...Once' stuff, we should have more or less the entire reading portion of the IO library in pure format for cases like this.

Thoughts?

- Hal
participants (8)
- Andrew J Bromage
- Dean Herington
- Hal Daume III
- Jorge Adriano
- Koen Claessen
- Liyang Hu
- Nick Name
- oleg@pobox.com