implementation of file-related modules

Greetings,
I've been looking over some of the file-related modules such as System.Directory and System.Posix.{Directory,Files}, and I have some questions.
* Excuse my ignorance of Windows, but I'm assuming that the System.Posix module does not exist on Windows? Is Windows the only system where this is the case?
* I've noticed that System.Directory.renameFile behaves differently between ghci, nhc, and hugs. In particular, ghc is the only one who complains when it's a directory. Is there anything evil about using renameFile to rename a directory? It would be nice if they all acted the same, particularly since renameDirectory is conveniently located in the same module.
* Speaking of which, in System.Posix, the Directory and File operations are broken up into separate modules. It is a little odd to have file-related functions in System.Directory. Should they be moved?
* Some languages have a means of building paths in a portable way. It would be nice if we had access to a file separator (like "/" in Unix and "\" in Windows). These would probably belong in the System.Directory module? Java (for instance) allows access to these things through the java.lang.System.getProperty function. That function takes a string and returns a string:
"path.separator" Path separator (for example, ":")
"file.separator" File separator (for example, "/")
"user.home" User home directory
"user.name" User account name
// might come in useful:
"os.arch" Operating system architecture
"os.name" Operating system name
"os.version" Operating system version
Now using getProperty for this makes sense in Java, but for Haskell, I would expect to have functions for these. I'm not sure where all of them would go. Is there any roadblock to implementing something like this?
* Another item that would be useful in the System.Directory module would be some kind of config file path. On Debian, that would be "/etc", on some systems, it might tend to be more often "/usr/local/etc". (I'm not sure if there's a good way to abstract the difference between a user config file and a system config file. User config files tend to be in ~/ and start with a dot whereas system config files end up in /etc, and don't start with a dot.)
These last few points are rather related in that they are both dependent on particular operating systems, and a means of abstracting away system-specific details in order to write more portable code. Are any of these good ideas?
peace,
isaac
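As a rough illustration of "functions instead of getProperty strings", here is a minimal sketch. System.Info.os, System.Info.arch and System.Environment.lookupEnv are existing functions in base; the names osName, osArch, getUserHome and getUserName are made up for this example, and reading HOME and USER is a Unix convention assumed here, not something any library in this thread promises.

```haskell
import System.Environment (lookupEnv)
import System.Info (arch, os)

-- Typed counterparts of Java's "os.name" and "os.arch" properties.
-- Both are plain constants, so no IO is needed to read them.
osName :: String
osName = os

osArch :: String
osArch = arch

-- Counterpart of "user.home". Assumption: the HOME environment variable
-- holds the home directory (true on typical Unix systems only).
getUserHome :: IO (Maybe FilePath)
getUserHome = lookupEnv "HOME"

-- Counterpart of "user.name". Assumption: the USER environment variable
-- holds the account name (again a Unix convention).
getUserName :: IO (Maybe String)
getUserName = lookupEnv "USER"
```

Compared with a string-keyed lookup, a misspelled name here is a compile-time error, and the Maybe results make "not available on this system" explicit.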

For the last, see:
http://www.haskell.org/pipermail/haskell/2003-July/012314.html
and replies.
I still want *something* rather than nothing. Nice would even be:
makePath :: [FilePath] -> IO FilePath
which concats the directories and puts the appropriate separator between them.
On Wed, 2003-10-08 at 20:18, Isaac Jones wrote:
[...]
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries
--
Hal Daume III | hdaume@isi.edu
"Arrest this man, he talks in maths." | www.isi.edu/~hdaume

Hal Daume III
For the last, see:
http://www.haskell.org/pipermail/haskell/2003-July/012314.html
and replies.
I forgot about this thread. I knew I'd been thinking about this for a few months, now I know why ;)
I still want *something* rather than nothing. Nice would even be:
makePath :: [FilePath] -> IO FilePath
And likewise, I think you'd want a way to parse a path into its components.
About the drives in Windows, Java has a function somewhere called "get roots" which on Unix will give you "/" and on Windows will give you "C:", "D:", etc. This makes sense to me. But about the case sensitivity, I have no idea.
I agree that something would be better than nothing.
peace,
isaac
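To make the makePath idea and its inverse concrete, here is a small sketch. It is not Hal's proposed function: it is a pure variant that takes the separator as an explicit argument (an assumption of this example, so no platform lookup or IO is involved), and the names makePathWith and splitPathWith are hypothetical.

```haskell
import Data.List (intercalate)

-- Join path components, putting the given separator between them.
makePathWith :: Char -> [FilePath] -> FilePath
makePathWith sep = intercalate [sep]

-- Split a path into its components at each separator. Empty components
-- are kept, so a leading separator shows up as an initial "" and
-- makePathWith sep (splitPathWith sep p) gives back p.
splitPathWith :: Char -> FilePath -> [FilePath]
splitPathWith sep path =
  case break (== sep) path of
    (component, [])       -> [component]
    (component, _ : rest) -> component : splitPathWith sep rest
```

For example, makePathWith '/' ["usr", "local", "etc"] is "usr/local/etc". Drive letters and case sensitivity are deliberately not handled.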

* Some languages have a means of building paths in a portable way.
The last discussion suggested that we might tackle this using the hierarchical module structure. That is, there would be System.Directory.Unix, System.Directory.Windows, System.Directory.Mac, etc. and then System.Directory.Native would be a redirection module that points to the appropriate module for that system.
The Java approach you describe uses configuration files instead of modules.
In favour of the module-based approach, there is stronger typechecking, greater flexibility (do we need this?) and the ability to manipulate filenames from multiple platforms at once (useful if you need to convert filenames from one platform to another).
In favour of the configuration file approach, there's the potential for shipping portable bytecode files from one platform to another and it can probably be tweaked to support multiple platforms at once by providing a 'Configuration' abstract data type.
I think that if we had a configuration file, we would want to treat its contents as runtime constants so that operations which depend on configuration constants don't need to be in the IO monad. And if we wanted a more general configuration file mechanism, we would do well to provide hooks so that embedded systems written in Haskell don't find themselves needing a filesystem just so that they have somewhere to read the configuration file from. (This is only really relevant if configuration files are used to configure more than just System.Directory since most programs that manipulate filenames tend to need a filesystem anyway.)
These last few points are rather related in that they are both dependent on particular operating systems, and a means of abstracting away system-specific details in order to write more portable code. Are any of these good ideas?
I think so, yes. A good library of filename manipulating functions would be useful.
Following Java's lead and using configuration files to localize a system would potentially allow us to ship portable bytecode files around from one platform to another.
A general mechanism for accessing configuration strings would be useful.
--
Alastair Reid www.haskell-consulting.com
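One way to picture the 'Configuration' abstract data type alongside a "Native" redirection is to carry each platform's conventions as an ordinary value. The sketch below is only an illustration: PathStyle, unixStyle, windowsStyle, nativeStyle and convertPath are invented names, and choosing the native style by comparing System.Info.os with "mingw32" (the string GHC reports on Windows) is an assumption about the implementation in use.

```haskell
import System.Info (os)

-- Per-platform filename conventions as a value rather than a module.
data PathStyle = PathStyle
  { styleName     :: String
  , fileSeparator :: Char   -- separates components within one path
  }

unixStyle, windowsStyle :: PathStyle
unixStyle    = PathStyle "unix"    '/'
windowsStyle = PathStyle "windows" '\\'

-- Stand-in for System.Directory.Native: picks one of the styles above.
-- A pure constant, so code depending on it stays out of the IO monad.
nativeStyle :: PathStyle
nativeStyle
  | os == "mingw32" = windowsStyle
  | otherwise       = unixStyle

-- Convert a path between styles by swapping separators. Deliberately
-- naive: drive letters and characters that are illegal on the target
-- platform are not handled.
convertPath :: PathStyle -> PathStyle -> FilePath -> FilePath
convertPath from to = map swap
  where
    swap c
      | c == fileSeparator from = fileSeparator to
      | otherwise               = c
```

Because both styles are available at once, a program running on Unix can still produce Windows filenames, which is the "multiple platforms at once" case mentioned above.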

Am Donnerstag, 9. Oktober 2003, 10:24 schrieb Alastair Reid:
[...]
Following Java's lead and using configuration files to localize a system would potentially allow us to ship portable bytecode files around from one platform to another.
Do we really need configuration files for this? Wouldn't the same be possible with a *.Native redirection module?
In general, I would be very careful with following Java because, in my opinion, Java solutions are often not very nice (consider, e.g., Java's type system prior to J2SE 1.5).
A general mechanism for accessing configuration strings would be useful.
I would strongly favour mechanisms which give us type safety instead of mechanisms just using strings.
[...]
Wolfgang

Alastair Reid
Following Java's lead and using configuration files to localize a system would potentially allow us to ship portable bytecode files around from one platform to another.
Actually, just delaying binding to System.Directory.Native would be enough (and I bet in such a system we'd want to delay binding to the prelude).
A general mechanism for accessing configuration strings would be useful.
Anyone taking up this task would get to choose a format. A chance to foist your aesthetics on the community! Surely that's irresistible to *somebody*.
-Jan-Willem Maessen
jmaessen@alum.mit.edu

Jan-Willem Maessen writes:
Alastair Reid writes:
Following Java's lead and using configuration files to localize a system would potentially allow us to ship portable bytecode files around from one platform to another.
Actually, just delaying binding to System.Directory.Native would be enough (and I bet in such a system we'd want to delay binding to the prelude).
So how would this work? Some #ifdef sections in the implementations that bind System.Directory.Native, depending on which platform we're on? Would this be very hard? Are there more complex issues that I'm not seeing?
A general mechanism for accessing configuration strings would be useful.
Anyone taking up this task would get to choose a format. A chance to foist your aesthetics on the community! Surely that's irresistible to *somebody*.
I like the idea of type safety, but the configuration strings are probably still useful for some things. How might this look?
- A constant CONFDIR (#ifdef'd into the implementations) telling where to find configuration files (a kind of bootstrapping)
- A configuration file in CONFDIR with (var, val) tuples
- The runtime system reads this file at startup?
The virtue of this is that there is less work for the Haskell implementors if there's just one #ifdef and everything else comes in as strings. Reading the file at runtime, however, seems a little strange since lots of the stuff we're talking about is actually known when you build the compiler.
OTOH, maybe the work of having #ifdefs in the implementations isn't that hard? Obviously, there's platform-dependent configuration going on, but is there any going on that is eventually made available as constants or functions? Is there an implementation-neutral way to implement this?
peace,
isaac
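The "(var, val) tuples read at startup" idea might look roughly like this. Nobody in the thread picked a file format, so the "var = val" line syntax below is purely an assumption, as are the names Config, parseConfig and readConfig.

```haskell
import Data.Char (isSpace)
import Data.Maybe (mapMaybe)

-- The configuration as a simple association list of (var, val) pairs.
type Config = [(String, String)]

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropEnd . dropWhile isSpace
  where dropEnd = reverse . dropWhile isSpace . reverse

-- Parse one line of the assumed form "var = val". Lines without an '='
-- or with an empty variable name are rejected.
parseLine :: String -> Maybe (String, String)
parseLine line =
  case break (== '=') line of
    (var, '=' : val) | not (null (trim var)) -> Just (trim var, trim val)
    _                                        -> Nothing

-- Parse a whole file's contents, silently skipping unrecognised lines.
parseConfig :: String -> Config
parseConfig = mapMaybe parseLine . lines

-- Read and parse the file; path would be built from the CONFDIR constant.
readConfig :: FilePath -> IO Config
readConfig path = fmap parseConfig (readFile path)
```

A value is then fetched with the Prelude's lookup, e.g. lookup "file.separator" config, which returns a Maybe String. That every result is an untyped String is exactly the type-safety concern raised earlier in the thread.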

I like the idea of type safety, but the configuration strings are probably still useful for some things. How might this look?
- A constant CONFDIR (#ifdef'd into the implementations) telling where to find configuration files (a kind of bootstrapping)
What are those config strings really for? File path delimiters aren't something that should have to be specified in a separate "config file".
As a long-time Mac user and victim of Apple's brainwashing, I hate config files, so I'm always sceptical about proposed schemes of config files. (Of course, I don't want to veto anything, I just want to dissuade people from inventing config file schemes "just in case we need them one day").
Cheers,
Wolfgang

Wolfgang Thaller
As a long-time Mac user and victim of Apple's brainwashing, I hate config files, so I'm always sceptical about proposed schemes of config files. (Of course, I don't want to veto anything, I just want to dissuade people from inventing config file schemes "just in case we need them one day").
I can agree with that. I'm more-or-less trying to ask what each solution _would_ look like. I feel that the Library Infrastructure Project will be running into a lot of these platform-dependent variables, and I want advice for a solution (and would be happy to see the solution itself too!).
peace,
isaac

Alastair Reid
In favour of the configuration file approach, there's the potential for shipping portable bytecode files from one platform to another and it can probably be tweaked to support multiple platforms at once by providing a 'Configuration' abstract data type.
I was thinking about this a little more, and I came up with a small argument against it (or at least part of it):
For something like path separators, it seems fine to have Directory.Windows, Directory.Unix, and Directory.Mac, then have Directory.Native bind at compile time.
But for something like "configuration file directory", which might be "/etc" on Debian, "/usr/local/etc" on FreeBSD, a wide variety of things on various Windows configurations (at least for user configuration files, not sure about system config files), and "/Library/Preferences" on MacOS X, this solution starts to look a little uglier since it requires a module for each platform, which somewhat randomly pollutes the namespace for something like Directory. It's even worse if you assume that we'll run into other modules where this matters.
What if we had a module for platform-specific things and submodules for each platform? This would keep all of the platform-specific details localized and easier to maintain.
So perhaps we could have
Platform.{Debian,FreeBSD,MacOSX,Windows98,Windows2000}.{configDirectory,fileSeparator,etc}
or:
Platform.{Debian,FreeBSD,MacOSX,Windows98,Windows2000}.Directory.{configDirectory,fileSeparator,etc}
and
Platform.Native.{configDirectory,fileSeparator,etc}
or
Platform.Directory.{configDirectory,fileSeparator,etc}
which is bound as above.
But that sorta stinks because fileSeparator probably belongs in the Directory module. I say that's OK; we can still have:
Directory.fileSeparator = Platform.Native.fileSeparator
So this still lets you get to the platform-specific stuff, but doesn't clutter up random parts of the module hierarchy with platform-specific modules. It lets you treat Directory.fileSeparator in a very natural way for most uses.
Opinions?
peace,
isaac
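The compile-time binding behind Directory.fileSeparator = Platform.Native.fileSeparator can be sketched in a single file with the C preprocessor. This is GHC-specific and only an illustration: mingw32_HOST_OS is the macro GHC defines when targeting Windows, other implementations would need their own test, and the names nativeFileSeparator and joinNative are hypothetical.

```haskell
{-# LANGUAGE CPP #-}
import Data.List (intercalate)

-- Hypothetical contents of Platform.Native: the value is fixed when the
-- program is compiled, so it is an ordinary constant and needs no IO.
nativeFileSeparator :: Char
#if defined(mingw32_HOST_OS)
nativeFileSeparator = '\\'
#else
nativeFileSeparator = '/'
#endif

-- Hypothetical Directory.fileSeparator: a plain alias, so most code never
-- mentions the platform-specific module at all.
fileSeparator :: Char
fileSeparator = nativeFileSeparator

-- Example client: join components with whatever the native separator is.
joinNative :: [FilePath] -> FilePath
joinNative = intercalate [fileSeparator]
```

Only the #if lives in platform-specific code; everything that imports fileSeparator is written once.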

For something like path separators, it seems fine to have Directory.Windows, Directory.Unix, and Directory.Mac, then have Directory.Native bind at compile time.
Note that it is (occasionally) useful to use Directory.Windows on a program running on a Unix box. One example would be a CVS-like application which has to translate filenames from one format to another.
So perhaps we could have
Platform.{Debian,FreeBSD,MacOSX,Windows98,Windows2000}. {configDirectory,fileSeparator,etc}
I find it harder to think why I would want to know where Windows machines store their configuration files.
So I think there is value in having multiple versions of file-related modules available in each system but I'm much less convinced that having multiple configuration file locations available is useful.
--
Alastair

Platform.{Debian,FreeBSD,MacOSX,Windows98,Windows2000}. {configDirectory,fileSeparator,etc}
I find it harder to think why I would want to know where Windows machines store their configuration files.
Furthermore, the location isn't fixed on Windows: it might be C:\WINDOWS\System (the most usual), or D:\WINDOWS\System (also possible), or <arbitrary-windows-path>\System (because the user can specify anything they like at install time).
So config-file location has to be per machine, not per OS.
Which suggests not putting it in the module hierarchy... Platform.UK.AC.CAM.CL.GLIA-NT.configDirectory, anyone? :)
--KW 8-)

Keith Wansbrough wrote:
Platform.{Debian,FreeBSD,MacOSX,Windows98,Windows2000}. {configDirectory,fileSeparator,etc}
I find it harder to think why I would want to know where Windows machines store their configuration files.
Furthermore, the location isn't fixed on Windows: it might be C:\WINDOWS\System (the most usual), or D:\WINDOWS\System (also possible), or <arbitrary-windows-path>\System (because the user can specify anything they like at install time).
So config-file location has to be per machine, not per OS.
Which suggests not putting it in the module hierarchy... Platform.UK.AC.CAM.CL.GLIA-NT.configDirectory, anyone? :)
I think it is a platform-dependent IO action:
getConfigDirectory :: IO String
This might simply be
getConfigDirectory = return "/etc/"
... but on Windows, where the directories aren't fixed, and on Mac OS, where the directories are fixed but developers are told not to rely on that fact, we can use OS-specific functions for finding the correct location.
(But is \Windows\System really for config files? I thought DLLs and drivers go there... but then I don't use Windows much.)
Cheers,
Wolfgang

(But is \Windows\System really for config files? I thought DLLs and drivers go there... but then I don't use Windows much.)
Well, in directories around there. Of course, there are various other
places too. But %windir%\system32\drivers\etc\ is where at least some
config files go.
--KW 8-)
--
Keith Wansbrough
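Putting Wolfgang's getConfigDirectory together with the %windir% location mentioned here, a sketch might read as follows. Treat every detail as an assumption: reading the windir environment variable stands in for a proper OS-specific call, the "C:\WINDOWS" fallback is a guess for when the variable is unset, and comparing System.Info.os with "mingw32" is how GHC identifies Windows.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import System.Info (os)

-- Platform-dependent IO action returning the system configuration
-- directory. It lives in IO because, on Windows, the answer depends on
-- the machine and not just on the operating system.
getConfigDirectory :: IO FilePath
getConfigDirectory
  | os == "mingw32" = do
      -- Assumption: %windir% names the Windows directory; fall back to
      -- a conventional default if it is not set.
      windir <- lookupEnv "windir"
      return (fromMaybe "C:\\WINDOWS" windir ++ "\\system32\\drivers\\etc\\")
  | otherwise = return "/etc/"
```

On Mac OS the branch would presumably call whatever the system provides for locating preference folders; that is left out because no such function is named in this thread.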

Isaac Jones
* Some languages have a means of building paths in a portable way. It would be nice if we had access to a file separator (like "/" in Unix and "\" in Windows).
If this is really necessary, I'd prefer it if it was taken care of "under the hood". FilePath could probably be a more complex data type (instead of the current "String"), perhaps dealt with as a list/tuple of components.
"path.separator" Path separator (for example, ":")
I assume this is used to separate different paths when packing them in a string. This seems very application dependent, I'm not sure I see the utility of it.
"file.separator" File separator (for example, "/")
Necessary if you need to construct file paths as strings. Let's try to avoid a tangle of "...++System.FilePath.getFileSeparator++..." constructs in code, though. I think we either should standardize the string format, or design a new data type with a simple interface to access it. How are the languages with more general facilities doing this? How often is it useful?
"user.home" User home directory "user.name" User account name
Nice to have, I think. Simple to get from the environment, but having standard wrappers would probably be a good idea. Make a module, (System.Info?) with the appropriate functions?
* Another item that would be useful in the System.Directory module would be some kind of config file path.
This would be application -- and installation -- dependent, I'm sure?
On Debian, that would be "/etc", on some systems, it might tend to be more often "/usr/local/etc".
I mean, if I compile an application myself, I use ./configure --prefix=/usr/local, and want it to keep its system-wide config in /usr/local/etc. Installing from a .deb would put config in /etc, installing in my home directory would do something else again.
User config files tend to be in ~/ and start with a dot whereas system config files end up in /etc, and don't start with a dot.)
In sum, this isn't something to standardize in the language, let the application writers decide for themselves.
-kzm
--
If I haven't seen further, it is by standing in the footprints of giants
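The suggestion that FilePath could be a richer data type, handled as a list of components, can be sketched like this. The type Path, the operator </> as defined here, and renderPath are all invented for illustration and are not part of any library discussed in this thread.

```haskell
import Data.List (intercalate)

-- A path as structured data rather than a raw string.
data Path = Path
  { pathIsAbsolute :: Bool      -- does it start at the filesystem root?
  , pathComponents :: [String]  -- directory and file names, outermost first
  } deriving (Eq, Show)

-- Starting points for building paths.
root, relative :: Path
root     = Path True  []
relative = Path False []

-- Append one more component.
infixl 5 </>
(</>) :: Path -> String -> Path
Path isAbs cs </> c = Path isAbs (cs ++ [c])

-- The separator appears only here, at the point where a string is
-- finally needed (for example to hand to the operating system).
renderPath :: Char -> Path -> String
renderPath sep (Path isAbs cs) =
  (if isAbs then [sep] else "") ++ intercalate [sep] cs
```

Client code then writes root </> "etc" </> "passwd" and never concatenates separators by hand; renderPath '/' turns that value into "/etc/passwd". Windows drive letters are not modelled.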

On 2003-10-09 at 10:42+0200 ketil+haskell@ii.uib.no wrote:
Isaac Jones writes:
* Some languages have a means of building paths in a portable way. It would be nice if we had access to a file separator (like "/" in Unix and "\" in Windows).
If this is really necessary, I'd prefer it if it was taken care of "under the hood".
Why can't we just adopt URL (is that the right one or is it URI or URN) notation? Ideally we'd have something that handled all the possible protocols, but using the post ":" part of file://whatever would cover the syntactic differences mentioned above.
--
Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

On Wed, Oct 08, 2003 at 11:18:15PM -0400, Isaac Jones wrote:
* I've noticed that System.Directory.renameFile behaves differently between ghci, nhc, and hugs. In particular, ghc is the only one who complains when it's a directory.
This function is also Directory.renameFile, for which the Report agrees with ghc (except for the exception thrown -- H98 has a limited choice). Hugs will conform in future. There's a minor bug in the GHC version: any errors are tagged with the first argument (ditto for renameDirectory).
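Until the implementations agree, application code that wants renameFile to reject directories everywhere could wrap it. The following is a sketch only: renameFileStrict is a hypothetical name, and raising an illegal-operation error is this example's own choice, not the error the Report or any implementation specifies. System.Directory comes from the directory package.

```haskell
import System.Directory (doesDirectoryExist, renameFile)
import System.IO.Error

-- Rename a file, refusing up front if the source is a directory, so the
-- behaviour does not depend on what the underlying renameFile does.
renameFileStrict :: FilePath -> FilePath -> IO ()
renameFileStrict from to = do
  isDir <- doesDirectoryExist from
  if isDir
    then ioError (mkIOError illegalOperationErrorType
                            "renameFileStrict" Nothing (Just from))
    else renameFile from to
```

The error is tagged with the source path only; a fuller version might mention the destination as well.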
participants (10)
- Alastair Reid
- Hal Daume III
- Isaac Jones
- Jan-Willem Maessen
- Jon Fairbairn
- Keith Wansbrough
- ketil+haskell@ii.uib.no
- Ross Paterson
- Wolfgang Jeltsch
- Wolfgang Thaller