
[moved from cafe to libraries]
David Roundy
[darcs] contains a few bits of code that you might find useful, such as an interface to libcurl (for lazily reading files over http or ftp)
This sounds like it would be a valuable contribution all by itself. Could you make it available as a package with its own (copy of the) license? A very brief look at the existing network support suggests that Network/C_URL might be a good place to put it. A quick glance at libcurl documentation suggests that it is quite portable (unix and win32).
darcs is currently unix-only (counting MacOS X as unix). A port to windows would be a fair amount of work, but probably would mostly be straightforward (mostly dealing with slashes versus backslashes and getting libcurl to work under windows).
This reminds me of a library I have been wanting for a while: functions for manipulating filenames in a portable way. The main requirement is that you should be able to manipulate a filepath as though it was structured something like so:
(Maybe [Directory], Maybe Filename, Maybe Suffix)
For example, on a Unix system, /usr/lib/libcurl.so would be treated something like this:
(Just ["/","usr","lib"], "libcurl", Just "so")
There's an awful lot of code out there that assumes that you can find the suffix of a filename using
dropWhile (/='.')
or
reverse . takeWhile (/='.') . reverse
(try them on "/foo.bar/baz.exe" and "/foo.bar/baz") and there is (presumably) much reinvention of the wheel when people decide to implement functions which really work.
Oh yes, I should say that 'darcs' sounds quite cool as well...
--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/
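A minimal sketch of the pitfall described in the message above. The two naive functions are the expressions quoted there; suffixOf is a hypothetical helper (not from any existing library) that assumes '/' is the only directory separator and looks for the suffix in the final path component only.

```haskell
-- The two naive suffix finders quoted in the message.
naiveSuffix1 :: String -> String
naiveSuffix1 = dropWhile (/= '.')

naiveSuffix2 :: String -> String
naiveSuffix2 = reverse . takeWhile (/= '.') . reverse

-- Hypothetical helper: search for the suffix only in the final path
-- component. Assumes Unix-style '/' separators. A dotfile such as
-- ".bashrc" is reported as having the suffix "bashrc", which a real
-- library would probably want to treat specially.
suffixOf :: String -> Maybe String
suffixOf path =
  case break (== '.') (reverse lastComponent) of
    (_, [])            -> Nothing                  -- no dot in the final component
    (revSuffix, _ : _) -> Just (reverse revSuffix) -- text after the last dot
  where
    lastComponent = reverse (takeWhile (/= '/') (reverse path))
```

In GHCi, naiveSuffix1 "/foo.bar/baz.exe" gives ".bar/baz.exe" and naiveSuffix2 "/foo.bar/baz" gives "bar/baz", whereas suffixOf returns Just "exe" and Nothing for the same two inputs. It also distinguishes "foo." (Just "") from "foo" (Nothing).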

On Wed, Apr 09, 2003 at 04:02:36PM +0100, Alastair Reid wrote:
[moved from cafe to libraries]
David Roundy writes:
[darcs] contains a few bits of code that you might find useful, such as an interface to libcurl (for lazily reading files over http or ftp)
This sounds like it would be a valuable contribution all by itself. Could you make it available as a package with its own (copy of the) license? A very brief look at the existing network support suggests that Network/C_URL might be a good place to put it.
I'd be happy to package the curl stuff as a library, except that I am not sure my current code is portable beyond POSIX systems. And also, it's such a small amount of code (about 150 lines), that it seems weird to package it separately...
A quick glance at libcurl documentation suggests that it is quite portable (unix and win32).
Indeed, libcurl is nicely portable, but I doubt my code to lazily link to the libcurl library is portable. I use pthread_create and pipe to let curl do the downloading asynchronously... I'm not sure how portable either of those system calls are. If pthread_create is a problem, I could always use fork() instead, if that is more portable, but I have no idea whether that might be the case. pipe() I don't see how to get around, and I have no idea if it is supported on windows. Any idea if these are available on windows?
darcs is currently unix-only (counting MacOS X as unix). A port to windows would be a fair amount of work, but probably would mostly be straightforward (mostly dealing with slashes versus backslashes and getting libcurl to work under windows).
This reminds me of a library I have been wanting for a while: functions for manipulating filenames in a portable way. The main requirement is that you should be able to manipulate a filepath as though it was structured something like so:
(Maybe [Directory], Maybe Filename, Maybe Suffix)
For example, on a Unix system, /usr/lib/libcurl.so would be treated something like this:
(Just ["/","usr","lib"], "libcurl", Just "so")
For my purposes, I think a simpler system would be acceptable (partly because I don't deal with filename suffixes), which would simply swap between a "canonical" pathname (which might be user defined, or could just be required to be unix-like) and a "local" pathname. I imagine something like (using a regexp paraphrase a la perl)
canonical_to_local = s/\//\\/g
(this is on windows, on unix it would be identity) and
readFileCanonical = readFile . canonical_to_local
so that replacing readFile with readFileCanonical would magically make your program accept unix-style filenames rather than windows ones. Of course, we'd also need to deal with unix filenames that have backslashes in them, so it wouldn't be this simple...
... and there is (presumably) much reinvention of the wheel when people decide to implement functions which really work.
If I ever get around to porting to windows (which I'm unlikely to do myself, since I don't have windows, and don't relish the idea of trying to set up a mingw ghc cross-compiler) I'll see if I can figure out a nice way to package the filename code... :)
Oh yes, I should say that 'darcs' sounds quite cool as well...
Thanks!
--
David Roundy http://civet.berkeley.edu/droundy/
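The canonical/local idea from the message above could be sketched in Haskell roughly as follows. This is only an illustration under stated assumptions: the Platform type and the "On" variant are hypothetical names introduced here so the conversion can be tried on any machine, and detecting Windows by comparing System.Info.os with "mingw32" reflects how GHC reports Windows builds. As the message notes, Unix filenames that contain backslashes are not handled.

```haskell
import System.Info (os)

-- Hypothetical type naming the two filename conventions discussed.
data Platform = Unix | Windows deriving (Eq, Show)

-- Assumption: GHC on Windows reports its OS as "mingw32".
localPlatform :: Platform
localPlatform = if os == "mingw32" then Windows else Unix

-- Equivalent of the s/\//\\/g paraphrase: on Windows every '/'
-- becomes '\'; on Unix the canonical name is already the local one.
canonicalToLocalOn :: Platform -> FilePath -> FilePath
canonicalToLocalOn Windows = map (\c -> if c == '/' then '\\' else c)
canonicalToLocalOn Unix    = id

canonicalToLocal :: FilePath -> FilePath
canonicalToLocal = canonicalToLocalOn localPlatform

-- Drop-in replacement for readFile that accepts canonical names.
readFileCanonical :: FilePath -> IO String
readFileCanonical = readFile . canonicalToLocal
```

For example, canonicalToLocalOn Windows "src/Main.hs" yields the string src\Main.hs, while canonicalToLocalOn Unix leaves its argument unchanged.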

David Roundy
Indeed, libcurl is nicely portable, but I doubt my code to lazily link to the libcurl library is portable. I use pthread_create and pipe to let curl do the downloading asynchronously... I'm not sure how portable either of those system calls are.
Anything to do with creating threads or processes is likely to have porting problems in my experience. I had a quick glance at the libcurl overview http://pinkstuff.publication.org.uk/cgi-bin/man2html?libcurl+5#lbAF and thought it could be used portably. Seems like I missed something.
Alastair:
This reminds me of a library I have been wanting for a while [...]
David:
For my purposes, I think a simpler system would be acceptable [...]
Ah, you actually want something different from what I proposed. (I thought of the possibility as I was writing it but dismissed it as being of little use - your example shows exactly when it is useful. The real world is always trickier than you like to think...)
What I proposed was that the library would parse filenames according to the local conventions on your machine - win32, unix, macintosh, etc.
What you need is a set of libraries to parse filenames according to a set of conventions. For example, you might need to convert win32 filenames to unix filenames and back again.
If I ever get around to porting to windows (which I'm unlikely to do myself, since I don't have windows, and don't relish the idea of trying to set up a mingw ghc cross-compiler) I'll see if I can figure out a nice way to package the filename code... :)
I can't help you gain access to windows but Hugs is many, many times easier to install than GHC - you don't need mingw, cygwin or anything like that. (Truth in advertising: to use the ffi (e.g., to interface to libcurl), you do need a C compiler :-()
--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/

On Wed, Apr 09, 2003 at 05:40:31PM +0100, Alastair Reid wrote:
David Roundy writes:
Indeed, libcurl is nicely portable, but I doubt my code to lazily link to the libcurl library is portable. I use pthread_create and pipe to let curl do the downloading asynchronously... I'm not sure how portable either of those system calls are.
Anything to do with creating threads or processes is likely to have porting problems in my experience.
I had a quick glance at the libcurl overview
http://pinkstuff.publication.org.uk/cgi-bin/man2html?libcurl+5#lbAF
and thought it could be used portably. Seems like I missed something.
libcurl certainly can be used portably, but then it must be used synchronously (unless _I_ missed something!), but I want to be able to process the file as it is downloaded. The pretty way to do this, of course, is to have a lazy read function, but in order to implement that lazy read function I had to use a separate thread for the curl function that'll be doing the reading. This is nice, as it allows me to create a readUrl which is functionally identical to readFile, but is a bit of a pain as far as portability goes. So basically, yes libcurl could be used portably, but none of the code I've written is probably portable (beyond POSIX systems running ghc).
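The general shape of such a lazy read (a producer running in its own thread, a consumer that sees an ordinary lazy String) can be sketched with GHC's Control.Concurrent instead of pthread_create and pipe. This is not the darcs code and it does not call libcurl: produceChunks is a hypothetical stand-in for the download, and lazyReadFrom is a name invented for this illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, getChanContents, newChan, writeChan)
import Data.Maybe (catMaybes, isJust)

-- Hypothetical stand-in for the downloader: it delivers each chunk
-- as it "arrives" and then sends Nothing to mark the end of the data.
produceChunks :: [String] -> Chan (Maybe String) -> IO ()
produceChunks chunks chan = do
  mapM_ (writeChan chan . Just) chunks
  writeChan chan Nothing

-- Start the producer in a separate thread and return its output as
-- a lazy String. getChanContents reads the channel on demand, so
-- the consumer can process early chunks before later ones exist.
lazyReadFrom :: [String] -> IO String
lazyReadFrom chunks = do
  chan  <- newChan
  _     <- forkIO (produceChunks chunks chan)
  items <- getChanContents chan
  -- Stop at the end marker so nothing is read past the last chunk.
  return (concat (catMaybes (takeWhile isJust items)))
```

From the caller's side, lazyReadFrom ["ab", "cd"] behaves like a readFile that yields "abcd". A real readUrl would replace produceChunks with a callback fed by the transfer.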
Alastair:
This reminds me of a library I have been wanting for a while [...]
David:
For my purposes, I think a simpler system would be acceptable [...]
Ah, you actually want something different from what I proposed. (I thought of the possibility as I was writing it but dismissed it as being of little use - your example shows exactly when it is useful. The real world is always trickier than you like to think...)
What I proposed was that the library would parse filenames according to the local conventions on your machine - win32, unix, macintosh, etc.
What you need is a set of libraries to parse filenames according to a set of conventions. For example, you might need to convert win32 filenames to unix filenames and back again.
I see the difference. But if we had the full filename parser that you proposed, it would be trivial to use it to implement the simple filename convention transformer that I would be interested in. i.e. I could write my transformation to a canonical form as
parsed_path_to_unix :: ParsedPath -> String
parse_path :: String -> ParsedPath
local_to_canonical = parsed_path_to_unix . parse_path
Assuming here that parse_path is defined on each platform to parse the local filename conventions.
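A rough sketch of that composition, assuming a deliberately tiny ParsedPath (an absolute flag plus a list of components) and a Windows-flavoured parser standing in for the per-platform parse_path. The record fields, splitOn, and the camel-case names are hypothetical. Drive letters and UNC paths are ignored, and trailing or doubled separators are dropped.

```haskell
import Data.List (intercalate)

-- Hypothetical minimal parsed form; the thread discusses richer ones.
data ParsedPath = ParsedPath
  { isAbsolute :: Bool
  , components :: [String]
  } deriving (Eq, Show)

-- Split a string at every occurrence of the separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Stand-in for parse_path on Windows: backslash-separated, no drives.
parsePathWindows :: String -> ParsedPath
parsePathWindows s = ParsedPath
  { isAbsolute = take 1 s == "\\"
  , components = filter (not . null) (splitOn '\\' s)
  }

-- parsed_path_to_unix: render the parsed form with '/' separators.
parsedPathToUnix :: ParsedPath -> String
parsedPathToUnix (ParsedPath absolute comps) =
  (if absolute then "/" else "") ++ intercalate "/" comps

-- local_to_canonical, with Windows playing the role of "local".
localToCanonical :: String -> String
localToCanonical = parsedPathToUnix . parsePathWindows
```

With these definitions, localToCanonical applied to the Windows-style path \usr\lib\libcurl.so gives "/usr/lib/libcurl.so".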
I can't help you gain access to windows but Hugs is many, many times easier to install than GHC - you don't need mingw, cygwin or anything like that. (Truth in advertising: to use the ffi (e.g., to interface to libcurl), you do need a C compiler :-()
Hmmmm. I hadn't thought of using hugs! I do need ffi (for libcurl), but I already have the mingw c/c++ cross-compiler installed (I use it to compile the windows version of my bridge game), so I may be able to port to windows with less trouble than I had thought, by running hugs under wine...
--
David Roundy http://abridgegame.org

Alastair Reid
[moved from cafe to libraries]
For example, on a Unix system, /usr/lib/libcurl.so would be treated something like this:
(Just ["/","usr","lib"], "libcurl", Just "so")
Isn't this a SMOP, writing functions:
dirname :: FilePath -> String -- or FilePath?
basename :: FilePath -> String
suffix :: FilePath -> String
They would probably be useful enough (and, as you point out, slightly tricky to get exactly right) to warrant standardization and inclusion.
-kzm
--
If I haven't seen further, it is by standing in the footprints of giants

Alastair Reid
[moved from cafe to libraries]
For example, on a Unix system, /usr/lib/libcurl.so would be treated something like this:
(Just ["/","usr","lib"], "libcurl", Just "so")
Ketil Z Malde
Isn't this a SMOP, writing functions:
dirname :: FilePath -> String -- or FilePath?
basename :: FilePath -> String
suffix :: FilePath -> String
SMOP == small matter of programming?
Yes, it's pretty easy to do. But that small matter of programming gets repeated time and time again (with many shortcuts taken which limit portability or make incorrect assumptions about what are legal filenames) so I suggest that a high quality library be added.
I'm sure your functions weren't intended as a final, polished API (though they look like the GNU make filename API which, since it is now set in concrete, is as final and polished as it is ever likely to get) but I'll point out some of the issues in the set of functions you suggest.
1) What should the functions return when there is no dirname, no basename or no suffix? An empty string suggests itself but can we then still distinguish between filenames like "foo." and "foo", "/foo" and "foo"? This is why I used 'Maybe' - though maybe I didn't use it enough in my sketch?
2) It's often enough to split the dirname from the basename as you suggest but I sometimes find myself needing to access a subdirectory or parent directory. So I write code like:
dirname f ++ "/" ++ subdirname ++ "/" ++ notdir f
or the cryptic
reverse (takeWhile (/= '/') (reverse (dirname f))) ++ notdir f
Both are fixed if there's a way to split the dirname into a list of directories so that we can add or remove bits at will.
3) We need a way to glue the various components back together again to eliminate those non-portable uses of '++ "/" ++' above. The obvious thing is to abstract the directory separator (typically '/' or '\') but then you have to be careful when adding or removing components from filenames that are relative or absolute, have or lack a dirname, have or lack a suffix, etc. I forget all the details of Windows filenames but you may also need to be careful when dealing with Windows drive letters and SMB mounted files on Windows.
This is, in part, why I am suggesting that there be a way to parse FilePaths into a richer structure.
My thought was that as well as having operations to access the components, there would also be operations to modify the components (cf. record updates) - the idea being that if you want to change the suffix, you don't have to figure out all the things you want to remain constant, you just have to figure out the things you want to change.
(The other reason for suggesting what the internal structure would be comes from my background in algebraic specification. Given a structure which is semantically equivalent to a tuple (as I believe filenames ought to be viewed), we can just say it is equivalent to a tuple (a model-based specification) or we can give a set of equations in the algebraic specification style. My experience is that, in this case, the model-based style scales better (i.e., is shorter) and is easier to understand (because it exploits existing understanding/intuition).)
--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/
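One way to picture the "record update" idea from the message above is a Haskell record whose fields mirror the tuple sketched earlier in the thread. Everything named here (Path, its fields, render, setSuffix, inSubdir) is hypothetical. How to represent the root of an absolute path is deliberately left out, so the example path is relative.

```haskell
-- Hypothetical record form of (Maybe [Directory], Filename, Maybe Suffix).
data Path = Path
  { dirs   :: Maybe [String]  -- Nothing: no directory part at all
  , name   :: String
  , suffix :: Maybe String    -- Just "" models a trailing dot, as in "foo."
  } deriving (Eq, Show)

-- Glue the components back together with a caller-chosen separator,
-- instead of scattering (++ "/" ++) through client code.
render :: Char -> Path -> String
render sep p = dirPart ++ name p ++ suffixPart
  where
    dirPart    = maybe "" (concatMap (++ [sep])) (dirs p)
    suffixPart = maybe "" ('.' :) (suffix p)

-- Change only the suffix; every other component stays as it was.
setSuffix :: Maybe String -> Path -> Path
setSuffix s p = p { suffix = s }

-- Move the file into a subdirectory of its current directory.
inSubdir :: String -> Path -> Path
inSubdir d p = p { dirs = Just (maybe [] id (dirs p) ++ [d]) }

-- Example value: usr/lib/libcurl.so as a relative path.
libcurl :: Path
libcurl = Path (Just ["usr", "lib"]) "libcurl" (Just "so")
```

Here render '/' (setSuffix (Just "a") libcurl) gives "usr/lib/libcurl.a", and render '/' (inSubdir "debug" libcurl) gives "usr/lib/debug/libcurl.so". Because the suffix is a Maybe, "foo." and "foo" render from distinct values.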

This reminds me of a library I have been wanting for a while: functions for manipulating filenames in a portable way. The main requirement is that you should be able to manipulate a filepath as though it was structured something like so:
(Maybe [Directory], Maybe Filename, Maybe Suffix)
(I think you mean Filename, not Maybe Filename)
For example, on a Unix system, /usr/lib/libcurl.so would be treated something like this:
(Just ["/","usr","lib"], "libcurl", Just "so")
The "/" would be a wart... it would be better to have an "origin" component, which could be either Absolute or Relative.
(Origin, Maybe [Directory], Filename, Maybe Suffix)
Something else that should be handled is scp-style remote paths: Maybe Machine, as in "astrocyte:/tmp/myfile.txt":
(Just "astrocyte", Absolute, Just ["tmp"], "myfile", Just "txt")
--KW 8-)

On Wed, Apr 09, 2003 at 04:02:36PM +0100, Alastair Reid wrote:
This reminds me of a library I have been wanting for a while: functions for manipulating filenames in a portable way. The main requirement is that you should be able to manipulate a filepath as though it was structured something like so:
(Maybe [Directory], Maybe Filename, Maybe Suffix)
For example, on a Unix system, /usr/lib/libcurl.so would be treated something like this:
(Just ["/","usr","lib"], "libcurl", Just "so")
I'm thinking a better definition (similar to Keith's objection, but different) would be something like:
(Origin, [Directory], Maybe Filename, Maybe Suffix)
where Origin is a string which gives the absolute origin of the path. In the case of relative paths this would either be "." or "" (we'd have to choose either one or the other).
I think that the origin of the path (btw I don't like the name Origin here) is a unique item. Keith gave the example of an scp style path abridgegame.org:my/file, but there is also the issue of dealing with windows paths C:\\whatever (where we obviously don't want to interpret this as being in a directory named "C:" which is in the current directory).
--
David Roundy http://www.abridgegame.org
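The origin idea can be made concrete in more than one way. The message above proposes a string, and Keith's proposes Absolute or Relative. The sketch below tries a third variation, an algebraic type, purely as an illustration: the Origin constructors and both splitting functions are hypothetical, drive-relative paths such as C:foo are treated as plain relative paths, and the scp-style splitter assumes the host name contains no '/'.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, toUpper)

-- Hypothetical origin type covering the cases raised in the thread.
data Origin
  = Relative        -- no origin: relative to the current directory
  | Root            -- a leading separator, e.g. Unix "/"
  | Drive Char      -- a Windows drive such as C:\
  | Remote String   -- an scp-style "host:" prefix
  deriving (Eq, Show)

-- Split a Windows-style path into its origin and the remainder, so
-- that "C:" is never mistaken for a directory name.
splitOriginWindows :: String -> (Origin, String)
splitOriginWindows (d : ':' : '\\' : rest)
  | isAsciiUpper d || isAsciiLower d = (Drive (toUpper d), rest)
splitOriginWindows ('\\' : rest) = (Root, rest)
splitOriginWindows path = (Relative, path)

-- Split an scp-style path. The text before the first ':' counts as
-- a host only if it is non-empty and contains no '/'.
splitOriginScp :: String -> (Origin, String)
splitOriginScp s = case break (== ':') s of
  (host, ':' : rest)
    | not (null host) && '/' `notElem` host -> (Remote host, rest)
  _ -> case s of
    '/' : rest -> (Root, rest)
    _          -> (Relative, s)
```

For instance, splitOriginScp "astrocyte:/tmp/myfile.txt" gives (Remote "astrocyte", "/tmp/myfile.txt"), and splitOriginWindows applied to a path starting with the drive prefix C:\ returns Drive 'C' together with the rest of the path.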
participants (6)
- Alastair Reid
- David Roundy
- David Roundy
- David Roundy
- Keith Wansbrough
- ketil@ii.uib.no