
I have been ruminating on the various responses my attempted file path implementation has generated. A design is beginning to form in the back of my head that attempts to address the file path problem as I lay it out below. Before I develop it any further, are there any important considerations I am missing?

Here is my conception of the file name problem:

1) File names are abstract entities. There are a number of ways one might concretely represent a file name. Among these ways are:
   a) A contiguous sequence of octets in memory (a C-style string on most modern hardware)
   b) A sequence of Unicode code points (a Haskell-style string)
   c) Algebraic datatypes supporting path manipulations (yet to be developed)

2) We would like these three representations to be isomorphic. Unfortunately, this cannot be. In particular, there are major issues with the translations between the (a) and (b) forms given above. One could imagine that translation issues involving the (c) form are also possible.

3) Translations between (a) and (b) must be parameterized by a character encoding. Translations to and from (c) will require some manner of description of the path syntax, which differs by OS.

4) In practice, the vast majority of file paths are portable between the various forms; the forms are "nearly" isomorphic, with corner cases being fairly rare.

5) Translations between the various forms cost compute cycles and memory, and are not necessarily bijective. Therefore, translations should occur _only_ if absolutely necessary. In particular, if a file name passes through a program as a black box (it is not examined or manipulated), it should undergo no transformation.

6) Different OSes handle file names differently. These differences should be accounted for, transparently where possible. They should, however, be exposed to developers for whom the differences matter.

7) Using simple file names should be easy. We don't want developers to have to worry too much about character encodings, path separators, and generally bizarre path syntax just to open files. The complexities of correct file name handling should be hidden from the casual programmer. However, developers interested in serious portability/internationalization should be able to get down into the muck if they need to.
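
To make points 1 through 5 a little more concrete, here is a rough sketch of how the three representations and their parameterized translations might look. It is only an illustration, not a proposed API: every name in it (RawPath, TextPath, Path, Encoding, Syntax, latin1, posix, and the functions) is hypothetical, and the path syntax is deliberately simplistic (a single separator character, no drive letters, no escapes).

```haskell
import qualified Data.ByteString as B
import Data.Char (chr, ord)
import Data.List (intercalate)

-- (a) Raw octets, exactly as the OS handed them to us.
newtype RawPath = RawPath B.ByteString deriving (Eq, Show)

-- (b) A sequence of Unicode code points.
newtype TextPath = TextPath String deriving (Eq, Show)

-- (c) A structured form that supports path manipulation.
data Path = Path Bool [String]  -- is it absolute?, components
  deriving (Eq, Show)

-- Hypothetical description of a character encoding. Both directions
-- are partial, which is why (a) and (b) cannot be isomorphic.
data Encoding = Encoding
  { decode :: B.ByteString -> Maybe String
  , encode :: String -> Maybe B.ByteString
  }

-- ISO-8859-1: every octet decodes, but only code points below 256 encode.
latin1 :: Encoding
latin1 = Encoding dec enc
  where
    dec = Just . map (chr . fromIntegral) . B.unpack
    enc s
      | all ((< 256) . ord) s = Just (B.pack (map (fromIntegral . ord) s))
      | otherwise             = Nothing

-- (a) <-> (b), parameterized by an encoding (point 3).
toText :: Encoding -> RawPath -> Maybe TextPath
toText e (RawPath bytes) = TextPath <$> decode e bytes

fromText :: Encoding -> TextPath -> Maybe RawPath
fromText e (TextPath s) = RawPath <$> encode e s

-- Hypothetical description of an OS path syntax: just a separator here.
newtype Syntax = Syntax Char

posix :: Syntax
posix = Syntax '/'

splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- (b) <-> (c), parameterized by a path syntax (point 3).
parsePath :: Syntax -> TextPath -> Path
parsePath (Syntax sep) (TextPath s) =
  Path (take 1 s == [sep]) (filter (not . null) (splitOn sep s))

renderPath :: Syntax -> Path -> TextPath
renderPath (Syntax sep) (Path absolute comps) =
  TextPath ((if absolute then [sep] else "") ++ intercalate [sep] comps)

main :: IO ()
main = do
  let raw = RawPath (B.pack (map (fromIntegral . ord) "/tmp//notes.txt"))
  -- Decoding succeeds, but parsing and re-rendering collapses the
  -- doubled separator, so the round trip is not the identity (point 5).
  print (renderPath posix . parsePath posix <$> toText latin1 raw)
  -- A code point outside Latin-1 has no octet representation here.
  print (fromText latin1 (TextPath "\955.txt"))
```

In this sketch a program that merely passes a RawPath along never calls toText or parsePath, so the name reaches the OS untouched, which is the black-box behaviour point 5 asks for.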