
This is a very good summary, and I'm interested to see what you come up with.

Robert Dockins wrote:
1) File names are abstract entities. There are a number of ways one might concretely represent a filename. Among these ways are:
a) A contiguous sequence of octets in memory (C style string on most modern hardware)
b) A sequence of Unicode codepoints (Haskell style string)
b') A sequence of octets (Haskell style string, in real life)
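
To make the difference between (b) and (b') concrete, here is a minimal sketch using only base. The names utf8Encode, octetString and fitsInOctets are hypothetical helpers made up for illustration, not part of any library, and UTF-8 is only an assumption about what the octets on disk might be. The encoder does not reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Form (b) -> form (a): encode a String of Unicode code points as
-- UTF-8 octets. Hand-rolled for illustration; surrogates are not rejected.
utf8Encode :: String -> [Word8]
utf8Encode = concatMap (encodeChar . ord)
  where
    encodeChar c
      | c < 0x80    = [fromIntegral c]
      | c < 0x800   = [0xC0 .|. hi 6, cont 0]
      | c < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
      | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
      where
        -- Leading bits of the code point, placed in the first octet.
        hi n   = fromIntegral (c `shiftR` n)
        -- A continuation octet: 10xxxxxx carrying six bits of the code point.
        cont n = fromIntegral (0x80 .|. ((c `shiftR` n) .&. 0x3F))

-- Form (b'): a String in which every Char carries exactly one octet.
octetString :: [Word8] -> String
octetString = map (chr . fromIntegral)

-- True when every Char is below 256, i.e. the String could be read
-- as form (b') without losing information.
fitsInOctets :: String -> Bool
fitsInOctets = all ((< 256) . ord)
```

With these definitions, "caf\233" (four code points) encodes to five octets, so octetString (utf8Encode "caf\233") is a five-Char String that is not equal to the original. A name such as "\955" (Greek lambda) fails fitsInOctets altogether, which is the kind of corner case under discussion.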
4) In practice, the vast majority of file paths are portable between the various forms; the forms are "nearly" isomorphic, with corner cases being fairly rare.
I don't think they're so rare. I have files on my XP laptop which can't be represented in the system code page. It's easy for me to tell which programs are Unicode-aware and which aren't.

-- Ben