
2 Nov 2011, 3:13 p.m.
On Wed, Nov 02, 2011 at 07:02:09PM +0000, Max Bolingbroke wrote:

[snip some stuff I didn't understand. I think I made the mistake of entering a Unicode discussion]

> This is why the unmodified PEP383 approach is kind of nice - it uses lone surrogate (rather than private use) codepoints to do the escaping, and these codepoints are simply not allowed to occur in valid UTF-encoded text.

If they do not occur, then why does it matter whether or not occurrences would get escaped? They are allowed to occur in Linux/ext2 filenames, anyway, and I think we ought to be able to handle them correctly if they do.

Thanks
Ian
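
For reference, here is a minimal sketch of the escaping under discussion, using Python's `surrogateescape` error handler, which is where PEP 383 is implemented. It is only an illustration of the scheme, not the code being proposed in this thread; the helper names `decode_name` and `encode_name` are made up for the example. It shows that undecodable filename bytes become lone surrogates and round-trip unchanged, and that a filename containing the raw bytes of a UTF-8-style encoded surrogate is itself treated as undecodable bytes, so it round-trips as well.

```python
def decode_name(raw: bytes) -> str:
    """Decode filename bytes as UTF-8, escaping undecodable bytes.

    Each undecodable byte 0xNN becomes the lone surrogate U+DCNN
    (the PEP 383 scheme, via the 'surrogateescape' error handler).
    """
    return raw.decode("utf-8", errors="surrogateescape")


def encode_name(name: str) -> bytes:
    """Inverse of decode_name: turn escaped surrogates back into bytes."""
    return name.encode("utf-8", errors="surrogateescape")


# A Latin-1 encoded filename: byte 0xE9 is not valid UTF-8 here.
latin1_raw = b"caf\xe9.txt"
latin1_name = decode_name(latin1_raw)

# Bytes that look like a UTF-8 encoding of the surrogate U+D800.
# A strict UTF-8 decoder rejects them, so they are escaped byte by byte.
surrogate_raw = b"\xed\xa0\x80"
surrogate_name = decode_name(surrogate_raw)

if __name__ == "__main__":
    print(ascii(latin1_name))
    print(ascii(surrogate_name))
    print(encode_name(latin1_name) == latin1_raw)
    print(encode_name(surrogate_name) == surrogate_raw)
```

In this scheme the second case is handled because the decoder never produces a surrogate from well-formed input: any surrogate in the decoded string can only have come from escaping, which is what makes the reverse mapping unambiguous.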