
Wolfgang Thaller wrote:
In what way is ISO-2022 non-reversible? Is it possible that an ISO-2022 file name that is converted to Unicode cannot be converted back any more (assuming you know for sure that it was ISO-2022 in the first place)?
I am no expert on ISO-2022, so the following may contain errors; please correct me if it is wrong.

ISO-2022 -> Unicode is always possible. Unicode -> ISO-2022 should also always be possible, but it is a relation, not a function. This means there are many (possibly infinitely many?) ways of encoding a particular Unicode string in ISO-2022, so converting back does not necessarily reproduce the original bytes.

ISO-2022 works by providing escape sequences to switch between different character sets, and one can use these escapes in almost any way one wishes. ISO-2022 also distinguishes between the same character in Japanese, Chinese and Korean, which Unicode does not do (Han unification). See here for more info on the topic:
http://www.ecma-international.org/publications/files/ecma-st/ECMA-035.pdf

Trusting the system locale for everything is also problematic and makes things quite unbearable for I18N. For example, on my desktop 95% of things run with iso-8859-1, 3% of things use utf-8, and a few apps use EUC-JP.

Treating filenames as opaque blobs causes the fewest problems. If a program wishes to display them in a graphical environment, they have to be converted to a string, but very many apps never display the filenames at all.

- Einar Karttunen
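
To make the "relation, not a function" point concrete, here is a small sketch in Python. It assumes Python's built-in iso2022_jp codec (ISO-2022-JP is only one profile of ISO-2022) and relies on that decoder accepting redundant "switch to ASCII" escapes (ESC ( B). The helper name with_redundant_escapes is made up for the illustration.

```python
# Sketch: several ISO-2022-JP byte strings decode to the same Unicode text,
# so encoding back to bytes cannot tell which one the original was.

CODEC = "iso2022_jp"  # Python's built-in ISO-2022-JP codec

text = "\u3042"  # HIRAGANA LETTER A
canonical = text.encode(CODEC)  # the single form Python's encoder picks


def with_redundant_escapes(n: int) -> bytes:
    """Hypothetical helper: prefix the canonical bytes with n no-op escapes.

    ESC ( B designates ASCII, which is already the initial state, so each
    extra escape changes the bytes but not the decoded text.
    """
    return b"\x1b(B" * n + canonical


# n can be as large as you like, hence "many ways" to encode one string.
variants = [with_redundant_escapes(n) for n in range(4)]

for raw in variants:
    decoded = raw.decode(CODEC)
    roundtrip = decoded.encode(CODEC)
    print(raw, "->", repr(decoded), "->", roundtrip,
          "| same bytes:", roundtrip == raw)
```

Only the first variant survives the round trip byte for byte; the others decode to the same string but re-encode to the encoder's preferred form. For a file name that means the re-encoded name may no longer match the name on disk, which is the argument for keeping names as opaque bytes and decoding only for display.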