
Ian Lynagh wrote:
=========== The problem ===========
With its closer adherence to the Haskell 98 report, it is no longer possible with Hugs to manipulate files using the standard IO functions if the filenames are not representable in your locale.
Note that this basically means your filesystem is broken. This situation can only occur if a filesystem is written in one locale and then read in another. This "problem" cannot really be fixed, only worked around.
UTF-8: 65533 = U+FFFD = "replacement character"
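To illustrate why falling back to U+FFFD is only a partial workaround, here is a small sketch in Python (chosen purely for illustration; the filenames are made up). It shows two distinct byte strings that are not valid UTF-8 collapsing to the same decoded name, so the original file can no longer be addressed.

```python
# Two hypothetical Latin-1 filenames, read in a UTF-8 locale.
name_a = b"caf\xe9.txt"
name_b = b"caf\xe8.txt"

# Decoding with the standard "replace" handler substitutes U+FFFD
# for every byte that is not valid UTF-8.
decoded_a = name_a.decode("utf-8", errors="replace")
decoded_b = name_b.decode("utf-8", errors="replace")

# Both names now look identical, and re-encoding yields the UTF-8
# encoding of U+FFFD instead of the original byte.
same_after_decoding = decoded_a == decoded_b
reencoded_a = decoded_a.encode("utf-8")
```

The information about which byte was invalid is gone after decoding, which is why no later step can recover the real filename.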
================= Proposed solution =================
I have a simpler proposal: allocate 128 "replacement characters" in the "Vendor Zone" (the Private Use Area) of Unicode. Their purpose is to serve as placeholders for incorrect UTF-8. Then use these replacement characters when decoding UTF-8, and reproduce the original, broken bytes when re-encoding. Under ordinary circumstances these codes should never occur in strings.
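A minimal sketch of this idea, in Python for illustration only. The range U+F780..U+F7FF, the handler name "pua_placeholder", and the function names are all assumptions of mine, not part of the proposal. 128 placeholders suffice because only bytes 0x80..0xFF can ever be invalid in UTF-8.

```python
import codecs

# Hypothetical choice: 128 placeholders at U+F780..U+F7FF (Private Use
# Area), one for each byte value 0x80..0xFF.
PLACEHOLDER_BASE = 0xF780


def _escape_bad_byte(err):
    """Error handler: map one undecodable byte to its placeholder."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start]  # always >= 0x80 for the UTF-8 decoder
    # Resume right after this single byte, so each bad byte is escaped
    # individually.
    return chr(PLACEHOLDER_BASE + bad - 0x80), err.start + 1


codecs.register_error("pua_placeholder", _escape_bad_byte)


def decode_filename(raw: bytes) -> str:
    """Decode UTF-8, turning invalid bytes into placeholder characters."""
    return raw.decode("utf-8", errors="pua_placeholder")


def encode_filename(name: str) -> bytes:
    """Encode to UTF-8, turning placeholders back into the original bytes."""
    out = bytearray()
    for ch in name:
        cp = ord(ch)
        if PLACEHOLDER_BASE <= cp < PLACEHOLDER_BASE + 128:
            out.append(cp - PLACEHOLDER_BASE + 0x80)
        else:
            out += ch.encode("utf-8")
    return bytes(out)
```

With this sketch, `encode_filename(decode_filename(raw))` returns `raw` for any byte string, except one that already contains a valid UTF-8 encoding of a placeholder character. That is the case the proposal assumes does not occur under ordinary circumstances.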
======================= Backwards compatibility =======================
comes at no additional cost ;-)

Udo.
--
It's not that perl programmers are idiots, it's that the language rewards idiotic behavior in a way that no other language or tool has ever done. -- Erik Naggum