RE: [Haskell-i18n] unicode notation \uhhhh implementation

On Fri, 2002-08-16 at 11:01, Simon Marlow wrote:
I wasn't aware of that paragraph in the report until recently, and as far as I know none of the current Haskell implementations implement the '\uhhhh' escape sequences.
HBC implemented Unicode years ago.
No, HBC doesn't implement the paragraph of the report that we're talking about. HBC allows the '\uhhhh' escape sequence in character and string literals, but not in identifiers and other parts of the source.
Also, it's not clear to me why you need the '\uhhhh' escape sequence in character and string literals at all, since it appears to mean the same thing as '\xhhhh' (the report isn't clear that '\xhhhh' means a "unicode code point", but that seems to be the only reasonable interpretation).
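For what it's worth, the point about literals can be checked with the escapes the report already provides. This is only a minimal sketch, assuming a compiler that stores a full code point in a Char; the names lambdaHex, lambdaDec and sameCodePoint are made up for illustration.

```haskell
import Data.Char (ord)

-- Hex escape for code point 0x03BB (GREEK SMALL LETTER LAMDA).
lambdaHex :: Char
lambdaHex = '\x03bb'

-- Decimal escape for the same code point (0x03BB == 955).
lambdaDec :: Char
lambdaDec = '\955'

-- True when both escapes denote the same character and that character
-- has the expected code point.
sameCodePoint :: Bool
sameCodePoint = lambdaHex == lambdaDec && ord lambdaHex == 0x03bb
```

If that holds, a '\uhhhh' escape inside a literal would add nothing that '\xhhhh' does not already express. The case the report's paragraph covers beyond this is escapes outside literals.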
You're most probably right, this looks like a misinterpretation on HBC's side to me, too.
One reason to use this approach would be if there already existed a preprocessor to do the job - does anyone know of one?
Can't be more than a few lines of Perl. It's quite short in Haskell too:

    import Data.Char (chr, isHexDigit)
    import Numeric (readHex)

    -- Replace every \uhhhh escape by the character it denotes.
    convert :: String -> String
    convert ('\\':'u':c1:c2:c3:c4:cs)
      | all isHexDigit hs = chr (fst (head (readHex hs))) : convert cs
      | otherwise         = error "Malformed unicode sequence"
                            -- not clear if this is allowed by the spec
      where hs = [c1,c2,c3,c4]
    convert (c:cs) = c : convert cs
    convert []     = []

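Since it is unclear whether a malformed sequence should be an error, here is a sketch of the other reading, where a "\u" that is not followed by exactly four hex digits is copied through untouched. The name convertLenient is hypothetical, and the pass-through behaviour is an assumption rather than anything the report states.

```haskell
import Data.Char (chr)
import Numeric (readHex)

-- Hypothetical lenient variant of the converter: decode \uhhhh when all
-- four characters are hex digits, otherwise fall through to the next
-- clause and copy the backslash like any other character.
convertLenient :: String -> String
convertLenient ('\\':'u':c1:c2:c3:c4:cs)
  -- readHex leaves unparsed input in the second component, so matching
  -- on "" ensures all four characters were hex digits.
  | [(n, "")] <- readHex [c1, c2, c3, c4] = chr n : convertLenient cs
convertLenient (c:cs) = c : convertLenient cs
convertLenient []     = []
```

For example, convertLenient "caf\\u00e9" yields the four characters c, a, f and U+00E9, while convertLenient "\\uzzzz" returns its input unchanged.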
I meant a preprocessor to take source code in some random encoding and convert it into ASCII with '\uhhhh' escape sequences. If there was such a thing, then we could all use it and save re-implementing N different encodings in each compiler.
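The escaping half of such a preprocessor is small once the input has been decoded; the decoding from "some random encoding" is the hard part and is not shown. A minimal sketch, assuming the source text is already available as a String of code points and that four hex digits suffice; escapeChar and escapeNonAscii are hypothetical names.

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- Escape a single character: ASCII passes through, anything else up to
-- U+FFFF becomes \uhhhh with exactly four lower-case hex digits.
escapeChar :: Char -> String
escapeChar c
  | n < 0x80    = [c]
  | n <= 0xFFFF = "\\u" ++ pad4 (showHex n "")
  | otherwise   = error "code point does not fit in four hex digits"
  where
    n      = ord c
    pad4 s = replicate (4 - length s) '0' ++ s

-- Escape a whole source text. Assumption: a "\u" already present in the
-- input is not treated specially.
escapeNonAscii :: String -> String
escapeNonAscii = concatMap escapeChar
```

Wired up as main = interact escapeNonAscii it would act as a filter, with the input decoding left to whatever the runtime's I/O layer does, which is exactly the part we would rather not re-implement in each compiler.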
There is GNU recode, which knows virtually all kinds of codecs. Its homepage says something about extensibility, so it might be fairly easy to add our own \uhhhh (or whatever we settle on) escaped ASCII to the list of codecs. It could then convert back and forth between this and any other encoding it is aware of. My version reports 281 supported encodings.

http://www.gnu.org/software/recode

Regards,
Sven Moritz

participants (1)
- Sven Moritz Hallberg