Re: [Haskell-cafe] Hugs vs GHC (again) was: Re: Some random newbie questions

7 Jan 2005


      "Simon Marlow"  writes:
...
> Here's a summary of the state of Unicode support in GHC and other
> compilers.  There are several aspects:
> - Can the Char type hold the full range of Unicode characters?
>    This has been true in GHC for some time, and is now true in Hugs.
>    I don't think it's true in nhc98 (please correct me if I'm wrong).
You're wrong :-).  nhc98 has always had 32-bit characters internally.
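
For anyone who wants to check this on their own compiler, here is a minimal sketch (the names maxCodePoint and roundTrips are made up for illustration, not taken from any compiler's sources). It prints the largest code point that Char can represent and round-trips one code point from outside the 16-bit range. On an implementation with full-range Char I would expect 1114111 (0x10FFFF) and True.

```haskell
import Data.Char (chr, ord)

-- Largest code point representable by Char on this implementation.
maxCodePoint :: Int
maxCodePoint = ord (maxBound :: Char)

-- Round-trip a code point through Char; fails (or errors) if Char
-- cannot hold it.
roundTrips :: Int -> Bool
roundTrips n = ord (chr n) == n

main :: IO ()
main = do
  print maxCodePoint
  print (roundTrips 0x1D11E)  -- U+1D11E, MUSICAL SYMBOL G CLEF
```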
...
> - Do the character class functions (isUpper, isAlpha etc.) work
>    correctly on the full range of Unicode characters?  This is true in
>    Hugs.  It's true with GHC on some systems (basically we were lazy
>    and used the underlying C library's support here, which is patchy).
In nhc98, currently the character class functions work only on the
8-bit Latin-1 range.
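
A small probe along these lines shows what a given implementation does. This is only a sketch: the sample characters and the helper name report are my own choice, and the output will differ between compilers (and, per the above, possibly between systems), so no expected results are listed for the non-ASCII cases.

```haskell
import Data.Char (chr, isAlpha, isUpper)

-- Sample characters: one ASCII, one Latin-1, two beyond Latin-1.
-- Written with chr so the source file itself stays plain ASCII.
samples :: [(String, Char)]
samples =
  [ ("LATIN CAPITAL LETTER A", 'A')
  , ("LATIN CAPITAL LETTER E WITH ACUTE", chr 0xC9)
  , ("GREEK CAPITAL LETTER SIGMA", chr 0x3A3)
  , ("CYRILLIC CAPITAL LETTER ZHE", chr 0x416)
  ]

-- Format what isAlpha and isUpper say about one sample.
report :: (String, Char) -> String
report (name, c) =
  name ++ ": isAlpha=" ++ show (isAlpha c)
       ++ ", isUpper=" ++ show (isUpper c)

main :: IO ()
main = mapM_ (putStrLn . report) samples
```

If every line reports True for both functions, the character class functions cover at least these parts of the range; a False for the Greek or Cyrillic capitals suggests Latin-1-only tables.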
...
> - Can you use (some encoding of) Unicode for your Haskell source files?
>    I don't think this is true in any Haskell compiler right now.
Many years ago, hbc claimed to be the only compiler with support for this.
...
> - Can you do String I/O in some encoding of Unicode?  No Haskell
>    compiler has support for this yet, and there are design decisions
>    to be made.  Some progress has been made on an experimental prototype
>    (see recent discussion on this list).
Apparently some Haskell/XML toolkits already do I/O conversions in a
selection of the encodings permitted by the XML standard, namely ASCII,
Latin-1, UTF-8, and UTF-16 (either byte ordering), but not yet UCS-4
(four possible byte orderings), or EBCDIC.  See for example:
  http://www.ninebynine.org/Software/HaskellUtils/HaXml-1.12/src/Text/XML/HaXm...
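
To give a flavour of what such a conversion involves, here is a rough sketch of a UTF-8 encoder written as a pure function from String to bytes. It is not the HaXml code, and the names encodeCharUtf8 and encodeUtf8 are invented for this example. It assumes Char holds full Unicode code points and does no validation (surrogate code points are encoded rather than rejected).

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one Char as UTF-8: 1 to 4 bytes depending on the code point.
-- Assumption: no validation, so surrogates (U+D800..U+DFFF) pass through.
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading byte: length marker plus the top bits of the code point.
    lead marker s = fromIntegral (marker .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))

-- Encode a whole String by concatenating the per-character encodings.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeCharUtf8

main :: IO ()
main = do
  print (encodeUtf8 "A")         -- [65]
  print (encodeUtf8 "\xE9")      -- [195,169]
  print (encodeUtf8 "\x20AC")    -- [226,130,172]
  print (encodeUtf8 "\x1D11E")   -- [240,157,132,158]
```

A real I/O layer would also need the decoding direction and some policy for malformed input, which is where the design decisions mentioned above come in.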
...
> - What about Unicode FilePaths?  This was discussed a few months ago
>    on the haskell(-cafe) list, no support yet in any compiler.
Indeed, AFAIK.

Regards,
    Malcolm

Malcolm Wallace