
On 08 January 2005 08:09, Aaron Denney wrote:
On 2005-01-07, Simon Marlow wrote:
- Can you use (some encoding of) Unicode for your Haskell source files? I don't think this is true in any Haskell compiler right now.
I assume this won't be done until the next one is done...
Not necessarily; GHC doesn't use the standard IO library for reading source files.
- Can you do String I/O in some encoding of Unicode? No Haskell compiler has support for this yet, and there are design decisions to be made. Some progress has been made on an experimental prototype (see recent discussion on this list).
Many of the easy ways to do this that I've heard proposed make the current hacks for binary IO fail.
Making hacks fail isn't necessarily a bad thing :-)
IMHO, we really, really, need a standard, supported way to do binary IO.
I agree, but I think it should be part of a larger redesign of the IO library. The streams proposal includes binary I/O, by the way. I'm not keen to provide binary IO on top of the existing IO library, and then to have Unicode as a layer on top of that. Performance will be terrible. It needs to be designed properly from the ground up.
If I can read in and output octets, then I can implement Unicode handling on top of that. In fact, it would let a bunch of the proposed ideas for Unicode support be implemented in pure Haskell and have their API details hashed out and polished.
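To illustrate what "Unicode handling on top of octets" could look like in pure Haskell, here is a minimal sketch of a UTF-8 codec over lists of Word8. It is not taken from any of the proposals discussed on this list; the names encodeChar, encode, decode and replacement are hypothetical, and the decoder is deliberately simplistic (it does not reject overlong forms or surrogates).

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: encode one character as UTF-8 octets.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n :: Int
    n = ord c
    -- Leading payload bits, shifted down to fit in the lead octet.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)
    -- A continuation octet carrying six payload bits.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

encode :: String -> [Word8]
encode = concatMap encodeChar

-- Character substituted for malformed input (U+FFFD).
replacement :: Char
replacement = '\xFFFD'

-- Decode UTF-8 octets; a malformed lead octet or sequence yields
-- U+FFFD and decoding resumes at the following octet.
-- Assumption: overlong forms and surrogates are NOT rejected here.
decode :: [Word8] -> String
decode [] = []
decode (b:bs)
  | b < 0x80           = chr (fromIntegral b) : decode bs
  | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F)
  | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F)
  | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07)
  | otherwise          = replacement : decode bs
  where
    isCont :: Word8 -> Bool
    isCont x = x .&. 0xC0 == 0x80

    multi :: Int -> Word8 -> String
    multi k lead =
      let (cs, rest) = splitAt k bs
          ok = length cs == k && all isCont cs
          n :: Int
          n = foldl (\acc x -> (acc `shiftL` 6) .|. fromIntegral (x .&. 0x3F))
                    (fromIntegral lead) cs
      in if ok && n <= 0x10FFFF
           then chr n : decode rest
           else replacement : decode bs
```

With octet-level reads and writes available, decode and encode would sit directly on top of them, and the choice of what to do with malformed input (here, a replacement character) becomes an ordinary API decision that can be argued about in library code.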
For Unix, there are a couple of different tacks one could take. The locale system is standard, and does work, but is ugly and a pain to work with. In particular, it's another (set of) global variables. And what do you do with a character not expressible in the current locale?
I'd like the possibility of different character sets for different files, for example.
Not a problem. Have you looked at the streams proposal?
Cheers,
Simon