
22 Mar 2006, 10:27 p.m.
Simon Marlow wrote:
Getting a UTF-8 decoder right is quite non-trivial. Take a look at this:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
I made a half-hearted attempt to get most of this right in GHC's UTF-8 decoder, but by no means all of it is implemented. I do think it would be nice if the Haskell implementation was correct, for some value of correct, though.
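To give a feel for the kind of checks that test file exercises, here is a minimal sketch of a strict decoder over a list of bytes. It is not GHC's decoder; the name `decodeUtf8` and the `[Word8] -> Maybe String` shape are assumptions chosen for illustration, and a real implementation would probably want to report or replace errors rather than give up on the whole input.

```haskell
import Control.Monad (guard)
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Hypothetical strict UTF-8 decoder (illustration only).
-- Returns Nothing on any malformed input: stray continuation bytes,
-- truncated sequences, overlong encodings, surrogates (U+D800..U+DFFF)
-- and code points above U+10FFFF.
decodeUtf8 :: [Word8] -> Maybe String
decodeUtf8 [] = Just []
decodeUtf8 (b : bs)
  | b < 0x80               = (chr (fromIntegral b) :) <$> decodeUtf8 bs
  | b >= 0xC2 && b <= 0xDF = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b >= 0xE0 && b <= 0xEF = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b >= 0xF0 && b <= 0xF4 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise              = Nothing  -- 0x80..0xC1 and 0xF5..0xFF never start a sequence
  where
    -- n continuation bytes expected, payload bits of the lead byte,
    -- and the smallest code point this sequence length may encode.
    multi :: Int -> Int -> Int -> Maybe String
    multi n lead minCp = do
      let (cont, rest) = splitAt n bs
      -- Reject truncated sequences and bytes that are not 10xxxxxx.
      guard (length cont == n && all isCont cont)
      let cp = foldl (\acc c -> (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F)) lead cont
      -- Reject overlong forms, surrogates and out-of-range values.
      guard (cp >= minCp && cp <= 0x10FFFF && not (cp >= 0xD800 && cp <= 0xDFFF))
      (chr cp :) <$> decodeUtf8 rest

    isCont :: Word8 -> Bool
    isCont c = c .&. 0xC0 == 0x80
```

With this sketch, `decodeUtf8 [0xE2, 0x82, 0xAC]` yields `Just "\8364"` (the euro sign, U+20AC), while the overlong slash `[0xE0, 0x80, 0xAF]`, the surrogate `[0xED, 0xA0, 0x80]` and the out-of-range `[0xF4, 0x90, 0x80, 0x80]` all yield `Nothing`.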
There are a couple of implementations here:
http://cvs.sourceforge.net/viewcvs.py/haskell-i18n/Source/Text/Encoding/
SF project: http://sourceforge.net/projects/haskell-i18n
It hasn't been touched for a couple of years.

-- 
Ashley Yakeley, Seattle WA
WWED? http://www.cs.utexas.edu/users/EWD/