
Hello Malcolm,

Tuesday, March 21, 2006, 7:07:53 PM, you wrote:

> I was also thinking it would be nice to have pure Haskell implementations of the various Unicode encodings. Here is my attempt at the UTF-8 codec.
UTF-8 codecs keep migrating from application to application: you can find such code in GHC, jhc, darcs... All of these codecs use a ([Char] <-> [Word8]) conversion, which is both slow (because lists are lazy) and unusable in a non-list environment (how, for example, can we read just enough bytes to decode a single Char?).

In my Streams library, I used higher-order monadic functions to implement encodings. In my model, an encoder is just a higher-order function that takes a function (putByte :: (Monad m) => Word8 -> m ()) as a parameter and uses it to implement the (putChar :: (Monad m) => Char -> m ()) operation, so each encoder has the type:

  utf8Encode :: (Monad m) => (Word8 -> m ()) -> Char -> m ()

In the same fashion, each decoder takes a parameter of functional type (getByte :: (Monad m) => m Word8) and uses it to implement the (getChar :: (Monad m) => m Char) operation, so the whole decoder has the type:

  utf8Decode :: (Monad m) => m Word8 -> m Char

Using these higher-order functions lets me implement both UTF-8 (and any other) encoding for text streams and UTF-8 encoding for serializing strings/chars in the binary I/O module. I attached this module.

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com
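
For illustration, the encoder and decoder types described in the message above could be filled in roughly as follows. This is a minimal sketch written for this note, not the attached module: the bodies of utf8Encode and utf8Decode are assumptions that follow the standard UTF-8 bit layout, and the helpers continueDecode, encodeString and decodeChars are hypothetical names introduced only for the example. The decoder does no validation of malformed input.

```haskell
import Control.Monad (replicateM)
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.IORef (modifyIORef, newIORef, readIORef, writeIORef)
import Data.Word (Word8)

-- Encoder: given an action that writes one byte, write one Char as UTF-8.
utf8Encode :: Monad m => (Word8 -> m ()) -> Char -> m ()
utf8Encode putByte c
  | n < 0x80    = putByte (fromIntegral n)
  | n < 0x800   = do putByte (0xC0 .|. hi 6)
                     putByte (cont 0)
  | n < 0x10000 = do putByte (0xE0 .|. hi 12)
                     putByte (cont 6)
                     putByte (cont 0)
  | otherwise   = do putByte (0xF0 .|. hi 18)
                     putByte (cont 12)
                     putByte (cont 6)
                     putByte (cont 0)
  where
    n = ord c
    hi, cont :: Int -> Word8
    -- Payload bits of the lead byte (already small enough to fit).
    hi s   = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)

-- Decoder: given an action that reads one byte, read exactly one Char.
-- Assumes well-formed input; chr may fail on invalid sequences.
utf8Decode :: Monad m => m Word8 -> m Char
utf8Decode getByte = do
  b <- getByte
  let lead = fromIntegral b :: Int
  case () of
    _ | lead < 0x80 -> return (chr lead)
      | lead < 0xE0 -> continueDecode getByte 1 (lead .&. 0x1F)
      | lead < 0xF0 -> continueDecode getByte 2 (lead .&. 0x0F)
      | otherwise   -> continueDecode getByte 3 (lead .&. 0x07)

-- Hypothetical helper: pull k continuation bytes into the accumulator.
continueDecode :: Monad m => m Word8 -> Int -> Int -> m Char
continueDecode _       0 acc = return (chr acc)
continueDecode getByte k acc = do
  b <- getByte
  continueDecode getByte (k - 1)
    ((acc `shiftL` 6) .|. (fromIntegral b .&. 0x3F))

-- Hypothetical helper: encode a String, collecting the bytes in an IORef.
encodeString :: String -> IO [Word8]
encodeString s = do
  out <- newIORef []
  mapM_ (utf8Encode (\b -> modifyIORef out (b :))) s
  reverse <$> readIORef out

-- Hypothetical helper: decode n Chars from a byte list held in an IORef.
decodeChars :: Int -> [Word8] -> IO String
decodeChars n bytes = do
  inp <- newIORef bytes
  let getByte = do
        bs <- readIORef inp
        case bs of
          (x : xs) -> writeIORef inp xs >> return x
          []       -> ioError (userError "unexpected end of input")
  replicateM n (utf8Decode getByte)
```

Because the byte source is just an action of type m Word8, the decoder pulls only as many bytes as one Char needs, which addresses the "read enough bytes to decode just one Char" problem without going through a [Word8] list. The IORef-backed helpers are one possible way to plug the functions into IO; a handle, buffer, or state monad could supply putByte and getByte equally well.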