New subject: Text I/O library proposal, first draft

4 Aug 2003

Ben Rudiak-Gould wrote:
...
> [Crossposted to Haskell and Libraries. Replies to Libraries.]
There's a Haskell Internationalisation mailing list too. Also check out 
the haskell-i18n project on SourceForge:
http://sourceforge.net/projects/haskell-i18n/
It has a bunch of my code for Unicode properties, plus a couple of UTF-8 
implementations.
...
> module System.TextIOFirstDraft (...) where
This could be put in the Text.* hierarchy.
...
> type BlockRecoder from to =
>   Ptr from -> BlockLength -> Ptr to -> BlockLength
>    -> IO (BlockLength,BlockLength)
UArray and MArray would be slightly cleaner if you're doing the IO 
thing. But actually my biggest problem is that this is in the IO monad. 
Given your code, I should be able to write these without resorting to 
unsafePerformIO:

  encodeUTF8 :: String -> [Word8]
  decodeUTF8 :: [Word8] -> Maybe String -- Nothing if not valid
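
To make the point concrete, here is a minimal sketch of those two functions written as ordinary pure code, with no IO and no unsafePerformIO. It is only an illustration, not the code from the proposal or from haskell-i18n. It assumes strict decoding (overlong forms, surrogates and values above 0x10FFFF are rejected), while the encoder does not check for surrogates in its input.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Encode each Char as 1 to 4 UTF-8 octets.
-- Assumption: the input contains no surrogate code points; they are
-- not checked for here and would be encoded like any other value.
encodeUTF8 :: String -> [Word8]
encodeUTF8 = concatMap (map fromIntegral . enc . ord)
  where
    enc :: Int -> [Int]
    enc c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    -- Continuation octet: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Decode UTF-8 octets; Nothing if the input is not valid.
decodeUTF8 :: [Word8] -> Maybe String
decodeUTF8 [] = Just []
decodeUTF8 (b : bs)
  | b < 0x80              = (chr (fromIntegral b) :) <$> decodeUTF8 bs
  | b >= 0xC2 && b < 0xE0 = multi 1 (fromIntegral b .&. 0x1F) 0x80
  | b >= 0xE0 && b < 0xF0 = multi 2 (fromIntegral b .&. 0x0F) 0x800
  | b >= 0xF0 && b < 0xF5 = multi 3 (fromIntegral b .&. 0x07) 0x10000
  | otherwise             = Nothing  -- stray continuation or invalid lead octet
  where
    -- n continuation octets, bits taken from the lead octet, and the
    -- smallest code point allowed for this length (rejects overlong forms).
    multi :: Int -> Int -> Int -> Maybe String
    multi n lead minCode = do
      let (cs, rest) = splitAt n bs
      if length cs == n && all isCont cs then Just () else Nothing
      let code = foldl (\acc x -> (acc `shiftL` 6) .|. (fromIntegral x .&. 0x3F)) lead cs
      if code >= minCode && code <= 0x10FFFF && not (code >= 0xD800 && code <= 0xDFFF)
        then (chr code :) <$> decodeUTF8 rest
        else Nothing
    isCont x = x .&. 0xC0 == 0x80

main :: IO ()
main = do
  print (encodeUTF8 "\8364")             -- U+20AC: [226,130,172]
  print (decodeUTF8 [0xE2, 0x82, 0xAC])  -- Just "\8364"
  print (decodeUTF8 [0xC3])              -- truncated sequence: Nothing
```

Note that decodeUTF8 has to see the whole input before it can answer Just or Nothing, so this particular signature is not lazy in the way a streaming decoder would be.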

Actually, if one makes certain assumptions about encodings, you could 
get away with something like this:

  type Encoder base t = t -> [base]
  type Decoder base t = forall m. (Monad m) => m base -> m t

Is this any less efficient? Probably not if you're writing your 
BlockRecoders in Haskell.
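
As a rough sketch of how the Encoder/Decoder types above might be used: the decoder is handed an action that yields the next base item and may run it as often as it needs, in whatever monad the caller picks. Everything below other than the two type synonyms is made up for illustration (Source, nextItem, runDecoder and the Word16 codec are hypothetical names), and the forall needs GHC's RankNTypes extension.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word8)

type Encoder base t = t -> [base]
type Decoder base t = forall m. (Monad m) => m base -> m t

-- Hypothetical example codec: one Word16 as two big-endian octets.
encodeWord16BE :: Encoder Word8 Word16
encodeWord16BE w = [fromIntegral (w `shiftR` 8), fromIntegral (w .&. 0xFF)]

decodeWord16BE :: Decoder Word8 Word16
decodeWord16BE next = do
  hi <- next
  lo <- next
  return ((fromIntegral hi `shiftL` 8) .|. fromIntegral lo)

-- A small pure monad (hypothetical) that reads items from a list and
-- fails with Nothing when the list runs out.
newtype Source base a = Source { runSource :: [base] -> Maybe (a, [base]) }

instance Functor (Source base) where
  fmap f (Source g) = Source $ \s -> case g s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative (Source base) where
  pure a = Source $ \s -> Just (a, s)
  Source f <*> Source g = Source $ \s -> case f s of
    Nothing      -> Nothing
    Just (h, s') -> case g s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (h a, s'')

instance Monad (Source base) where
  Source g >>= k = Source $ \s -> case g s of
    Nothing      -> Nothing
    Just (a, s') -> runSource (k a) s'

-- Yield the next item of the input list, if there is one.
nextItem :: Source base base
nextItem = Source $ \s -> case s of
  []       -> Nothing
  (x : xs) -> Just (x, xs)

-- Run a decoder purely over a list; returns the value and the leftover input.
runDecoder :: Decoder base t -> [base] -> Maybe (t, [base])
runDecoder dec = runSource (dec nextItem)

main :: IO ()
main = do
  print (runDecoder decodeWord16BE [0x12, 0x34, 0x56])  -- Just (4660,[86])
  print (runDecoder decodeWord16BE [0x12])              -- Nothing
```

Because the decoder is polymorphic in the monad, the same decodeWord16BE could presumably also be given an IO action that reads one octet from a handle, so the pure and IO cases would share one definition.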
...
> type TextEncoder = BlockRecoder Word32 Octet
> type TextDecoder = BlockRecoder Octet Word32
On GHC, Char has exactly the range 0 to 0x10FFFF, as per Unicode 
codepoints. If this becomes standardised as part of an 
internationalisation effort, you might want to use Char rather than 
Word32.
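
For what it's worth, here is a small sketch of what moving between the two representations looks like on GHC. The helper names word32ToChar and charToWord32 are hypothetical. Note that the Char range still includes the surrogate code points, which this sketch does not filter out.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word32)

-- | Convert a raw 32-bit value to a Char, rejecting anything above
-- maxBound :: Char. The guard runs before chr, so chr never sees an
-- out-of-range argument.
word32ToChar :: Word32 -> Maybe Char
word32ToChar w
  | w <= fromIntegral (ord maxBound) = Just (chr (fromIntegral w))
  | otherwise                        = Nothing

-- | Every Char fits in a Word32, so this direction cannot fail.
charToWord32 :: Char -> Word32
charToWord32 = fromIntegral . ord

main :: IO ()
main = do
  print (ord (maxBound :: Char) == 0x10FFFF)  -- True on GHC
  print (word32ToChar 0x41)                   -- Just 'A'
  print (word32ToChar 0x110000)               -- Nothing
```

With Char in the types, the range check moves to the boundary where raw data comes in, and the rest of the code cannot see an out-of-range value at all.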

-- 
Ashley Yakeley, Seattle WA
