
Magnus Therning wrote:
2. Codecs, i.e. encoder/decoder pairs such as charset converters

    data Codec base derived = MkCodec {
        encode :: derived -> base,
        decode :: base -> Maybe derived   -- or other Monad
    }

    utf8 :: Codec [Word8] String
    xml  :: Codec String XML

    type ASCII = String
    base16 :: Codec ASCII [Word8]
    ...

    encode base16 [0xde,0xad,0xbe,0xef] :: ASCII
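
For readers who want to try the codec idea, here is a minimal sketch of the record together with one possible base16 value. The record and the signature of base16 come from the post; the bodies of the two fields (lowercase hex output, rejecting odd-length or non-hex input) are assumptions made for illustration only.

```haskell
import Data.Char (digitToInt, intToDigit, isHexDigit)
import Data.Word (Word8)

-- The record as given in the post: 'encode' is total, 'decode' may fail.
data Codec base derived = MkCodec
  { encode :: derived -> base
  , decode :: base -> Maybe derived
  }

type ASCII = String

-- Assumed implementation: two lowercase hex digits per byte.
base16 :: Codec ASCII [Word8]
base16 = MkCodec { encode = concatMap toHex, decode = fromHex }
  where
    toHex w = [ intToDigit (fromIntegral w `div` 16)
              , intToDigit (fromIntegral w `mod` 16) ]
    -- Consume two hex digits at a time; anything else is a decoding failure.
    fromHex (a:b:rest)
      | isHexDigit a && isHexDigit b =
          (fromIntegral (digitToInt a * 16 + digitToInt b) :) <$> fromHex rest
    fromHex [] = Just []
    fromHex _  = Nothing
```

In GHCi, `encode base16 [0xde,0xad,0xbe,0xef]` reads as "encode in base16 the following data", and `decode base16` turns the resulting string back into bytes or reports failure with Nothing.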
A similar result could be gotten by using phantom types, right?
Most likely, although I'm not sure whether the choice from your blog is the right one. I mean, the only-a-little-bit-phantom type

    newtype Base16 a = Base16 { unBase16 :: a } deriving (Eq,Show)

will do the job too

    instance DataEncoding Base16 where
        encode      = Base16 . b16Encode
        decode      = b16Decode . unBase16
        chop n      = Base16 . b16chop n . unBase16
        unchop      = Base16 . b16unchop . unBase16
        liberate    = unBase16
        incarcerate = Base16

Usually, the "normal" phantom type approach would be to make the encoding a phantom argument of a string type, not the other way round:

    newtype EncodedString enc = ES String
    data Base16  -- empty type, no constructors

    instance DataEncoding (EncodedString Base16) where ...

But your idea of fixing the encoding in the type for more type safety is good. Another way to do that would be to have an abstract data type

    -- this is not a String, this is base16-encoded data!
    newtype Base16 = Base16 String

with functions

    encode :: [Word8] -> Base16
    decode :: Base16  -> [Word8]

and functions

    encode :: Base16 -> String
    decode :: String -> Maybe Base16

The "normal" phantom type approach has the advantage of making the last functions polymorphic

    encode :: EncodedString enc -> String
    decode :: String -> EncodedString enc

    encode (ES s) = s
    decode s      = ES s

at the expense of shifting the possible failure to

    decode :: EncodedString Base16 -> Maybe [Word8]

Of course, you can use both phantom types and the codec approach, eliminating the need for a type class

    base16 :: Codec [Word8] (EncodedString Base16)
    string :: Codec (EncodedString a) String
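
The "normal" phantom type variant can be sketched as follows. The newtype and the empty tag type are taken from the post. The names wrap, unwrap, encodeBase16 and decodeBase16 are hypothetical and were chosen only to avoid the repeated encode/decode names used in the discussion. The hex conversion itself is an assumed implementation.

```haskell
import Data.Char (digitToInt, intToDigit, isHexDigit)
import Data.Word (Word8)

-- The encoding is a phantom argument: it appears only in the type.
newtype EncodedString enc = ES String deriving (Eq, Show)

data Base16  -- empty type, no constructors

-- Polymorphic in the tag, and neither direction can fail.
unwrap :: EncodedString enc -> String
unwrap (ES s) = s

wrap :: String -> EncodedString enc
wrap = ES

encodeBase16 :: [Word8] -> EncodedString Base16
encodeBase16 = ES . concatMap toHex
  where
    toHex w = [ intToDigit (fromIntegral w `div` 16)
              , intToDigit (fromIntegral w `mod` 16) ]

-- The possible failure has moved here: a tagged string may hold
-- arbitrary characters, because 'wrap' performs no validation.
decodeBase16 :: EncodedString Base16 -> Maybe [Word8]
decodeBase16 (ES s) = go s
  where
    go (a:b:rest)
      | isHexDigit a && isHexDigit b =
          (fromIntegral (digitToInt a * 16 + digitToInt b) :) <$> go rest
    go [] = Just []
    go _  = Nothing
```

The trade-off described above is visible in the types: wrap accepts any String for any tag, so decodeBase16 has to return a Maybe.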
But then there must be some way of liberating the result. I'm not sure yet whether they are worth it.
AFAIU the example from above then changes to
    encode [0xde,0xad,0xbe,0xef] :: Base16 ASCII
Concerning the choice between encoding the encoding (... ;-) in the types (like Base16) or as values (like base16 :: Codec ...), the observation is that you have to specify the encoding anyway :) either as type annotation ("type argument")

    encode [0xde,0xad,0xbe,0xef] :: EncodedString Base16
    encode' (undefined :: Base16) [0xde,0xad,0xbe,0xef]

or as value argument

    encode base16 [0xde,0xad,0xbe,0xef]

In this case, I would prefer the value argument approach for its brevity and mnemonics ("encode in base16 the following data"). However, possible strong type guarantees usually are a good argument for the typed approach. To be honest, I'm not really sure whether strong types would gain us anything here.
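
To make the two "type argument" spellings concrete, here is a small sketch. The class name Encoding and the method name encodeT are hypothetical; encode' with a dummy first argument follows the call shown in the post. Only the encoding direction is shown.

```haskell
import Data.Char (intToDigit)
import Data.Word (Word8)

newtype EncodedString enc = ES String deriving (Eq, Show)

data Base16  -- empty tag type

-- Hypothetical class: the instance is selected by the result type alone.
class Encoding enc where
  encodeT :: [Word8] -> EncodedString enc

instance Encoding Base16 where
  encodeT = ES . concatMap hex
    where
      hex w = [ intToDigit (fromIntegral w `div` 16)
              , intToDigit (fromIntegral w `mod` 16) ]

-- The dummy argument is never evaluated; it only fixes the type.
encode' :: Encoding enc => enc -> [Word8] -> EncodedString enc
encode' _ = encodeT

-- Selecting the encoding by type annotation ...
byAnnotation :: EncodedString Base16
byAnnotation = encodeT [0xde,0xad,0xbe,0xef]

-- ... or by passing a dummy value of the tag type.
byDummy :: EncodedString Base16
byDummy = encode' (undefined :: Base16) [0xde,0xad,0xbe,0xef]
```

Either way the encoding has to be named somewhere, which is the point made above; the value-argument form `encode base16 ...` simply names it as an ordinary argument.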
Also, I don't have a clue about what chop and unchop are supposed to do.
For some encodings there are standard ways of splitting an encoded string over several lines. Unfortunately it's not always as simple as just splitting a string at a particular length. Uuencode is the most complicated I've come across so far. That's what chop/unchop is for.
Ah, that's what they are for. An idea would be to build the line length into the encoding, like

    base16 :: Int -> Codec [Word8] [String]

with the intention that

    encode (base16 70) x

will encode x with a line length of 70 characters. Hm, should

    decode (base16 70) s

fail when the lines are not 70 characters in length, or should it accept any line length? Maybe it should be

    base16 :: Maybe Int -> Codec [Word8] [String]

since the programmer may choose not to wrap lines anyway. But perhaps the line length is best paired with the data

    base16 :: Codec ([Word8], Maybe Int) [String]

so that

    encode base16 (x, Just 70)

will encode x with a line length of 70 characters and

    let (_, ll) = decode base16 s in ...

will return the parsed line length in ll.

Oh my lambda, it's wondrous how Haskell gives so many possibilities to ponder for such a seemingly innocent API design problem :)

Regards,
apfelmus
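
A minimal sketch of the Maybe Int variant, restricted to the simple case of fixed-width splitting (it does not attempt the more involved rules of formats such as uuencode). The name lineWrap is hypothetical, and the type parameters follow the record definition quoted at the top of the thread, with the encoded form first. The sketch takes the lenient answer to the question above: decoding accepts any line length.

```haskell
-- Codec record as defined at the top of the thread.
data Codec base derived = MkCodec
  { encode :: derived -> base
  , decode :: base -> Maybe derived
  }

-- Split a string into lines of at most n characters.
-- A non-positive n is treated as "do not wrap".
chop :: Int -> String -> [String]
chop n s
  | n <= 0    = [s]
  | null s    = []
  | otherwise = let (line, rest) = splitAt n s in line : chop n rest

-- Nothing means "no wrapping"; Just n wraps at n characters.
-- Decoding is lenient: it joins the lines whatever their length.
lineWrap :: Maybe Int -> Codec [String] String
lineWrap width = MkCodec
  { encode = maybe (\s -> [s]) chop width
  , decode = Just . concat
  }
```

A strict variant would instead check in decode that every line except the last has exactly the requested length and return Nothing otherwise.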