Re: [Haskell-cafe] Re: String vs ByteString

18 Aug 2010

On Tue, Aug 17, 2010 at 12:30, Donn Cave wrote:
...
If Haskell had the development resources to make something like this
work, would it actually take the form of a Haskell-level type like
that - data Text = (Encoding, ByteString)?  I mean, I know that's
just a very clear and convenient way to express it for the purposes
of the present discussion, and actual design is a little premature -
... but, I think you could argue that from the Haskell level,
`Text' should be a single type, if the encoding differences aren't
semantically interesting.

It should be possible to create a Ruby-style Text in Haskell, using
the existing Text API. The constructor would be something like << data
Text = Text !Encoding !ByteString >>, but there's no need to export
it. The only significant improvements, performance-wise, would be that
1) "encoding" text to its internal encoding would be O(1) and 2)
"decoding" text would only have to perform validation, instead of
validation+copy+stream fusion muck. Downside: lazy decoding makes it
very difficult to reason about failures, since even simple operations
like 'append' might fail if you try to append two texts with
mutually-incompatible characters.
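
A minimal sketch of what such a type might look like, for illustration only. Everything except the constructor shape is an assumption: the three encodings, the names decode, encodeAs, append and encodingOf, and the simplification that a byte >= 0x80 can never be carried over into a different encoding (a real implementation would consult conversion tables). To keep failures visible, this sketch makes 'append' return a Maybe rather than failing lazily.

```haskell
import qualified Data.ByteString as B
import Data.ByteString (ByteString)

-- Hypothetical set of supported encodings, kept small for the sketch.
data Encoding = ASCII | Latin1 | ISO8859_5
  deriving (Eq, Show)

-- The constructor from the post. In a real module it would not be exported.
data Text = Text !Encoding !ByteString

-- Hypothetical accessor for the encoding tag.
encodingOf :: Text -> Encoding
encodingOf (Text enc _) = enc

isAscii :: ByteString -> Bool
isAscii = B.all (< 0x80)

-- "Decoding" only validates; the bytes are kept as they are, with no copy.
decode :: Encoding -> ByteString -> Maybe Text
decode ASCII bytes
  | isAscii bytes = Just (Text ASCII bytes)
  | otherwise     = Nothing
decode enc bytes = Just (Text enc bytes) -- every byte is accepted here

-- "Encoding" to the internal encoding just hands back the stored bytes.
-- Assumption: across encodings only the shared ASCII range converts.
encodeAs :: Encoding -> Text -> Maybe ByteString
encodeAs target (Text enc bytes)
  | target == enc                    = Just bytes
  | target /= ASCII && isAscii bytes = Just bytes
  | otherwise                        = Nothing

-- Appending fails when both sides carry non-ASCII bytes in different
-- encodings, which is the failure mode described above.
append :: Text -> Text -> Maybe Text
append (Text e1 b1) (Text e2 b2)
  | e1 == e2   = Just (Text e1 (B.append b1 b2))
  | isAscii b2 = Just (Text e1 (B.append b1 b2))
  | isAscii b1 = Just (Text e2 (B.append b1 b2))
  | otherwise  = Nothing
```

With these definitions, appending a Latin-1 text containing 0xE9 to an ISO-8859-5 text containing Cyrillic bytes gives Nothing, while appending pure-ASCII text to either one succeeds and keeps the non-ASCII side's encoding.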

In any case, I suspect getting Haskell itself to support non-Unicode
characters is much more difficult than writing an appropriate Text
type.

John Millikin