Re: [Haskell-cafe] Re: String vs ByteString

15 Aug 2010


Quoth John Millikin,
> ...
> I don't see why [Char] is "obvious" -- you'd never use [Word8] for
> storing binary data, right? [Char] is popular because it's the default
> type for string literals, and due to simple inertia, but when there's
> a type based on packed arrays there's no reason to use the list
> representation.

Well, yes, string literals - and pattern matching support, maybe
that's the same thing.  And I think it's fair to say that [Char]
is a natural, elegant match for the language, I mean it leverages
your basic Haskell skills if for example you want to parse something
fairly simple.  So even if ByteString weren't the monumental hassle
it is today for simple stuff, String would have at least a little appeal.
And if packed arrays really always mattered, [Char] would be long gone.
They don't, you can do a lot of stuff with [Char] before it turns into
a problem.
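
To make the pattern matching point concrete, here is a minimal sketch of
the kind of "fairly simple" parsing that falls out of ordinary list
patterns on [Char]. The function names (parseKeyValue, isComment) and the
"key=value" line format are made up for illustration, not taken from any
library.

```haskell
-- Hypothetical example: split a "key=value" line at the first '=',
-- using nothing but list pattern matching on String (that is, [Char]).
parseKeyValue :: String -> Maybe (String, String)
parseKeyValue = go ""
  where
    go _   []           = Nothing                   -- ran out of input, no '='
    go acc ('=' : rest) = Just (reverse acc, rest)  -- key is what we collected
    go acc (c : rest)   = go (c : acc) rest         -- keep collecting the key

-- Hypothetical example: a leading character is just a cons pattern.
isComment :: String -> Bool
isComment ('#' : _) = True
isComment _         = False
```

For instance, parseKeyValue "name=donn" gives Just ("name","donn"), and a
line with no '=' gives Nothing.
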
> ...
> Also, despite the name, ByteString and Text are for separate purposes.
> ByteString is an efficient [Word8], Text is an efficient [Char] -- use
> ByteString for binary data, and Text for...text. Most mature languages
> have both types, though the choice of UTF-16 for Text is unusual.

Maybe most mature languages have one or more extra string types
hacked on to support wide characters.  I don't think it's necessarily
a virtue.  ByteString vs. ByteString.Char8, where you can choose
more or less indiscriminately to treat the data as Char or Word8,
seems to me like a more useful way to approach the problem.  (Of
course, ByteString.Char8 isn't a good way to deal with wide characters
correctly, I'm just saying that's where I'd like to find the answer,
not in some internal character encoding into which all "text" data
must be converted.)
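
A small sketch of what I mean by the two views, assuming the bytestring
package that ships with GHC. The top-level names here (greeting, asBytes,
asChars, truncated) are only for illustration. The last definition shows
the wide character problem: Data.ByteString.Char8.pack keeps only the low
8 bits of each Char.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

-- One value; both modules operate on the same ByteString type.
greeting :: B.ByteString
greeting = BC.pack "hello"

-- The Word8 view of the data.
asBytes :: [Word8]
asBytes = B.unpack greeting    -- [104,101,108,108,111]

-- The Char view of exactly the same data.
asChars :: String
asChars = BC.unpack greeting   -- "hello"

-- Char8 is not a wide character solution: U+03BB (lambda) is cut down
-- to its low byte, 0xBB.
truncated :: [Word8]
truncated = B.unpack (BC.pack "\955")   -- [187]
```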

	Donn Cave, donn@avvanta.com