
On Tue, Apr 25, 2006 at 02:34:20PM +0100, Simon Marlow wrote:
> Duncan Coutts wrote:
>> How would we distinguish a full fixed-width 4-byte Unicode version?
>
> Good point, and that's why using the Data.PackedString hierarchy was nice, because it accommodated various different character widths. I quite like
>
>   Data.ByteString
>   Data.PackedString.Latin1
>   Data.PackedString.UTF8
>   Data.PackedString.UCS4
>   etc.
Do we really need all of these? UCS4BE? UTF16? If you care intimately about the underlying binary representation, then you should be using ByteString directly, since you are working with binary data. If you just want a fast string replacement, then you don't care about the internal representation; you just want it to be fast. We don't want issues where someone's library takes UTF8 strings but someone else's takes UCS4 strings and you want them to play nice together.

I think all we really need are

  Data.ByteString
  Data.PackedString

(Though I suppose Latin1 could be useful.)

But note: do the people who want Latin1 just need ASCII? If we have a UTF8 PackedString, then we can make ASCII-specific access routines that are just as fast as the ones in the Latin1 variety, without giving up the ability to store full Unicode values in the string.

        John

--
John Meacham - ⑆repetae.net⑆john⑈
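
To illustrate the point about ASCII-specific routines on a UTF8 PackedString, here is a minimal sketch. It relies on one property of UTF-8: every byte of a multi-byte sequence is 0x80 or above, so a byte below 0x80 is always a complete ASCII character, and a plain byte scan cannot match in the middle of another character. The names Utf8String, asciiElemIndex and asciiSplit are hypothetical, made up for this example; they are not part of Data.PackedString or any proposed API, and the wrapper simply assumes its bytes are valid UTF-8.

```haskell
import qualified Data.ByteString as B
import Data.Char (isAscii, ord)

-- Hypothetical wrapper: a packed string whose bytes are ASSUMED to be
-- valid UTF-8. Nothing here validates that assumption.
newtype Utf8String = Utf8String B.ByteString
  deriving (Eq, Show)

-- Byte offset of the first occurrence of an ASCII character.
-- For ASCII input this is the same byte scan a Latin1 string would do.
-- Non-ASCII characters are rejected here; they would need a real decoder.
asciiElemIndex :: Char -> Utf8String -> Maybe Int
asciiElemIndex c (Utf8String bs)
  | isAscii c = B.elemIndex (fromIntegral (ord c)) bs
  | otherwise = Nothing

-- Split on an ASCII delimiter without decoding anything.
-- Each piece is still valid UTF-8, because the delimiter byte can never
-- occur inside a multi-byte sequence.
asciiSplit :: Char -> Utf8String -> [Utf8String]
asciiSplit c s@(Utf8String bs)
  | isAscii c = map Utf8String (B.split (fromIntegral (ord c)) bs)
  | otherwise = [s]

-- "naïve world" encoded as UTF-8; the 'ï' occupies two bytes (0xC3 0xAF).
sample :: Utf8String
sample = Utf8String (B.pack
  [0x6E, 0x61, 0xC3, 0xAF, 0x76, 0x65, 0x20, 0x77, 0x6F, 0x72, 0x6C, 0x64])
```

Note that asciiElemIndex returns a byte offset, not a character index: the two differ as soon as a multi-byte character precedes the match. Whether an API should expose byte offsets at all is a separate design question that this sketch does not settle.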