
Jed Brown
Uh, ByteString is Unicode-agnostic. ByteString.Char8 is not. So why not do IO with lazy ByteString and parse into your own representation (which might look a lot like StorableVector)?
One problem you might run into doing it this way is that a wide character can be split between two different arrays. In that case you have to do some post-processing to put the pieces back together. It would be more efficient, I think, if you could force a given alignment when reading in the lazy bytestring. But there's not a way to do that, is there? I hope this makes sense. It's the problem I ran into when I once tried to use lazy bytestrings instead of a storable vector, reasoning that the more recent fusion work in bytestring would give a speed boost. But then I was doing numerical stuff, and I don't know much about Unicode.
Chad
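
To make the split-character problem concrete, here is a rough sketch of the post-processing step, assuming the input is UTF-8 (the thread does not say which encoding is in play). It carries any incomplete trailing bytes of one chunk over to the front of the next, so that every chunk ends on a character boundary. The names `seqLen`, `isCont`, `splitTail`, `realign` and `realignLazy` are made up for this illustration and are not part of bytestring; the code does no validation of the input.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Length of the UTF-8 sequence introduced by a lead byte.
-- Invalid lead bytes are treated as length 1 (no validation in this sketch).
seqLen :: Word8 -> Int
seqLen b
  | b .&. 0x80 == 0x00 = 1
  | b .&. 0xE0 == 0xC0 = 2
  | b .&. 0xF0 == 0xE0 = 3
  | b .&. 0xF8 == 0xF0 = 4
  | otherwise          = 1

-- Continuation bytes have the bit pattern 10xxxxxx.
isCont :: Word8 -> Bool
isCont b = b .&. 0xC0 == 0x80

-- Split a strict chunk into (complete characters, incomplete trailing bytes).
-- Scans backwards over at most the last four bytes looking for a lead byte.
splitTail :: B.ByteString -> (B.ByteString, B.ByteString)
splitTail bs = go (n - 1)
  where
    n = B.length bs
    go i
      | i < 0 || n - i > 4            = (bs, B.empty)  -- no lead byte found
      | isCont (B.index bs i)         = go (i - 1)
      | i + seqLen (B.index bs i) > n = B.splitAt i bs -- sequence runs past the end
      | otherwise                     = (bs, B.empty)

-- Re-chunk so that no character straddles two chunks.
realign :: [B.ByteString] -> [B.ByteString]
realign = go B.empty
  where
    go carry []       = [carry | not (B.null carry)]
    go carry (c : cs) =
      let (whole, rest) = splitTail (B.append carry c)
      in  [whole | not (B.null whole)] ++ go rest cs

-- The same operation on a lazy ByteString.
realignLazy :: BL.ByteString -> BL.ByteString
realignLazy = BL.fromChunks . realign . BL.toChunks
```

Each realigned chunk can then be parsed independently into whatever representation you like. The cost is one `B.append` (a copy) per chunk, which is the overhead that forcing an alignment at read time would avoid.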