
Andrew Coppin wrote:
Now, as I understand it, a ByteString is a kind of unboxed array (= big RAM savings + big CPU time savings for not building it + big GC savings for not processing millions of list nodes + better cache performance). Or at least, a *strict* ByteString is; I'm very very fuzzy on exactly how a *lazy* ByteString is any different to a normal list. From my reading today, I take it the only real difference is that one is a linked list, whereas the other is a (boxed?) array, so it's smaller. (?)
As I understand it (wich may or may not be correct): A normal Haskell string is basically [Word8] A strict ByteString ist basically UArray Int Word8 A lazy ByteString is basically [UArray Int Word8] [Word8] is processed one byte at once UArray Int Word8 is processed all at once [UArray Int Word8] is processed a chunk of bytes at once So loading and processing [Word8] corresponds to while (!eof(file)) { Byte current = readByteFromFile(file); processByte(current); } loading and processing a strict ByteString corresponds to int size = readWholeFileIntoBuffer(file, buffer); for (int i = 0; i < size; i++) processByte(buffer[i]); and loading and processing a lazy ByteString corresponds to while (!eof(file)) { int size = readNextNBytesIntoBuffer(file, buffer, buffersize); for (int i = 0; i < size; i++) processByte(buffer[i]); } in a C-like language. The first is nonsense because of I/O overhead, the second is fine for small files only and the third combines the best of both worlds. Unlike the C examples, Haskell lists face not only I/O overhead, but general overhead of boxed representation and laziness, too. But even in C, the third solution ist the best, but some programs don't use it. So Bytestring powered Haskell programs are able to outperform some C programs. The most important contributions of the ByteStrings library seem to be these: (1) unify lazy and strict Bytestrings interface-wise and (2) allow idiomatic Haskell programs using lazy ByteStrings to be compiled to something like the third c program above. Tillmann