
On Thu, 2009-12-24 at 18:18 -0500, Antoine Latter wrote:
Folks,
I found some of the documentation in GHC.Prim confusing - so I thought I'd share. The documentation for the ByteArray# type[1] explains that it's a raw region in memory that also remembers its size.
Consequently I expected sizeofByteArray# to return the same number that I passed in to newByteArray#. But it doesn't - it returned however much it decided to allocate, which on my platform is always a multiple of four bytes.
Yes, this is an artefact of the fact that ghc measures heap stuff in units of words.
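
To make the word-rounding concrete, here is a minimal sketch that allocates a few byte arrays and prints the requested size next to what the size primop reports (GHC.Exts spells it sizeofByteArray#). The helper names reportedSize and roundUpTo are made up for this illustration and are not part of any library. Which number you actually see depends on the GHC version, so the program also prints the word-rounded value for comparison rather than assuming either result.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Foreign.Storable (sizeOf)
import GHC.Exts (Int (I#), newByteArray#, sizeofByteArray#, unsafeFreezeByteArray#)
import GHC.IO (IO (IO))

-- Hypothetical helper: allocate a ByteArray# of n bytes and return
-- whatever sizeofByteArray# reports for it.
reportedSize :: Int -> IO Int
reportedSize (I# n) = IO (\s0 ->
  case newByteArray# n s0 of
    (# s1, mba #) ->
      case unsafeFreezeByteArray# mba s1 of
        (# s2, ba #) -> (# s2, I# (sizeofByteArray# ba) #))

-- Hypothetical helper: round a byte count up to the next multiple of
-- the given word size (in bytes).
roundUpTo :: Int -> Int -> Int
roundUpTo wordBytes n = ((n + wordBytes - 1) `div` wordBytes) * wordBytes

main :: IO ()
main = mapM_ report [0, 1, 3, 5, 8, 13]
  where
    -- Size of a machine word in bytes on this platform.
    wordBytes = sizeOf (0 :: Int)
    report n = do
      got <- reportedSize n
      putStrLn ("requested " ++ show n
                ++ ", reported " ++ show got
                ++ ", word-rounded " ++ show (roundUpTo wordBytes n))
```

If the reported column matches the word-rounded column, the size is being kept in words as described above; if it matches the requested column, your GHC keeps a byte length.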
This is something which could be clarified in the documentation.
It would be jolly useful for making short strings for GHC's ByteArray# to use a byte length rather than a word length. It'd mean a little more bit twiddling in the GC code that looks at ByteArray#s; however, it'd save an extra 2 words in a short string type (or allow us to store '\0' characters in short strings).

It's been on my TODO list for some time to design a portable low-level ByteArray module that could be implemented by hugs, nhc, ghc, etc. The aim would be to be similar to ForeignPtr + Storable but using native heap-allocated memory blocks. In turn this would be the right portable layer on which to build ByteString, Text and probably IO buffers too.

Duncan