
How would one implement a CStringLen-style type that (a) was efficient, in particular could be read from and written to Handles efficiently; (b) would be deallocated automatically by the Haskell garbage collector once Haskell no longer referred to it; (c) was immutable, so that once created the character data could not be changed; (d) consequently had the property that conversion to and from String would be a pure operation; (e) could be passed to and from C using the FFI? (Of course C would need you to split the length and character data up; the character data would presumably have type "const char *".) It would be rather nice to have such a type.
Data.PackedString is *almost* what you want, and could be tweaked to do the right thing (at least in GHC). There are two problems: (1) the representation is currently as an array of 32-bit Unicode chars, whereas you probably want 8-bit ISO-8859 or something. (2) Passing to FFI functions: to make this work you can use pinned byte-arrays instead of ordinary byte-arrays to store the string, and an explicit touch# after the FFI call.

Cheers,
Simon

Simon Marlow wrote (snipped):
> Data.PackedString is *almost* what you want, and could be tweaked to do the right thing (at least in GHC).

As a matter of fact, my first attempt used Data.PackedString, until my code fell over because of the hPutPS (or was it hGetPS) bug I reported recently.

> There are two problems: (1) the representation is currently as an array of 32-bit Unicode chars, whereas you probably want 8-bit ISO-8859 or something.

Also it seems that hPutPS insists on constructing a String as a half-way stage, which doesn't seem very efficient. In my particular application I don't much care if writing very short strings is inefficient, but I do very much care that writing long strings should be efficient.

> (2) Passing to FFI functions: to make this work you can use pinned byte-arrays instead of ordinary byte-arrays to store the string, and an explicit touch# after the FFI call.
I am grateful for Alastair Reid's solution, but it seems too complicated. In particular, I really don't want to have to write C code to take the structure apart and reassemble it again, and I don't think I need the reference counts. So instead what I've done is implement

   data ICStringLen = ICStringLen (ForeignPtr CChar) Int

and functions

   mkICStringLen :: Int -> (CString -> IO()) -> IO ICStringLen
   withICStringLen :: ICStringLen -> (Int -> CString -> IO a) -> IO a

which can be implemented easily enough, and are pretty much all that is required for my limited application.

All the same I think there is a case for having immutable CStrings, and similar things, more widely available. For example, it's annoying having to remember to manually free things (and indeed work out what variety of "free" to use), and it would not surprise me if this turns out to be a major source of bugs in the future. It seems to me that immutable CStrings ought also to be a useful way of storing large quantities of (ASCII or UTF8-encoded) character data.
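For readers following the thread, the ICStringLen type and the two functions named in this message could be sketched roughly as below. This is only one plausible implementation, not the code actually used: it assumes mallocForeignPtrBytes for allocation (which GHC backs with pinned, garbage-collected memory) and withForeignPtr to keep the buffer alive across the foreign call. The helpers icsFromString, icsToString and hPutICStringLen are hypothetical names added for illustration, and the String conversions assume 8-bit characters only.

```haskell
import Foreign.C.String (CString, castCCharToChar, castCharToCChar)
import Foreign.C.Types (CChar)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Marshal.Array (peekArray, pokeArray)
import System.IO (Handle, hPutBuf)
import System.IO.Unsafe (unsafePerformIO)

-- Immutable character data plus its length; the buffer is reclaimed
-- by the garbage collector once the ForeignPtr becomes unreachable.
data ICStringLen = ICStringLen (ForeignPtr CChar) Int

-- Allocate a buffer of the given length and let the supplied action
-- fill it in. Nothing may write to the buffer after the action returns;
-- that convention is what makes the value immutable.
mkICStringLen :: Int -> (CString -> IO ()) -> IO ICStringLen
mkICStringLen len fill = do
  fp <- mallocForeignPtrBytes len
  withForeignPtr fp fill
  return (ICStringLen fp len)

-- Hand the length and the raw pointer to an action (for example an FFI
-- call). withForeignPtr keeps the buffer alive until the action finishes.
withICStringLen :: ICStringLen -> (Int -> CString -> IO a) -> IO a
withICStringLen (ICStringLen fp len) act = withForeignPtr fp (act len)

-- Hypothetical helper: pure conversion from String. Assumes every Char
-- fits in 8 bits; castCharToCChar truncates anything larger.
icsFromString :: String -> ICStringLen
icsFromString s = unsafePerformIO $
  mkICStringLen (length s) (\p -> pokeArray p (map castCharToCChar s))

-- Hypothetical helper: pure conversion back to String. peekArray reads
-- the whole buffer before returning, so no pointer escapes.
icsToString :: ICStringLen -> String
icsToString ics = unsafePerformIO $
  withICStringLen ics (\len p -> fmap (map castCCharToChar) (peekArray len p))

-- Hypothetical helper: write the bytes straight to a Handle without
-- building an intermediate String.
hPutICStringLen :: Handle -> ICStringLen -> IO ()
hPutICStringLen h ics = withICStringLen ics (\len p -> hPutBuf h p len)
```

The use of unsafePerformIO in the two conversions is defensible only because the buffer is never modified after mkICStringLen returns; if any code kept and wrote through the CString, the conversions would no longer be pure.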

| How would one implement a CStringLen-style type that
| [...]

Take a look at:

  http://www.cs.chalmers.se/Cs/Grundutb/Kurser/afp/Lab2/Chunk.hs

Is that what you want?

/Koen

Koen Claessen wrote:
> | How would one implement a CStringLen-style type that
> | [...]
> Take a look at:
> http://www.cs.chalmers.se/Cs/Grundutb/Kurser/afp/Lab2/Chunk.hs
> Is that what you want?

Yes, that's much more like it. Any chance of this module getting put into the standard distribution?
participants (3)
- George Russell
- Koen Claessen
- Simon Marlow