Re: portable encoding/decoding without going via a handle

Hi Judah,
On 25/11/2012 17:19, Judah Jacobson wrote:
I think some way to use BufferCodecs without going through the filesystem would be very useful. One other approach would be Bytestring-backed Handles; there was talk of them in the past, but I don't know of any actual packages for it yet. That might be a simpler approach than manipulating BufferCodecs directly, since you could just use the functions from Data.Text.IO and all of the buffering and error recovery would get taken care of automatically.
This 2009 email from Simon Marlow references bytestring-backed handles and has a code sample for memory-mapped files that might be helpful:
http://www.haskell.org/pipermail/glasgow-haskell-users/2009-December/018124....
He also mentions bytestring-backed Handles in this talk: http://community.haskell.org/~simonmar/GHC-IO.pdf That code's probably bitrotted a little, but seems like a good place to start.
Thanks for the pointers!
Do you have any thoughts on what the API for creating a bytestring-backed handle should be? I'm particularly thinking of the case where we are writing to the bytestring - the type could be something like
makeWritableByteStringHandle :: IO (Handle, ByteString)
but then we would end up with a ByteString value that could be mutated while it is being used. It might be nicer to have it in two phases, e.g.
makeWritableByteStringHandle :: IO Handle
finishByteStringHandle :: Handle -> IO ByteString
but since Handle is a single type rather than a type class, that's not implementable. Perhaps:
makeWritableByteStringHandle :: IO (Handle, IO ByteString)
where the embedded IO action is only valid after the Handle has been hClose'd?
Cheers, Ganesh
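To make the last signature concrete, here is a minimal sketch of its shape only. It is not an in-memory implementation: a temporary file stands in for the buffer (a real bytestring-backed Handle would need GHC's IO internals), and the error message and file name template are made up for illustration. The point is just that the returned IO action can check that the Handle has been closed before handing out the bytes.

```haskell
import Control.Monad (unless)
import qualified Data.ByteString as B
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical stand-in: the "buffer" is a temporary file, not memory.
-- Only the API shape (Handle plus a finishing action) is the point here.
makeWritableByteStringHandle :: IO (Handle, IO B.ByteString)
makeWritableByteStringHandle = do
  dir <- getTemporaryDirectory
  (path, h) <- openBinaryTempFile dir "bs-handle.tmp"
  let finish = do
        -- Refuse to produce the ByteString while writes are still possible.
        closed <- hIsClosed h
        unless closed $ ioError (userError "finish called before hClose")
        bytes <- B.readFile path   -- strict read; the file is closed afterwards
        removeFile path            -- so finish can only be run once
        return bytes
  return (h, finish)
```

A caller would write to the Handle, hClose it, and only then run the second component of the pair. Nothing in the types enforces that order, which is the weakness of this shape.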

On Tue, Nov 27, 2012 at 11:28 PM, Ganesh Sittampalam wrote:
Hi Judah,
On 25/11/2012 17:19, Judah Jacobson wrote:
I think some way to use BufferCodecs without going through the filesystem would be very useful. One other approach would be Bytestring-backed Handles; there was talk of them in the past, but I don't know of any actual packages for it yet. That might be a simpler approach than manipulating BufferCodecs directly, since you could just use the functions from Data.Text.IO and all of the buffering and error recovery would get taken care of automatically.
This 2009 email from Simon Marlow references bytestring-backed handles and has a code sample for memory-mapped files that might be helpful:
http://www.haskell.org/pipermail/glasgow-haskell-users/2009-December/018124....
He also mentions bytestring-backed Handles in this talk: http://community.haskell.org/~simonmar/GHC-IO.pdf That code's probably bitrotted a little, but seems like a good place to start.
Thanks for the pointers!
Do you have any thoughts on what the API for creating a bytestring-backed handle should be? I'm particularly thinking of the case where we are writing to the bytestring - the type could be something like
makeWritableByteStringHandle :: IO (Handle, ByteString)
but then we would end up with a ByteString value that could be mutated while it is being used. It might be nicer to have it in two phases, e.g.
makeWritableByteStringHandle :: IO Handle
finishByteStringHandle :: Handle -> IO ByteString
but since Handle is a single type rather than a type class, that's not implementable. Perhaps:
makeWritableByteStringHandle :: IO (Handle, IO ByteString)
where the embedded IO action is only valid after the Handle has been hClose'd?
How about something like this?
createFromHandle :: (Handle -> IO ()) -> IO ByteString
That would follow the pattern of the create.* functions from Data.ByteString.Internal, e.g.
create :: Int -> (Ptr Word8 -> IO ()) -> IO ByteString
-Judah
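For readers who have not used the create pattern, a small sketch of how create from Data.ByteString.Internal is called may help. The helper name countingBytes is made up for illustration. The mutable buffer is visible only inside the callback, and the caller sees an immutable ByteString once the callback has returned. That is the property createFromHandle would borrow to avoid handing out a ByteString that is still being written.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Internal (create)
import Data.Word (Word8)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeByteOff)

-- Illustrative helper (not from the thread): build an n-byte ByteString
-- holding 0, 1, 2, ... (wrapping at 256) by filling the buffer in place.
countingBytes :: Int -> IO B.ByteString
countingBytes n = create n fill
  where
    -- The pointer is only valid inside this action; it never escapes.
    fill :: Ptr Word8 -> IO ()
    fill p = mapM_ (\i -> pokeByteOff p i (fromIntegral i :: Word8)) [0 .. n - 1]
```

The proposed createFromHandle has the same structure, with a Handle in place of the Ptr Word8 and no size argument.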

On 28/11/2012 08:41, Judah Jacobson wrote:
How about something like this?
createFromHandle :: (Handle -> IO ()) -> IO ByteString
Of course, thanks! I've knocked something up at http://hub.darcs.net/ganesh/bytestring-handle
The signatures are:
readHandle :: Bool -> BL.ByteString -> IO Handle
writeHandle :: Bool -> (Handle -> IO a) -> IO (BL.ByteString, a)
I went for lazy bytestrings as they fit my use case and mostly generalise strict ones in this context. One exception is that writing directly to a strict one could avoid a copy if you know the max size up front.
Comments etc. welcome. I've written some tests, but they are not very comprehensive, and I wouldn't be at all surprised if the seek behaviour is completely broken.
Cheers, Ganesh
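As a rough sketch of how these two functions could serve the original goal of encoding and decoding without the filesystem, something like the following might work. Several things here are assumptions rather than facts from the thread: the module name Data.ByteString.Handle, the reading of the Bool argument as a binary-mode flag, and writeHandle closing the Handle before returning the bytes. The helper names encodeString and decodeString are made up. The encoding and newline mode are set explicitly so the result does not depend on the locale.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Handle (readHandle, writeHandle) -- assumed module name
import System.IO

-- Encode a String by writing it to a bytestring-backed Handle.
-- The Bool is assumed to mean "binary mode"; it is overridden below anyway.
encodeString :: TextEncoding -> String -> IO BL.ByteString
encodeString enc s = do
  (bytes, _) <- writeHandle False $ \h -> do
    hSetEncoding h enc
    hSetNewlineMode h noNewlineTranslation
    hPutStr h s
  return bytes

-- Decode by reading the whole String back from a bytestring-backed Handle.
decodeString :: TextEncoding -> BL.ByteString -> IO String
decodeString enc bytes = do
  h <- readHandle False bytes
  hSetEncoding h enc
  hSetNewlineMode h noNewlineTranslation
  s <- hGetContents h
  -- hGetContents is lazy: force the whole String before closing the Handle.
  length s `seq` hClose h
  return s
```

Going through String is only for brevity. The same Handles should work with the Data.Text.IO functions mentioned earlier in the thread.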
participants (2)
- Ganesh Sittampalam
- Judah Jacobson