
What about the Data.Binary module from the Hackage database? I can call C, no problem, but I hate to do something that's already been done. On Wed, 2007-06-20 at 12:02 +1000, Donald Bruce Stewart wrote:
jeff:
I've read the documentation for some of the marshalling packages out there for Haskell, and I'm left confused as to which one I should be using and how to actually do what I want to do. I have a file, a little over 2gb, of packed data in the format
(recordcount) records of:
4-byte int (count), (count) 2-byte unsigned shorts, (count) 4-byte floats
all in little-endian order. What I want to do is read each record (lazily), and unpack it into Data.IntMap.IntMap Float where the unsigned shorts become the keys and the 4-byte floats become the values.
Then I want to do a lot of interesting processing which we'll skip here, and write back out packed data to a file in the format of
4-byte float, 4-byte float, 4-byte float
for each record. I need these output records to be four-byte C floats. I've gotten as far as datatypes and a couple of signatures, but I can't figure out the functions themselves that go with the signatures, and then again, maybe I have the signatures wrong.
-- import qualified Data.IntMap as M import qualified Data.ByteString.Lazy.Char8 as B
data InputRecord = M.IntMap Float data OutputRecord = (Float, Float, Float)
-- open a file as a lazy ByteString and break up the individual records -- by reading the count variable, reading that many bytes times -- sizeof short + sizeof float into a lazy ByteString. readRawRecordsFromFile :: String -> IO [B.ByteString]
-- take a bytestring as returned by readRawRecordsFromFile and turn it -- into a map. decodeRawRecord :: B.ByteString -> M.IntMap Float --
Can anyone help with how to construct these functions? I'm going to have to make a few passes over this file, so I'd like the IO to be as fast as Haskelly possible.
-- Jeff
Data.ByteString.Lazy.Char8.readFile should suffice for the IO. then use drop/take to split up the file in pieces if you know the length of each field.
For converting ByteString chunks to Floats, I'd probably call C for that.
-- Don