Re: [Haskell-cafe] representation on persistent store question

On Jan 1, 2009 11:50pm, "Galchin, Vasili"
it is a bioinformatics standard .. . I am writing on this newsgroup in order to try to be objective to get a "correct" and elegant answer .. in any case I am helping on the bioinformatics code (you can see on Hackage). I am trying to finish the 2Bit file format code ... it seems to me that bioinformatics as an area is not clearly defined .... e.g. it is unclear to me whether "offset" is a marshalled/serialized concept or or unmarshalled/unserialized concept ..... this distinction is very important .... I will have to think about more myself!
Regards, Vasili
Here's some code using Data.Binary to store data as offsets into a byte array. I haven't tested it too much, so it may have bugs. Maybe there's some inspiration in there. -Antoine
import Data.Binary import Data.Binary.Get import Data.Binary.Put
import Data.ByteString.Lazy (ByteString) import qualified Data.ByteString.Lazy as B data TestStruct = TestStruct { property1 :: ByteString , property2 :: ByteString , property3 :: ByteString } deriving Show {- The serialized format looks like (all big-endian): * first offset into data block (Word32) * second offset into data block (Word32) * third offset into data block (Word32) * length of bnary data block (Word32) * binary data block (Arbitrary binary data) -} instance Binary TestStruct where put struct = let data1 = property1 struct data2 = property2 struct data3 = property3 struct dataBlock = data1 `B.append` data2 `B.append` data3 offset1 = 0 offset2 = offset1 + B.length data1 offset3 = offset2 + B.length data2 in do putWord32be $ fromIntegral offset1 putWord32be $ fromIntegral offset2 putWord32be $ fromIntegral offset3 putWord32be $ fromIntegral $ B.length dataBlock putLazyByteString dataBlock get = do offset1 <- getWord32be offset2 <- getWord32be offset3 <- getWord32be dataBlockLength <- getWord32be dataBlock <- B.drop (fromIntegral offset1) `fmap` getLazyByteString (fromIntegral dataBlockLength) let (data1, rest1) = B.splitAt (fromIntegral $ offset2 - offset1) dataBlock (data2, rest2) = B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1 data3 = rest2 return $ TestStruct data1 data2 data3 <<<<<

dude .. you rock ... let me check it out ;^)
Vasili
On Fri, Jan 2, 2009 at 12:24 AM, Antoine Latter
it is a bioinformatics standard .. . I am writing on this newsgroup in order to try to be objective to get a "correct" and elegant answer .. in any case I am helping on the bioinformatics code (you can see on Hackage). I am
On Jan 1, 2009 11:50pm, "Galchin, Vasili"
wrote: trying to finish the 2Bit file format code ... it seems to me that bioinformatics as an area is not clearly defined .... e.g. it is unclear to me whether "offset" is a marshalled/serialized concept or or unmarshalled/unserialized concept ..... this distinction is very important .... I will have to think about more myself! Regards, Vasili
Here's some code using Data.Binary to store data as offsets into a byte array. I haven't tested it too much, so it may have bugs. Maybe there's some inspiration in there.
-Antoine
import Data.Binary import Data.Binary.Get import Data.Binary.Put
import Data.ByteString.Lazy (ByteString) import qualified Data.ByteString.Lazy as B
data TestStruct = TestStruct { property1 :: ByteString , property2 :: ByteString , property3 :: ByteString } deriving Show
{-
The serialized format looks like (all big-endian):
* first offset into data block (Word32) * second offset into data block (Word32) * third offset into data block (Word32) * length of bnary data block (Word32) * binary data block (Arbitrary binary data)
-} instance Binary TestStruct where put struct = let data1 = property1 struct data2 = property2 struct data3 = property3 struct
dataBlock = data1 `B.append` data2 `B.append` data3
offset1 = 0 offset2 = offset1 + B.length data1 offset3 = offset2 + B.length data2
in do putWord32be $ fromIntegral offset1 putWord32be $ fromIntegral offset2 putWord32be $ fromIntegral offset3
putWord32be $ fromIntegral $ B.length dataBlock putLazyByteString dataBlock
get = do offset1 <- getWord32be offset2 <- getWord32be offset3 <- getWord32be
dataBlockLength <- getWord32be dataBlock <- B.drop (fromIntegral offset1) `fmap` getLazyByteString (fromIntegral dataBlockLength)
let (data1, rest1) = B.splitAt (fromIntegral $ offset2 - offset1) dataBlock (data2, rest2) = B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1 data3 = rest2
return $ TestStruct data1 data2 data3 <<<<<
participants (2)
-
Antoine Latter
-
Galchin, Vasili