dude .. you rock ... let me check it out ;^)

Vasili


On Fri, Jan 2, 2009 at 12:24 AM, Antoine Latter <aslatter@gmail.com> wrote:
On Jan 1, 2009 11:50pm, "Galchin, Vasili" <vigalchin@gmail.com> wrote:
> it is a bioinformatics standard .. . I am writing on this newsgroup in order to try to be objective to get a "correct" and elegant answer .. in any case I am helping on the bioinformatics code (you can see on Hackage). I am trying to finish the 2Bit file format code ... it seems to me that bioinformatics as an area is not clearly defined .... e.g. it is unclear to me whether "offset" is a marshalled/serialized concept or or unmarshalled/unserialized concept ..... this distinction is very important .... I will have to think about more myself!
>
>
> Regards, Vasili
>


Here's some code using Data.Binary to store data as offsets into a
byte array.  I haven't tested it too much, so it may have bugs.  Maybe
there's some inspiration in there.

-Antoine

>>>>
import Data.Binary
import Data.Binary.Get
import Data.Binary.Put

import Data.ByteString.Lazy (ByteString)
import qualified Data.ByteString.Lazy as B

data TestStruct = TestStruct
   { property1 :: ByteString
   , property2 :: ByteString
   , property3 :: ByteString
   }
 deriving Show

{-

 The serialized format looks like (all big-endian):

 * first offset into data block (Word32)
 * second offset into data block (Word32)
 * third offset into data block (Word32)
 * length of bnary data block (Word32)
 * binary data block (Arbitrary binary data)

-}
instance Binary TestStruct where
   put struct =
       let data1 = property1 struct
           data2 = property2 struct
           data3 = property3 struct

           dataBlock = data1 `B.append` data2 `B.append` data3

           offset1 = 0
           offset2 = offset1 + B.length data1
           offset3 = offset2 + B.length data2

      in do
        putWord32be $ fromIntegral offset1
        putWord32be $ fromIntegral offset2
        putWord32be $ fromIntegral offset3

        putWord32be $ fromIntegral $ B.length dataBlock
        putLazyByteString dataBlock

   get = do
     offset1 <- getWord32be
     offset2 <- getWord32be
     offset3 <- getWord32be

     dataBlockLength <- getWord32be
     dataBlock <- B.drop (fromIntegral offset1) `fmap`
                  getLazyByteString (fromIntegral dataBlockLength)

     let (data1, rest1) =
             B.splitAt (fromIntegral $ offset2 - offset1) dataBlock
         (data2, rest2) =
             B.splitAt (fromIntegral $ offset3 - offset2 - offset1) rest1
         data3          = rest2

     return $ TestStruct data1 data2 data3
<<<<<