
Hi Guys,

I am trying to parse a binary stream that contains multiple entries, each of the form [headers, zlib compressed content]. I can use the zlib library to get the content of the first entry after its headers, but I cannot find a way to get the offset at which to start parsing the second entry. Is there a way I can get this information out of zlib, or is there a better approach to doing this? Any pointers would be very much appreciated.

Regards,
Rahul

On Wed, 20 Jul 2011, rahul wrote:
> [...] I cannot find a way to get the offset to start parsing for the second entry. Is there a way I can get this information out of ZLib? or is there a better approach to doing this?

As far as I know, these are compressors for single files. Multiple files can be compressed in combination with tar, and tar archives can be manipulated from Haskell using the 'tar' package.

Hi,

| As far as I know, these are compressors for single files. Multiple
| files can be compressed in connection with TAR, that can be
| manipulated from Haskell using the 'tar' package.

Unfortunately the binary protocol itself is external, so I can't use a different type of compression.

rahul
--
http://people.oregonstate.edu/~gopinatr/

On Wed, Jul 20, 2011 at 11:50 AM, rahul wrote:
> Unfortunately the binary protocol itself is external, so can't use a different type of compression

Perhaps something like this would work: https://gist.github.com/1096039

I didn't test to make sure it works, but you could probably hack together a working solution using Data.Enumerator.Binary.isolate and the zlib-enum package.

-n

Hi Nathan,

Thank you very much for the solution. Since I am somewhat new to Haskell, I am taking some time to digest it :). It seems that you are using header -> streamLength to find the length of a single entry. However, this information is not present in the protocol I am parsing (git server pack files).

Have I understood your code correctly?
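For readers unfamiliar with the format, here is a minimal sketch of decoding the header of a single pack entry. It assumes the variable-length layout described in git's pack format documentation (a 3-bit object type plus a size spread over continuation bytes); the names `parseEntryHeader` and `sizeTail` are illustrative only. The decoded size is the size of the object after inflation, so it does not say where the compressed data ends.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import qualified Data.ByteString as B
import Data.Word (Word8)

-- | Decode the header of one pack entry.
-- Returns (object type, inflated size, bytes following the header),
-- or Nothing if the input ends in the middle of the header.
parseEntryHeader :: B.ByteString -> Maybe (Word8, Integer, B.ByteString)
parseEntryHeader bs = do
  (b0, rest) <- B.uncons bs
  let objType = (b0 `shiftR` 4) .&. 0x07      -- bits 6..4 hold the type
      lowBits = fromIntegral (b0 .&. 0x0f)    -- bits 3..0 hold the low size bits
  (size, rest') <- if testBit b0 7            -- bit 7 set: more size bytes follow
                     then sizeTail lowBits 4 rest
                     else Just (lowBits, rest)
  return (objType, size, rest')

-- | Read continuation bytes: 7 payload bits each, least significant first.
sizeTail :: Integer -> Int -> B.ByteString -> Maybe (Integer, B.ByteString)
sizeTail acc shift bs = do
  (b, rest) <- B.uncons bs
  let acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` shift)
  if testBit b 7
    then sizeTail acc' (shift + 7) rest
    else Just (acc', rest)
```

For example, the two header bytes 0x95 0x0a decode to object type 1 with an inflated size of 165 bytes.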

It was purely for demonstration. I did update the code with a few more
comments, but the enumerator package may not be the easiest thing to grok.
You might try putting up your current code and someone might be able to
recommend a better or easier approach.

If the git pack headers have lengths in them, you could do something as
simple as calling hSeek to move a file handle to the next header and start
your decoding over again.
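
A minimal sketch of the hSeek idea, under a loud assumption: the framing used here (a 4-byte big-endian payload length followed by the payload) is hypothetical and is not the git pack format. The names `headerSize`, `readPayloadLength` and `entryOffsets` are made up for illustration.

```haskell
import qualified Data.ByteString as B
import System.IO

-- Hypothetical framing: a 4-byte big-endian payload length, then the payload.
headerSize :: Int
headerSize = 4

-- | Read the payload length stored in the header at the current position.
-- Returns Nothing at end of file or on a truncated header.
readPayloadLength :: Handle -> IO (Maybe Integer)
readPayloadLength h = do
  hdr <- B.hGet h headerSize
  return $ if B.length hdr == headerSize
    then Just (B.foldl' (\acc b -> acc * 256 + fromIntegral b) 0 hdr)
    else Nothing

-- | Collect the file offset of every entry header, skipping each payload
-- with hSeek instead of reading (or decompressing) it.
entryOffsets :: Handle -> IO [Integer]
entryOffsets h = go 0
  where
    go off = do
      hSeek h AbsoluteSeek off
      mlen <- readPayloadLength h
      case mlen of
        Nothing  -> return []
        Just len -> (off :) <$> go (off + fromIntegral headerSize + len)
```

It could be called as `withBinaryFile path ReadMode entryOffsets`. This only works when the stored length is the on-disk (compressed) length of the payload.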

Hi Nathan,

| It was purely just for demonstration. I did update the code with a few more
| comments, but the enumerator package may not be the easiest thing to grok.
| You might try putting up your current code and someone might be able to
| recommend a better or easier approach.

Thank you very much again. I am working on extracting a leaner version of my parser that I can post to demonstrate the problem.

| If the git pack headers have lengths in them, you could do something as
| simple as calling hSeek to move a file handle to the next header and start
| your decoding over again.

That is the unfortunate part: git pack headers have the inflated length rather than the length of the compressed entry, so that length is unusable for finding the start of the next entry.
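
When no compressed length is stored, one possible approach is to let the decompressor itself report where the stream ended. The sketch below assumes a release of the zlib package that exposes the incremental interface in Codec.Compression.Zlib.Internal (0.6 or later, as far as I know), where the end-of-stream case carries the unconsumed input. The name `inflateOne` is illustrative, and entry headers still have to be parsed separately before each call.

```haskell
import qualified Codec.Compression.Zlib.Internal as Z
import qualified Data.ByteString.Lazy as L

-- | Inflate exactly one zlib stream from the front of the input.
-- Returns the inflated data together with the input bytes the decompressor
-- did not consume, i.e. everything from the end of this stream onward.
inflateOne :: L.ByteString
           -> Either Z.DecompressError (L.ByteString, L.ByteString)
inflateOne input =
  Z.foldDecompressStreamWithInput
    -- An output chunk is available: prepend it to the rest of the output.
    (\chunk rest ->
        fmap (\(out, left) -> (L.append (L.fromStrict chunk) out, left)) rest)
    -- End of the zlib stream: the argument is the unconsumed input.
    (\leftover -> Right (L.empty, leftover))
    -- Decompression failed.
    Left
    (Z.decompressST Z.zlibFormat Z.defaultDecompressParams)
    input
```

The second component of the result is where parsing of the next entry header would resume; the number of compressed bytes consumed is the difference between the input length and the leftover length.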
participants (3): Henning Thielemann, Nathan Howell, rahul