
2010/11/9 C K Kashyap
Thanks Stephen,
On Tue, Nov 9, 2010 at 2:53 PM, Stephen Tetley wrote:
I'd use a parser combinator library that has word8, word16 and word32 combinators. The latter two should really have big- and little-endian versions: word16be, word16le, word32be, word32le.
Data.Binary should provide this, and Attoparsec too, I think. Usually I roll my own, but only because I had my own libraries before these two existed.
The idiom of a tag byte telling you what comes next is very common in binary formats. It means parsers can avoid backtracking altogether.
I'll take a look at attoparsec.
I was also trying to understand how I could do it myself.
Basically I've been using the Get monad to get Word8, Word16 etc. values out of a ByteString, but I don't want to write a separate parsing routine for each command.
So instead of doing something like this -
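import Data.Binary.Get (runGet, getWord8, getWord16be)

-- runGet takes the decoder first and the input second, hence the flip.
parseCommand1 byteStream = flip runGet byteStream $ do
    b1 <- getWord8
    b2 <- getWord16be
    return (b1, b2)

parseCommand2 byteStream = flip runGet byteStream $ do
    b1 <- getWord16be
    b2 <- getWord16be
    return (b1, b2)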
I'd like to do this
parse byteStream command = runGet $ do map (commandFormat command)
or something like this; I'm not exactly sure about it.
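For concreteness, here is one way the format-driven idea could be made to type-check. It is only a sketch: FieldType, Field, Command, commandFormat and getField are hypothetical names invented for this illustration, not part of Data.Binary, and the two layouts simply mirror parseCommand1 and parseCommand2.

```haskell
import Data.Binary.Get (Get, runGet, getWord8, getWord16be)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8, Word16)

-- Hypothetical field descriptors (an assumption, not a Data.Binary type).
data FieldType = W8 | W16be
  deriving (Show, Eq)

-- Fields of different widths must share one type to live in a single list.
data Field = F8 Word8 | F16 Word16
  deriving (Show, Eq)

-- Hypothetical command identifiers.
data Command = Command1 | Command2
  deriving (Show, Eq)

-- Layout of each command, mirroring parseCommand1 and parseCommand2.
commandFormat :: Command -> [FieldType]
commandFormat Command1 = [W8, W16be]
commandFormat Command2 = [W16be, W16be]

-- Turn one field descriptor into the matching Get action.
getField :: FieldType -> Get Field
getField W8    = F8  <$> getWord8
getField W16be = F16 <$> getWord16be

-- mapM, rather than map, runs the Get actions in order and collects the results.
-- Note that runGet calls error if the input is too short.
parse :: BL.ByteString -> Command -> [Field]
parse byteStream command =
  runGet (mapM getField (commandFormat command)) byteStream
```

With these definitions, parse (BL.pack [1, 0, 2]) Command1 equals [F8 1, F16 2]. The result is a uniform list of Field values, so the caller still has to pattern match on it to recover what each position means.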
Hi,

This doesn't seem a good idea to me. In the first case, when you have parsed your data, you end up with very specific data structures that can be processed later as-is. In the second case, you end up with a list for every kind of data, so you're bound to "parse" that list again to know what you're dealing with.

In the first case, parsing wrong data is the only way to fail and you produce solid data you can work with. In the second case, you have a very weak representation that will need more work afterward, and that work is very similar to the parsing you do in the first place.

Cheers,
Thu
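The "specific data structures" approach, combined with the tag-byte idiom mentioned earlier in the thread, might look roughly like the sketch below. The Command type, its constructors and the tag values 1 and 2 are assumptions made for illustration; nothing in the thread specifies the actual wire format.

```haskell
import Data.Binary.Get (Get, runGet, getWord8, getWord16be)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8, Word16)

-- One constructor per command, each carrying exactly the fields it needs.
-- (Hypothetical commands, matching the two layouts discussed above.)
data Command
  = Command1 Word8 Word16
  | Command2 Word16 Word16
  deriving (Show, Eq)

-- Read a tag byte, then dispatch to the decoder for that command.
-- The tag values 1 and 2 are assumed for this example.
getCommand :: Get Command
getCommand = do
  tag <- getWord8
  case tag of
    1 -> Command1 <$> getWord8 <*> getWord16be
    2 -> Command2 <$> getWord16be <*> getWord16be
    _ -> fail ("unknown command tag: " ++ show tag)

-- runGet calls error on a failed or truncated parse.
parseCommand :: BL.ByteString -> Command
parseCommand = runGet getCommand
```

Here parseCommand (BL.pack [1, 7, 0, 2]) equals Command1 7 2. Because the tag decides which branch runs, no backtracking is needed, and the result is already a typed value that later code can pattern match on directly.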