
Hi all,
Recently I have been considering doing part of my job in Haskell. My task is to write a network server which talks to another server through a private binary protocol. As the old version of this component is written in C, the protocol is naturally based on C structure definitions, which are, unfortunately, very complicated. Worse, every field in every structure must be converted to network byte order.
As I am a newbie to Haskell, I am not sure how to handle this problem with the least work. Do you have any ideas about this problem? Thanks in advance!
--
There is No CODE That is More Flexible Than NO Code!

On 8/18/07, Peter Cai wrote:
> As the old version of this component is written in C, it's very natural that this protocol is based on C structure definitions, which are, unfortunately, very complicated. And worse, every field in every structure must be converted to network byte order.
You could certainly try Data.Binary[1] for this. It has a nice Get monad with functions such as getWord32be, which sounds like it might be what you want. One caveat is that it's fully lazy: you get the result immediately, and parse errors can only be caught as exceptions when you actually come to use the result. This is perfect for very large messages, but might be slightly wrong for you.
[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/binary-0.3
--
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org 650-283-9641
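A minimal sketch of the Get-based approach described above. The header layout, the type `Header`, and the function names are hypothetical (the real protocol's structures are not shown in this thread). The second parser uses `runGetOrFail`, which assumes a later release of binary than the 0.3 linked above. Any padding that a C compiler inserts between struct fields would have to be skipped explicitly; this layout has none.

```haskell
import Data.Binary.Get (Get, getWord16be, getWord32be, runGet, runGetOrFail)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

-- Hypothetical header; the C side is assumed to look like
--   struct header { uint16_t msg_type; uint16_t flags; uint32_t length; };
data Header = Header
  { msgType :: Word16
  , flags   :: Word16
  , bodyLen :: Word32
  } deriving (Eq, Show)

-- Read the fields in declaration order, each in network (big-endian) order.
getHeader :: Get Header
getHeader = do
  t <- getWord16be
  f <- getWord16be
  n <- getWord32be
  return (Header t f n)

-- Lazy interface: a short or malformed input surfaces as an exception
-- when the result is forced.
parseHeader :: BL.ByteString -> Header
parseHeader = runGet getHeader

-- Alternative (assumes a later binary release): report a parse
-- failure as a value instead of an exception.
parseHeaderSafe :: BL.ByteString -> Either String Header
parseHeaderSafe bs =
  case runGetOrFail getHeader bs of
    Left (_, _, err) -> Left err
    Right (_, _, h)  -> Right h
```

For the eight bytes 00 01 00 02 00 00 01 00, `parseHeader` yields `Header 1 2 256`.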

Adam Langley mused:
> On 8/18/07, Peter Cai wrote:
>> As the old version of this component is written in C, it's very natural that this protocol is based on C structure definitions, which are, unfortunately, very complicated. And worse, every field in every structure must be converted to network byte order.
> You could certainly try Data.Binary[1] for this. It has a nice Get monad with functions such as getWord32be, which sounds like it might be what you want. One caveat is that it's fully lazy: you get the result immediately, and parse errors can only be caught as exceptions when you actually come to use the result. This is perfect for very large messages, but might be slightly wrong for you.
Also, one thing to watch out for is that the existing Get and Put instances may not do anything like what you expect. For example, for some reason I expected that the instances of Get and Put for Float and Double would send Floats and Doubles across the wire in the IEEE floating-point format. How wrong I was... But in general, yes, Data.Binary is pretty useful for doing network protocols.
Matthew

On 8/18/07, Matthew Sackman wrote:
> Also, one thing to watch out for is that the existing Get and Put instances may not do anything like what you expect. For example, for some reason I expected that the instances of Get and Put for Float and Double would send Floats and Doubles across the wire in the IEEE floating-point format. How wrong I was...
Ah, those aren't instances of Get and Put, but of Binary[1]. You use the Binary instances via the functions 'get' and 'put' (case is important).
Get and Put provide actions like "putWord32be", for which the resulting bits are pretty much universally accepted. Binary has default instances which use Get and Put to serialise Haskell types like [Int], or (Float, Float). Here the resulting bits aren't documented, but you can read the code, and I have some C code for dealing with them somewhere if anyone is interested. The serialisation of Float is, indeed, nothing like IEEE in either endianness.
(* and, although Get isn't currently a class, I have sent patches to dons to make it so, with a default instance which matches current behaviour and speed, and an alternative which returns a Maybe, removing a little bit of laziness in cases where you want to handle parse failures in pure code. Hopefully something will happen with this at the next sprint ;) )
[1] http://www.cse.unsw.edu.au/~dons/binary/Data-Binary.html#1
--
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org 650-283-9641
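To illustrate the distinction drawn above between the Binary class and the Get/Put primitives, here is a small sketch of a hand-written Binary instance whose wire layout is spelled out field by field. The record `Login` and its fields are hypothetical and serve only as an example; nothing here is taken from the protocol discussed in the thread.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord16be, getWord32be)
import Data.Binary.Put (putWord16be, putWord32be)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

-- Hypothetical record, for illustration only.
data Login = Login
  { userId  :: Word32
  , version :: Word16
  } deriving (Eq, Show)

-- The layout is fixed explicitly with big-endian primitives rather
-- than left to the default instances of the field types.
instance Binary Login where
  put (Login u v) = putWord32be u >> putWord16be v
  get = do
    u <- getWord32be
    v <- getWord16be
    return (Login u v)

-- 'encode' and 'decode' go through the instance above.
toWire :: Login -> BL.ByteString
toWire = encode

fromWire :: BL.ByteString -> Login
fromWire = decode
```

With this instance, `toWire (Login 1 2)` is the six bytes 00 00 00 01 00 02, independent of the host's byte order.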

Recently, Adam Langley responded so:
> On 8/18/07, Matthew Sackman wrote:
>> Also, one thing to watch out for is that the existing Get and Put instances may not do anything like what you expect. For example, for some reason I expected that the instances of Get and Put for Float and Double would send Floats and Doubles across the wire in the IEEE floating-point format. How wrong I was...
> Ah, those aren't instances of Get and Put, but of Binary[1]. You use the Binary instances via the functions 'get' and 'put' (case is important).
Gah, that'll teach me to post from memory without checking the code. Indeed, that is what I meant: the instances of Binary.
> Get and Put provide actions like "putWord32be", for which the resulting bits are pretty much universally accepted. Binary has default instances which use Get and Put to serialise Haskell types like [Int], or (Float, Float). Here the resulting bits aren't documented, but you can read the code, and I have some C code for dealing with them somewhere if anyone is interested. The serialisation of Float is, indeed, nothing like IEEE in either endianness.
Quite. Whilst we're on the subject (and I realise I might be hijacking this thread a little), it does seem rather odd that it's very easy to take a Word8/16/32/64 and interpret it as an integer. Similarly, it's very easy to take an integer and convert it to a Word of some sort. But it's vastly harder to do that for floats / non-integers. Now I know that the number classes in the Prelude are basically broken anyway and all really need rewriting, but it does seem completely arbitrary that Words somehow are only allowed to contain whole numbers!
Matthew

On 8/19/07, Matthew Sackman wrote:
> But it's vastly harder to do that for floats / non-integers. Now I know that the number classes in the Prelude are basically broken anyway and all really need rewriting, but it does seem completely arbitrary that Words somehow are only allowed to contain whole numbers!
Well, see the attached patch to Data.Binary to add putFloat[32|64][be|le]. I got bored, so adding the Get functions is an exercise for the reader :) (And also because I think it needs unsafeSomethingIO and I'm a little unsure about that.) If these functions would be useful for you, you should bug the binary team to add something similar.
AGL
--
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org 650-283-9641
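The attached patch is not reproduced in this archive. As a rough sketch of what IEEE 754 big-endian put and get functions can look like, the following reuses the names from the message above but is not the patch itself. It assumes a GHC whose `GHC.Float` module exports the `castFloatToWord32` family of bit-cast functions, which postdates this thread; the helper names `exampleBytes` and `readFloat32be` are hypothetical.

```haskell
import Data.Binary.Get (Get, getWord32be, getWord64be, runGet)
import Data.Binary.Put (Put, putWord32be, putWord64be, runPut)
import qualified Data.ByteString.Lazy as BL
import GHC.Float
  ( castDoubleToWord64, castFloatToWord32
  , castWord32ToFloat, castWord64ToDouble )

-- Reinterpret the IEEE 754 bit pattern as a word, then write that
-- word in big-endian order.
putFloat32be :: Float -> Put
putFloat32be = putWord32be . castFloatToWord32

putFloat64be :: Double -> Put
putFloat64be = putWord64be . castDoubleToWord64

-- The inverse: read a big-endian word and reinterpret its bits.
getFloat32be :: Get Float
getFloat32be = fmap castWord32ToFloat getWord32be

getFloat64be :: Get Double
getFloat64be = fmap castWord64ToDouble getWord64be

-- 1.0 :: Float has the bit pattern 0x3F800000, so this is the
-- four bytes 3f 80 00 00.
exampleBytes :: BL.ByteString
exampleBytes = runPut (putFloat32be 1.0)

-- Decode a single big-endian Float from the front of the input.
readFloat32be :: BL.ByteString -> Float
readFloat32be = runGet getFloat32be
```

The little-endian variants follow the same pattern with `putWord32le`, `getWord32le`, and their 64-bit counterparts.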

On 2007-08-19, Matthew Sackman wrote:
> Quite. Whilst we're on the subject (and I realise I might be hijacking this thread a little), it does seem rather odd that it's very easy to take a Word8/16/32/64 and interpret it as an integer. Similarly, it's very easy to take an integer and convert it to a Word of some sort.
That's because there's basically only one way to interpret a given word as an integer, and store a given integer as a word.
> But it's vastly harder to do that for floats / non-integers. Now I know that the number classes in the Prelude are basically broken anyway and all really need rewriting, but it does seem completely arbitrary that Words somehow are only allowed to contain whole numbers!
It's more that for floats, there are a zillion plausible ways to store them, and many have been used.
--
Aaron Denney
-><-

> As I am a newbie to Haskell, I am not sure how to handle this problem with less work. Do you have any ideas about this problem? Thanks in advance!
Have a look at http://haskell.org/haskellwiki/Applications_and_libraries/Data_structures section 3 (IO) -> http://haskell.org/haskellwiki/Binary_IO
Of course you can also use most of the various parser libraries, because most are not tied to one token type. So you shouldn't have any trouble parsing a ByteString, which is a buffer of chars (8-bit words).
I'd recommend having a look at ParseP or happy/alex if the binary libraries aren't suited to your task.
But to get the fastest (or otherwise best) solution you should wait for other replies, as I haven't yet used all of those to parse binary data.
Sincerely,
Marc Weber

marco-oweber:
>> As I am a newbie to Haskell, I am not sure how to handle this problem with less work. Do you have any ideas about this problem? Thanks in advance!
> Have a look at http://haskell.org/haskellwiki/Applications_and_libraries/Data_structures section 3 (IO) -> http://haskell.org/haskellwiki/Binary_IO
> Of course you can also use most of the various parser libraries, because most are not tied to one token type. So you shouldn't have any trouble parsing a ByteString, which is a buffer of chars (8-bit words).
> I'd recommend having a look at ParseP or happy/alex if the binary libraries aren't suited to your task.
> But to get the fastest (or otherwise best) solution you should wait for other replies, as I haven't yet used all of those to parse binary data.
I'd also recommend Data.Binary for this -- parsing C-friendly protocols off the wire, at high speed, is the main purpose of Data.Binary.
-- Don

On Sun, 19 Aug 2007, Peter Cai wrote:
> My duty is writing a network server which talks to another server through a binary based private protocol.
Haskell needs something like Erlang's bit syntax.
http://erlang.org/doc/reference_manual/expressions.html#6.16
http://erlang.org/doc/programming_examples/bit_syntax.html#4
The IP header example in the latter is a brilliant real-world example.
The bit syntax has recently been upgraded to support arbitrary bit streams.
See http://www.it.uu.se/research/group/hipe/papers/padl07.pdf
Tony.
--
f.a.n.finch

dot:
> On Sun, 19 Aug 2007, Peter Cai wrote:
>> My duty is writing a network server which talks to another server through a binary based private protocol.
> Haskell needs something like Erlang's bit syntax.
> http://erlang.org/doc/reference_manual/expressions.html#6.16
> http://erlang.org/doc/programming_examples/bit_syntax.html#4
> The IP header example in the latter is a brilliant real-world example.
> It has recently been upgraded to support arbitrary bit streams. See http://www.it.uu.se/research/group/hipe/papers/padl07.pdf
Yes, we've looked at this in the context of Data.Binary. Rather than extending the core syntax, one option is to use Template Haskell: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/BitSyntax-0.3
Another is to just use monads and pattern guards, which give quite reasonable syntax.
-- Don
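As a rough illustration of the "monads and pattern guards" style mentioned above, the sketch below reads the first four bytes of an IPv4 header in the Get monad and uses a pattern guard to split and validate the version and header-length nibbles. The type `IPv4Start` and the function names are hypothetical, and the use of `fail` inside Get assumes a reasonably recent release of binary.

```haskell
import Data.Binary.Get (Get, getWord16be, getWord8, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word8)

-- Start of an IPv4 header: header length (in 32-bit words),
-- type of service, and total length.
data IPv4Start = IPv4Start
  { ihl      :: Word8
  , tos      :: Word8
  , totalLen :: Word16
  } deriving (Eq, Show)

getIPv4Start :: Get IPv4Start
getIPv4Start = getWord8 >>= start
  where
    -- Pattern guard: split the first byte into its high and low
    -- nibbles, requiring version 4 and a header length of at least 5.
    start b
      | (4, len) <- b `divMod` 16, len >= 5 = do
          t <- getWord8
          n <- getWord16be
          return (IPv4Start len t n)
      | otherwise = fail "not a valid IPv4 header"

-- Run the parser on a lazy ByteString.
parseIPv4Start :: BL.ByteString -> IPv4Start
parseIPv4Start = runGet getIPv4Start
```

For the bytes 45 00 00 54, `parseIPv4Start` yields `IPv4Start 5 0 84`.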

* Tony Finch wrote:
> http://erlang.org/doc/programming_examples/bit_syntax.html#4 The IP header example in the latter is a brilliant real-world example.
Unfortunately, this example does not handle bit and byte order. Take a look at Ada's representation clauses for such topics.

On Wed, 22 Aug 2007, Lutz Donnerhacke wrote:
> * Tony Finch wrote:
>> http://erlang.org/doc/programming_examples/bit_syntax.html#4 The IP header example in the latter is a brilliant real-world example.
> Unfortunately, this example does not handle bit and byte order. Take a look at Ada's representation clauses for such topics.
Erlang has support for byte endianness but not (it seems) bit endianness.
I'm currently kicking up a fuss about this on the erlang-questions list,
since while Erlang's bitwise big-endian layout works OK for network
protocols, it fails for typical little-endian C structures with bit
fields.
Thanks for the pointer to Ada.
Tony.
--
f.a.n.finch
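To make the bit-order point above concrete, here is a small sketch of the same byte split into a 3-bit field followed by a 5-bit field under the two conventions. The layout and function names are hypothetical. The claim that little-endian C compilers allocate bit fields from the least significant bit is an assumption about typical ABIs, since the C standard leaves bit-field allocation order implementation-defined.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- Hypothetical layout: a 3-bit field followed by a 5-bit field,
-- packed into one byte.

-- MSB-first (network style): the first field is the top three bits.
splitMsbFirst :: Word8 -> (Word8, Word8)
splitMsbFirst w = (w `shiftR` 5, w .&. 0x1f)

-- LSB-first (assumed typical for little-endian C bit fields):
-- the first field is the low three bits.
splitLsbFirst :: Word8 -> (Word8, Word8)
splitLsbFirst w = (w .&. 0x07, w `shiftR` 3)
```

For the byte 0xA3 (binary 10100011), `splitMsbFirst` gives `(5, 3)` while `splitLsbFirst` gives `(3, 20)`, so a parser has to know which convention the sender's compiler used.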