
On Wed, 2007-02-28 at 09:51 -0800, Bryan O'Sullivan wrote:
> Bulat Ziganshin wrote:
> > can you please provide examples of such code?
>
> I'd recommend taking a look at the new binary package. It's very cleanly written, and mostly easy to understand. It's also easy to see where the optimisations have been made. The only part that might induce sudden cranial expansion is the Builder monoid, which constructs a big delayed computation to efficiently fill out a lazy ByteString when the Put monad is run.
>
> The optimisations haven't perverted the readability of the code much, but it's still quite fast. I've clocked end-to-end serialisation and deserialisation over an InfiniBand network at 234 MB/sec (~25% of line rate). This consumed about 90% of one CPU on the sending side, while the receiving side was 100% pegged.
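(As a rough illustration of the Builder monoid Bryan describes: the sketch below assumes the Data.Binary.Builder interface from the binary package and frames a payload behind a length header; `frame` is a hypothetical helper, not something the package itself exports.)

    import Data.Binary.Builder (toLazyByteString, putWord32be, singleton)
    import Data.Monoid (mconcat)
    import Data.Word (Word8, Word32)
    import qualified Data.ByteString.Lazy as L

    -- Frame a payload as a 32-bit big-endian length followed by its bytes.
    -- Each field yields a small Builder; mconcat glues them together with
    -- the monoid's append, and nothing is actually written until
    -- toLazyByteString forces the delayed computation into the chunks of a
    -- lazy ByteString.
    frame :: Word32 -> [Word8] -> L.ByteString
    frame len payload =
        toLazyByteString (mconcat (putWord32be len : map singleton payload))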
Yes, we've optimised the writing side much more than the reading side at the moment. I'm sure it's possible to get the reading up to at least the same speed. I think we ought to be able to push both further after that, because we're still not doing the bounds-check commoning-up transformation that we'd planned on (more GHC rules). So given enough hacking time there's more performance available.

In our benchmarks, reading is currently about 40% of writing speed, and we're topping out at about 30-40% of maximum memory bandwidth for writes. Apparently that's only 200x faster than the faster of two common Python serialisation libs, so we've got some way to go yet.

Duncan
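(For the reading side Duncan mentions, a corresponding sketch using the Get monad, again assuming the Data.Binary.Get interface; `readFrame` is a hypothetical counterpart to the writer above, not part of the package.)

    import Data.Binary.Get (Get, runGet, getWord32be, getByteString)
    import Data.Word (Word32)
    import qualified Data.ByteString as S
    import qualified Data.ByteString.Lazy as L

    -- Read the 32-bit big-endian length header, then that many payload
    -- bytes.  Each getWord32be/getByteString does its own bounds check at
    -- the moment; commoning adjacent checks up into one is the planned
    -- GHC-rules transformation mentioned above.
    readFrame :: L.ByteString -> (Word32, S.ByteString)
    readFrame = runGet $ do
        len     <- getWord32be
        payload <- getByteString (fromIntegral len)
        return (len, payload)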