A question about laziness and performance in document serialization.

I am not entirely clear on how to optimize lazy bytestrings for performance. Currently I have a (lazy) Map that contains large BSON values (each more than 1 MB when serialized). I can serialize BSON documents to lazy ByteStrings using Data.Binary.runPut. I then write this bytestring to a socket using Network.Socket.ByteString.Lazy. My question is this: if the Map doesn't change (no updates) and I serialize the same document to the socket twice in a row, is the whole BSON value re-evaluated and converted to a bytestring each time? Let's say I wanted a cache of bytestrings, so I have another Map that holds the serialized bytestrings, which I repopulate every time the original BSON Map changes. Should the map be strict or lazy? Should the bytestrings it stores be strict or lazy? Any help in understanding laziness would be appreciated. -- Kyle Hanson

* Kyle Hanson
I am not entirely clear on how to optimize lazy bytestrings for performance.
Currently I have a (lazy) Map that contains large BSON values (each more than 1 MB when serialized). I can serialize BSON documents to lazy ByteStrings using Data.Binary.runPut. I then write this bytestring to a socket using Network.Socket.ByteString.Lazy.
My question is this: if the Map doesn't change (no updates) and I serialize the same document to the socket twice in a row, is the whole BSON value re-evaluated and converted to a bytestring each time?
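Yes. Each call to runPut builds a new bytestring from scratch; nothing is cached unless you keep a reference to the result and reuse it.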
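To make the difference concrete, here is a minimal sketch. The `Doc` type and `putDoc` are hypothetical stand-ins for a BSON document and its serializer, and `send` stands for whatever writes to the socket; none of these names come from the original message.

```haskell
import Data.Binary.Put (Put, putWord32le, runPut)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word32)

-- Hypothetical stand-in for a BSON document.
newtype Doc = Doc [Word32]

-- Hypothetical stand-in for the BSON serializer.
putDoc :: Doc -> Put
putDoc (Doc ws) = mapM_ putWord32le ws

-- Serializes on every call: two sends mean two serialization passes.
sendTwiceRecomputing :: (BL.ByteString -> IO ()) -> Doc -> IO ()
sendTwiceRecomputing send doc = do
  send (runPut (putDoc doc))
  send (runPut (putDoc doc))

-- Serializes once: both sends walk the same chunks, which are built
-- lazily during the first send and then stay in memory for the second.
sendTwiceShared :: (BL.ByteString -> IO ()) -> Doc -> IO ()
sendTwiceShared send doc = do
  let bytes = runPut (putDoc doc)
  send bytes
  send bytes
```

With optimizations on, GHC may occasionally share the two identical expressions in the first version, but that is not something to rely on. The shared version trades memory for time: the whole serialized document stays alive until the second send is done.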
Let's say I wanted a cache of bytestrings, so I have another Map that holds the serialized bytestrings, which I repopulate every time the original BSON Map changes. Should the map be strict or lazy?
This is the wrong question. The right question is, do you want the values to be strict (evaluated) or lazy (kept unevaluated until required)?
If you want values to be lazy, then you have to use the lazy Map. If you want values to be strict, then you may either use the strict Map, or still use the lazy Map but make sure that the values are evaluated when you place them in the map. Using the strict Map is probably a better idea, but the lazy Map gives you finer control over what is lazy and what is forced (should you need it).
Note that a lazy bytestring is just a lazy list of strict bytestrings. Even placing it in the strict map wouldn't force its full evaluation: the strict map evaluates values only to weak head normal form, which for a lazy bytestring means at most the first chunk.
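The three options can be sketched roughly as follows. The function names are made up for illustration; `force` comes from the deepseq package, which is one way (not the only one) to evaluate every chunk of a lazy bytestring.

```haskell
import Control.DeepSeq (force)
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map.Lazy as ML
import qualified Data.Map.Strict as MS

-- Data.Map.Lazy and Data.Map.Strict share the same Map type;
-- only the strictness of the functions differs.
type Cache = ML.Map Int BL.ByteString

-- Lazy insert: the value is stored as an unevaluated thunk.
insertLazy :: Int -> BL.ByteString -> Cache -> Cache
insertLazy = ML.insert

-- Strict insert: the value is evaluated to weak head normal form,
-- which for a lazy bytestring is at most its first chunk.
insertWhnf :: Int -> BL.ByteString -> Cache -> Cache
insertWhnf = MS.insert

-- Lazy map API with the value forced by hand: `force` evaluates all
-- chunks, and `$!` makes that happen as soon as the resulting map
-- is demanded.
insertForced :: Int -> BL.ByteString -> Cache -> Cache
insertForced k v m = (ML.insert k $! force v) m
```

Only `insertForced` guarantees that the whole serialization has happened by the time the updated map is evaluated.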
Should the bytestrings it stores be strict or lazy?
For a cache, it makes sense to store strict bytestrings (unless they are so large that it may be hard to allocate that much contiguous space). Lazy bytestrings are useful for streaming, when you use a chunk and then discard it.
Using strict bytestrings doesn't imply that you want to store them evaluated. Depending on your circumstances, it may be a good idea to store strict bytestrings lazily, so that they do not take space and time until they are requested for the first time.
Simply operating with the words "lazy" and "strict" may be very confusing, since they refer to different things in different contexts. Every time you read that something is lazy or strict, try to decipher it in terms of the basic evaluation properties.
HTH,
Roman
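As an illustration of the two ways to cache strict bytestrings, here is a small sketch. `Doc`, `putDoc`, `serialize`, `eagerCache` and `lazyCache` are hypothetical names, with `Doc` standing in for a BSON document.

```haskell
import Data.Binary.Put (Put, putWord32le, runPut)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map.Lazy as ML
import qualified Data.Map.Strict as MS
import Data.Word (Word32)

-- Hypothetical stand-in for a BSON document and its serializer.
newtype Doc = Doc [Word32]

putDoc :: Doc -> Put
putDoc (Doc ws) = mapM_ putWord32le ws

-- Serialize into one contiguous strict bytestring. A strict
-- bytestring in weak head normal form is fully materialized.
serialize :: Doc -> BS.ByteString
serialize = BL.toStrict . runPut . putDoc

-- Strict bytestrings stored evaluated: every document is serialized
-- once the cache map itself is evaluated.
eagerCache :: ML.Map Int Doc -> MS.Map Int BS.ByteString
eagerCache = MS.map serialize

-- Strict bytestrings stored lazily: each entry is a thunk that is
-- serialized on first use and shared by later lookups.
lazyCache :: ML.Map Int Doc -> ML.Map Int BS.ByteString
lazyCache = ML.map serialize
```

The eager variant pays the full serialization cost up front on every rebuild. The lazy variant pays only for the documents that are actually requested, but each unevaluated entry keeps its source document alive until it is forced.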
Participants (2): Kyle Hanson, Roman Cheplyaka