Re: [Haskell-cafe] RFC: demanding lazy instances of Data.Binary

20 Nov 2007

      On Mon, 2007-11-19 at 20:06 -0600, Nicolas Frisby wrote:
...
In light of this discussion, I think the "fully spine-strict list
instance does more good than bad" argument is starting to sound like a
premature optimization. Consequently, using a newtype to treat the
necessarily lazy instances as special cases is an inappropriate
bandaid.

I agree.

...
My current opinion: If Data.Binary makes both a fully strict list
instance (not []) and a fully lazy list instance (this would be the
default for []) available, then that will also make available all of
the other intermediate strictness. I'll elaborate that a bit. If the
user defines a function appSpecificSplit :: MyDataType -> [StrictList
a], then the user can control the compactness and laziness of the
serialisation by tuning that splitting function. Neil's 255 schema
fits as one particular case, the split255 :: [a] -> [StrictList a]
function. I would hesitate to hard code a number of elements, since it
certainly depends on the application and only exposing it as a
parameter maximizes the reusability of the code.
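
A minimal sketch of the splitting idea above, not taken from the original message. The names splitEvery, putChunked and getChunked are hypothetical, and the wire format (a one-byte length before each chunk, a zero byte as terminator) is an assumption chosen so that the 255 case falls out when the chunk size is 255. An ordinary list stands in for StrictList; calling length on each chunk forces that chunk's spine before it is written.

```haskell
import Control.Monad (replicateM)
import Data.Binary (Binary, get, put)
import Data.Binary.Get (Get, getWord8, runGet)
import Data.Binary.Put (Put, putWord8, runPut)

-- | Split a list into chunks of at most n elements. The size is clamped
-- to 1..255 so that every chunk length fits in a single byte.
splitEvery :: Int -> [a] -> [[a]]
splitEvery n = go
  where
    k = max 1 (min 255 n)
    go [] = []
    go xs = let (h, t) = splitAt k xs in h : go t

-- | Write each chunk as a one-byte length followed by its elements,
-- then a zero byte to mark the end of the list.
putChunked :: Binary a => Int -> [a] -> Put
putChunked n xs = do
  mapM_ putChunk (splitEvery n xs)
  putWord8 0
  where
    putChunk c = do
      putWord8 (fromIntegral (length c)) -- forces this chunk's spine only
      mapM_ put c

-- | Read chunks until the zero-length terminator is seen.
getChunked :: Binary a => Get [a]
getChunked = do
  len <- getWord8
  if len == 0
    then return []
    else do
      c    <- replicateM (fromIntegral len) get
      rest <- getChunked
      return (c ++ rest)
```

For example, runGet getChunked (runPut (putChunked 255 xs)) should give back xs. A smaller chunk size trades compactness (more length bytes) for less of the spine being forced at once.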

Fully lazy is the wrong default here, I think. But fully strict is also
not right. What would fit best with the style of the rest of the
Data.Binary library is to be lazy in a lumpy way. This can give
excellent performance, whereas being fully lazy cannot (because the
chunks become far too small, which increases the per-chunk overhead).

Has anyone actually said they want the list serialisation to be fully
lazy? Is there a need for anything more than just not being fully
strict? If there is, I don't see it. If it really is needed it can be
added just by flushing after serialising each element.
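
A rough sketch of what "flushing after each element" could look like, assuming a hypothetical format in which every element is preceded by a tag byte (1 for "another element follows", 0 for "end of list") so that no length prefix is needed. putListFlushing and getListTagged are made-up names; flush is the existing action from Data.Binary.Put that ends the current output chunk.

```haskell
import Data.Binary (Binary, get, put)
import Data.Binary.Get (Get, getWord8, runGet)
import Data.Binary.Put (Put, flush, putWord8, runPut)

-- | Write a list one element at a time, flushing after each element so
-- that the bytes written so far are handed over as a chunk of the output.
putListFlushing :: Binary a => [a] -> Put
putListFlushing []       = putWord8 0          -- end-of-list tag
putListFlushing (x : xs) = do
  putWord8 1                                   -- "one more element" tag
  put x
  flush
  putListFlushing xs

-- | Matching reader for the tagged format. It is only here to show the
-- round trip and makes no attempt to decode lazily.
getListTagged :: Binary a => Get [a]
getListTagged = do
  tag <- getWord8
  case tag of
    0 -> return []
    _ -> do
      x  <- get
      xs <- getListTagged
      return (x : xs)
```

This produces one small chunk per element, which is exactly the overhead mentioned above, so it is probably only worth it when each element is expensive to produce.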
...
"Reaching for the sky" idea: Does the Put "monad" offer enough
information for an instance to be able to recognize when it has filled
a lazy bytestring's first chunk? It could cater its strictness (i.e.
vary how much of the spine is forced before any output is generated)
in order to best line up with the chunks of lazy bytestring it is
producing. This might be trying to fit too much into the interface.
And it might even make Put an actual monad ;)

That is something I've considered. Serialise just as much of the list as
is necessary to fill the remainder of a chunk. Actually we'd always fill
just slightly more than a chunk because we don't know how big each list
element will be, we only know when we've gone over.
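
The "we only know when we've gone over" point can be modelled in pure code, independently of how Put is actually implemented. In this sketch fillChunk is a hypothetical helper, and the size function is an assumption standing in for "how many bytes this element turned out to occupy once written"; the real library would learn that only after serialising the element.

```haskell
-- | Take elements for as long as there is space left in the current chunk.
-- The space check happens before each element, so the element that crosses
-- the boundary is still taken: the result can overshoot the remaining
-- space by at most one element's size.
fillChunk :: Int          -- ^ bytes remaining in the current chunk
          -> (a -> Int)   -- ^ serialised size of one element (assumed)
          -> [a]
          -> ([a], [a])   -- ^ (elements for this chunk, elements left over)
fillChunk remaining size = go remaining
  where
    go _ [] = ([], [])
    go r (x : xs)
      | r <= 0    = ([], x : xs)
      | otherwise =
          let (taken, rest) = go (r - size x) xs
          in  (x : taken, rest)
```

With 10 bytes remaining and four 4-byte elements, fillChunk 10 length ["abcd", "efgh", "ijkl", "mnop"] takes the first three (12 bytes, slightly over the chunk) and leaves the fourth for the next chunk.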

Duncan

Duncan Coutts