
On Nov 19, 2007 4:16 PM, Duncan Coutts
On Mon, 2007-11-19 at 13:39 -0800, Don Stewart wrote:
nicolas.frisby:
*snip*
1) The fact that serialisation is fully strict for 32760 bytes but not for 32761 makes the direct application of strictCheck intractable. Do you have any ideas how to circumvent that?
Test using a much smaller chunk size. I'd test sizes from 1 to something like one more than the machine word size.
Let me clarify "circumvent that." strictCheck uses a bounded search starting at 1 and proceeds to some limit. The Binary instance used in the example was the fully lazy one for lists: a get/putWord8 for each constructor. Even so, it was effectively spine-strict up to 32k bytes (which was 32k elements b/c of the use of unit) because (I think that) the first chunk of the lazy bytestring wasn't being generated by encode until it was full. If you asked strictCheck to go from 1 to 32k, I don't think it would finish. So by circumvent, I mean: How can we apply the essential ideas of strictCheck when our chunks are so big? Obviously, the iterative search cannot just proceed by one element at a time; but then we lose the obvious meaning of "add one more _|_. I don't see an obvious candidate for how to alter the _|_-ridden test vector generation. Moreover, it's "proposed output" is wrong when considered from the Big Chunks perspective--we don't necessarily want Chitil's least strictness. (more new text below)
2) Also, any suggestions for other datatypes to provide default instances for? Tree type structures immediately raise the question of which traversal should be the default. I'm learning towards providing none since the goal of constant space usage actually depends on the serialisation order matching how the deserialised tree will be traversed.
Lazy Arrays?
3) I don't think it is applicable in anyway whatsoever to strict types like Int, Data.Set, and Data.Sequence? Counter-arguments?
Well, atomic types like Int, I don't think it makes sense, but Sets and Sequence are lazy, aren't they?
Sequences are like spine strict lists. Sets are strict in as much as the element type's (==) function is strict.
Let me refine how I posed that question. A predicate: if you enter "Package.fromList [1..]" at the ghci prompt and you get no intermediate results, then that was a "strict type." I'm assuming that if the Show instance doesn't produce intermediate results, then the serialisation technique can't handle intermediate results (i.e. chunking) either--at least not in a general enough way to include it in a library. *snip*