
On Mon, 2007-11-19 at 20:22 -0600, Nicolas Frisby wrote:
On Nov 19, 2007 4:16 PM, Duncan Coutts
wrote: On Mon, 2007-11-19 at 13:39 -0800, Don Stewart wrote:
nicolas.frisby:
*snip*
1) The fact that serialisation is fully strict for 32760 bytes but not for 32761 makes the direct application of strictCheck intractable. Do you have any ideas how to circumvent that?
Test using a much smaller chunk size. I'd test sizes from 1 to something like one more than the machine word size.
Let me clarify "circumvent that." strictCheck uses a bounded search starting at 1 and proceeds to some limit. The Binary instance used in the example was the fully lazy one for lists: a get/putWord8 for each constructor. Even so, it was effectively spine-strict up to 32k bytes (which was 32k elements b/c of the use of unit) because (I think that) the first chunk of the lazy bytestring wasn't being generated by encode until it was full. If you asked strictCheck to go from 1 to 32k, I don't think it would finish. So by circumvent, I mean: How can we apply the essential ideas of strictCheck when our chunks are so big?
We don't. We test it with a variety of small chunk sizes. That is the sensible thing to do.
Obviously, the iterative search cannot just proceed by one element at a time; but then we lose the obvious meaning of "add one more _|_. I don't see an obvious candidate for how to alter the _|_-ridden test vector generation. Moreover, it's "proposed output" is wrong when considered from the Big Chunks perspective--we don't necessarily want Chitil's least strictness.
Indeed. As I've said, Data.Binary is lazy but in a chunky way where within each chunk it is strict.
Sequences are like spine strict lists. Sets are strict in as much as the element type's (==) function is strict.
Let me refine how I posed that question. A predicate: if you enter "Package.fromList [1..]" at the ghci prompt and you get no intermediate results, then that was a "strict type."
Right, because (==) for Int is strict, and Set.fromList uses (==) on each element. Sorry, I was just being pedantic by saying that it depends on the strictness of (==).
I'm assuming that if the Show instance doesn't produce intermediate results, then the serialisation technique can't handle intermediate results (i.e. chunking) either--at least not in a general enough way to include it in a library.
So if you did this test with my proposed list instance (and you somehow slowed your computer right down so you could see what was going on) you'd see it wait a sec, then print out 32k of serialised list elements, then wait again and emit another chunk. So lazy, but in strict chunks. Duncan