
On Thu, Feb 5, 2015 at 6:06 AM, Roman Cheplyaka
The choice of what belongs to what depth is rather arbitrary.
You say you'd expect all three constructors to be of depth 0. What about
data A = A1 .. A50
?
What about Char or Int?
See the problem?
I see the issue, but I think the current SmallCheck documentation does not acknowledge this issue and the documentation in fact obstructs understanding of the way the library currently works. First, Runciman 2008 does not say that the choice of what belongs to what depth is arbitrary. Rather, it states that "the depth of a zero-arity construction is zero, and the depth of a positive-arity construction is one greater than the maximum depth of a component argument." Thus the depth of True would be zero. In data A = A1 .. A50 the depth of each value would be zero. The Runciman 2008 theory holds perfectly for Int if one conceptualizes Int to be built as follows: data Sign = Pos | Neg data Nat = One | Succ Nat data Int = Zero | NonZero Sign Nat In that case, the depth of Zero is 0. The depth of 1 (that is, NonZero Pos One)) is one greater than the maximum depth of a component argument; since the maximum depth of a component is zero, that gives us depth 1. Similarly, the depth of NonZero Pos (Succ One) is 2. I acknowledge that Runciman 2008 did not explain this as well as it might have. I have puzzled over it off and on for months, though the fact that the current SmallCheck is not consistent with this explanation did not help me. Further, my explanation of Int is not the one Runciman 2008 gave, as it ignores that an Int might be negative. In addition, since Int is bounded, you could indeed argue that every member of Int should indeed have depth 0 because then your @data A = A1 .. A50@ conceptualization is more appropriate. However, the conceptualization given above is the only one that holds for Integer. I also acknowledge that the Runciman 2008 theory was not used for all types in the original SmallCheck; for instance, Char would have been treated differently. Even so, the Runciman 2008 theory certainly holds for types with only a few possible values (such as (), Bool, Maybe, etc.) and the original SmallCheck was faithful to the Runciman 2008 theory in those cases. Upon further examination of the current SmallCheck source, I see that 'cons0' decreases the depth, while its counterpart in the original SmallCheck did not. What I was wondering is whether this change was made for any particular reason. Perhaps the explanation given in Runciman 2008 is not, in fact, a useful one in practice because of the existence of bounded types with large numbers of values, such as Char and Int. In that case, though, it would be useful if the current SmallCheck documentation abandoned the Runciman 2008 explanation. "Test.SmallCheck.Series" still states that "For data values, [depth] is the depth of nested constructor applications." After puzzling over what on earth "nested constructor applications" means, and then consulting Runciman 2008, I was only more confused after seeing that the current SmallCheck behaves in a way that is inconsistent with its own documentation, with Runciman 2008, and with the original SmallCheck. Perhaps, then, if the current theory underlying SmallCheck is that depth is just some size-limiting thing, the documentation should say something like "Depth is a non-negative integer. You use depth to make progressively 'larger' values. What 'larger' means is up to you. Runciman 2008 said it means the depth of nested constructor applications, but that is not always practically sensible. You should, however, ensure that all the values included at @depth n@ are also included in @depth n + 1@. " Even with an explanation like that, though, I don't see why @list 0@ should sometimes return values (as it does for () and Int) and sometimes not (as it does with Bool). That's a puzzling behavior that did not exist in the original SmallCheck. I bring all this up because the implementation of SmallCheck is in some ways quite subtle. In the past I have considered using it but found the depth concept to be bewildering. In contrast, QuickCheck's Arbitrary makes no claims as to theoretical consistency--Gen Int might give a huge range of Int, or it might return 5 every time. Indeed, that is why there is a whole batch of bindings in Test.QuickCheck for Gen Int of different ranges. I doubt I am the only one who found the inconsistency between the SmallCheck docs and its behavior to be confusing. Now that I finally understand the idea of depth, I might actually use SmallCheck. Thanks. --Omari