Proposal: free shrinking with QuickCheck

22 Nov 2013

      I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.

---

I have an idea to eliminate the `shrink` function from the `Arbitrary` type
class. Currently, users can optionally implement shrinking manually, this tends
to be boilerplate code and is "...clearly a generic programming problem" [1].
Further, users often have to create separate types to accommodate their
domain's restrictions. For example, a user may wish to only generate
power-of-two numbers for their test. Their code simply uses `Int`, but with
QuickCheck they must create a `newtype` wrapper and implement this logic in
both the `arbitrary` and `shrink` implementation. My idea is to
eliminate the `shrink` function, and to integrate shrinking with the
`arbitrary` function. Currently, the type for `arbitrary` is:

    arbitrary :: Gen a

and `Gen` has the definition:

    newtype Gen a = MkGen{ unGen :: StdGen -> Int -> a }

I suggest that instead of the inner-function returning `a`s, it should return
`RoseTree a`. The root of the tree is the generated value, and the rest of the
tree is all of the ways to shrink the value. Here's how things would fit
together:

    data RoseTree a = RoseTree a [RoseTree a]

    -- this is the same as the current Gen
    newtype Generator a = Gen { unGen :: StdGen -> Int -> a }

    newtype Gen a = MkGen { unGen :: Gen (RoseTree a) }

Conveniently, `Gen` still has a unary type constructor of `a`. Further, if
users have implemented their `arbitrary` implementations in terms of the
QuickCheck combinators, their code won't have to change at all (I don't think…).
The lower-level combinators would be updated to manipulate trees, with
functions like `joinRose` and `filterRose`. Let's next look at how `Gen`
would implement the monad type class:

    instance Monad Gen where
        return = MkGen . return . return

        gen >>= f = MkGen $ helper (unGen gen) (unGen . f)
            -- sequence is from Data.Traversable
            where helper m k = m >>= \y -> fmap joinRose $ sequence $ fmap k y

The implementation of `return` is clear. `>>=` isn't too bad either: the
function provided to bind will return `Gen b`'s, and `joinRose` and `sequence`
help us pull this `Generator (RoseTree b))` 'out', much like `join` does. This
means our users can still write code like:

    (arbitrary :: Gen [Int]) >>= elements

Which brings up a good point. The above code has an issue, `elements` is a
partial function with respect to the empty list. With the current
implementation, we'd use the `NonEmptyList` modifier, which much respect this
predicate both during generation and shrinking. This change would allow all
predicates to be expressed simply with `suchThat`, which, since it acts on both
values _and_ shrink trees, applies the predicate in both places. The
implementation of `suchThat` would have to unwrap `Gen`, and apply `roseFilter`
to the `RoseTree` inside of `Generator`, using `fmap`.

I have implemented the above description in my Clojure port of QuickCheck,
called simple-check [1], and it seems to be working quite nicely. Further, I
suspect Erlang QuickCheck [3] is doing something similar, though their
implementation is not open source, I can't presume too much.

Clearly this would be a big change, and my goal now is simply to start a
discussion: how does this sound? What's wrong, what's likely to break? Feedback
and criticism welcome.

Reid

[1] Scrap your boilerplate with class: extensible generic functions: http://research.microsoft.com/pubs/67439/gmap3.pdf

[2] https://github.com/reiddraper/simple-check

[3] http://www.quviq.com/index.html

Reid Draper

Carter Schonwald

Bryan O'Sullivan

Reid Draper

Joseph Abrahamson

Reid Draper

Bryan O'Sullivan

Reid Draper

tags

participants (4)