Duplicates in arbitrary instances for collections

Every QuickCheck tutorial makes the point that, given a property on some type, the Arbitrary instance for that type is written in such a way that nearly all of the interesting edge-cases of that type are tried. So e.g. for numeric types, positive, negative, and zero numbers are tried. For collections, I think the case of duplicate elements is insufficiently covered. Say we have a list difference function: badDiff :: Eq a => [a] -> [a] -> [a] badDiff l r = filter (\e -> e `notElem` r) l And a test for it: prop_wrong :: [Int] -> [Int] -> Bool prop_wrong a b = (a `badDiff` b) == (a \\ b) `quickCheck prop_wrong` passes about 80% of the time! Obviously it shouldn't:
badDiff [1,2,3,3,3] [2,3] [1] (\\) [1,2,3,3,3] [2,3] [1,3,3]
But even with maxSuccess at 1000, this property passes at least 2/3 of the times it's run (on my machine, where (maxBound :: Int) is large). The reason is that `arbitrary` for [a] is just `sized $ \n -> choose (0,n) >>= \k -> sequence [arbitrary | _ <- [1..k]]` It would be trivial to duplicate values in the collection a certain percentage of the time (possibly sometimes adjacent). Is there a good reason not to? It shouldn't require an Eq constraint. Tom
participants (1)
-
Tom Murphy