
On 12 Sep 2007, at 8:08 pm, rahn@ira.uka.de wrote:
take 1000 [1..3] still yields [1,2,3]
You can think about take n as: Take as much as possible, but at most n elements. This behavior has some nice properties as turned out by others, but there are some pitfalls.
One of the very nice properties is this: for all n >= 0, m >= 0 take m . take n = take (m+n) drop m . drop n = drop (m+n) And another is this: for all xs, n >= 0 take n xs ++ drop n xs == xs
length . take n /= const n
in general, instead only
length . take n `elem` map const [0..n]
What we have is for all xs, n >= 0 n <= length xs => length (take n xs) == n Suppose we had the "unsafe" take. Would we then have length . take n == const n? NO! For some xs, (length . take n) xs would be bottom, while (const n) xs would be n. As far as I can see, the strongest statement about length that unsafeTake would satisfy is for all xs, n >= 0 n <= length xs => length (unsafeTake n xs) == n which is, mutatis mutandis, the SAME law that applies to the `take` we already have.
Therefore head . length n is unsafe for all n, even strict positive ones.
Moreover, it is easy to produce tricky code. Assume you want to know, whether the prefixes of length k are equal and write
pref_eq k xs ys = take k xs == take k ys
This seems to be a straightforward implementation with good properties. Actually, no, at least not if implemented naively. (I wonder just how clever GHC is with this code? I can't check at the moment. After heroic efforts by my sysadmin, I was finally able to install the binary release of GHC 6.6.1, but (a) it assumes that gcc accepts a -fwrapv flag, which gcc 3.3.2 (TWW) does not accept, and (b) after patching
I'm assuming that should have been (head . take n). The thing is that (head . take n) fails when and only when (head . unsafeTake n) fails; the only difference is which function reports the error. that, gcc still chokes on the output of GHC.) The obvious question is what pref_eq k xs ys *should* do when xs or ys is not at least k long. Does it mean "I believe xs and ys are at least k long: if they are, check their prefixes, if they aren't, raise an error" or does it mean "xs and ys are at least k long and their prefixes of length k are equal"?
Uhh, this is not what everybody expects at first glance. In particular, if
pref_eq k xs ys == False
then *not* necessarily
pref_eq k xs (init ys) == False.
As always, quickCheck is your friend to assure (or reject) such a property.
That is certainly so, but the fundamental issue in this example is not what take does, but "don't leap into coding a function before you know what it means". We've now seen three readings of what pref_eq means. In particular, pref_eq k xs ys = unsafeTake k xs == unsafeTake k ys would NOT satisfy the law that pref_eq k xs ys == False => pref_eq k xs (init ys) == False because pref_eq 2 [1,2] [1,3] == False but pref_eq 2 [1,2] [1] => undefined So "fixing" take would not, in this case, produce the desired behaviour.