Please apply the comparison function given to nubBy to elements of the list in the order in which they occur in the list.

I just tried this in ghci-7.0.3: ghci> nubBy (>=) [1,2,3,4] [1] Think about what this is doing: it is excluding 2 from the list because 2 >= 1, rather than including it because 1 >= 2 fails. I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list. If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4] I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless. I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just came up again for me, and I decided that I was sufficiently irritated by it to post again. Another thing perhaps worth pointing out is that the parameters to mapAccumR have always been backwards (compare it with foldr). Few enough people use this function that I'm fairly sure we could just change it without harm. - Cale

Looking at the code of nubBy http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.4.0.0/src/Data-... nubBy :: (a -> a -> Bool) -> [a] -> [a] #ifdef USE_REPORT_PRELUDE nubBy eq [] = [] nubBy eq (x:xs) = x : nubBy eq (filter (\ y -> not (eq x y)) xs) #else nubBy eq l = nubBy' l [] where nubBy' [] _ = [] nubBy' (y:ys) xs | elem_by eq y xs = nubBy' ys xs | otherwise = y : nubBy' ys (y:xs) -- Not exported: -- Note that we keep the call to `eq` with arguments in the -- same order as in the reference implementation -- 'xs' is the list of things we've seen so far, -- 'y' is the potential new element elem_by :: (a -> a -> Bool) -> a -> [a] -> Bool elem_by _ _ [] = False elem_by eq y (x:xs) = y `eq` x || elem_by eq y xs #endif I see that the USE_REPORT_PRELUDE version corresponds to your proposal, but the actual implementation (based on elem_by) behaves differently despite the "same order" comment! Therefore I support your proposal to change "y `eq` x" in elem_by (and possibly improve the documentation). Cheers Christian Am 08.09.2011 02:07, schrieb Cale Gibbard:
I just tried this in ghci-7.0.3:
ghci> nubBy (>=) [1,2,3,4] [1]
Think about what this is doing: it is excluding 2 from the list because 2>= 1, rather than including it because 1>= 2 fails.
I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list.
If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4]
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just came up again for me, and I decided that I was sufficiently irritated by it to post again.
Another thing perhaps worth pointing out is that the parameters to mapAccumR have always been backwards (compare it with foldr). Few enough people use this function that I'm fairly sure we could just change it without harm.
- Cale

Looking at the old tickets http://hackage.haskell.org/trac/ghc/ticket/2528 http://hackage.haskell.org/trac/ghc/ticket/3280 the USE_REPORT_PRELUDE version of nub is also different from the implementation, but both variants fulfill "nub = nubBy (==)" (the prelude version by definition). So when we change the nubBy implmentation we also need to change the nub implementation (which is more difficult, because it uses "elem"). Cheers Christian Am 20.09.2011 12:59, schrieb Christian Maeder:
Looking at the code of nubBy http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.4.0.0/src/Data-...
nubBy :: (a -> a -> Bool) -> [a] -> [a] #ifdef USE_REPORT_PRELUDE nubBy eq [] = [] nubBy eq (x:xs) = x : nubBy eq (filter (\ y -> not (eq x y)) xs) #else nubBy eq l = nubBy' l [] where nubBy' [] _ = [] nubBy' (y:ys) xs | elem_by eq y xs = nubBy' ys xs | otherwise = y : nubBy' ys (y:xs)
-- Not exported: -- Note that we keep the call to `eq` with arguments in the -- same order as in the reference implementation -- 'xs' is the list of things we've seen so far, -- 'y' is the potential new element elem_by :: (a -> a -> Bool) -> a -> [a] -> Bool elem_by _ _ [] = False elem_by eq y (x:xs) = y `eq` x || elem_by eq y xs #endif
I see that the USE_REPORT_PRELUDE version corresponds to your proposal, but the actual implementation (based on elem_by) behaves differently despite the "same order" comment!
Therefore I support your proposal to change "y `eq` x" in elem_by (and possibly improve the documentation).
Cheers Christian
Am 08.09.2011 02:07, schrieb Cale Gibbard:
I just tried this in ghci-7.0.3:
ghci> nubBy (>=) [1,2,3,4] [1]
Think about what this is doing: it is excluding 2 from the list because 2>= 1, rather than including it because 1>= 2 fails.
I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list.
If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4]
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just came up again for me, and I decided that I was sufficiently irritated by it to post again.
Another thing perhaps worth pointing out is that the parameters to mapAccumR have always been backwards (compare it with foldr). Few enough people use this function that I'm fairly sure we could just change it without harm.
- Cale

Apologies for responding to myself, but the difference between the REPORT_PRELUDE and the ghc implementation also applies to elem and notElem. #ifdef USE_REPORT_PRELUDE elem x = any (== x) notElem x = all (/= x) #else elem _ [] = False elem x (y:ys) = x==y || elem x ys notElem _ [] = True notElem x (y:ys)= x /= y && notElem x ys #endif So the proposal should be to swap the arguments in "x==y" and "x /= y" (above) which would also fix the nub implementation! C. Am 20.09.2011 13:46, schrieb Christian Maeder:
Looking at the old tickets http://hackage.haskell.org/trac/ghc/ticket/2528 http://hackage.haskell.org/trac/ghc/ticket/3280
the USE_REPORT_PRELUDE version of nub is also different from the implementation, but both variants fulfill "nub = nubBy (==)" (the prelude version by definition).
So when we change the nubBy implmentation we also need to change the nub implementation (which is more difficult, because it uses "elem").
Cheers Christian
Am 20.09.2011 12:59, schrieb Christian Maeder:
Looking at the code of nubBy http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.4.0.0/src/Data-...
nubBy :: (a -> a -> Bool) -> [a] -> [a] #ifdef USE_REPORT_PRELUDE nubBy eq [] = [] nubBy eq (x:xs) = x : nubBy eq (filter (\ y -> not (eq x y)) xs) #else nubBy eq l = nubBy' l [] where nubBy' [] _ = [] nubBy' (y:ys) xs | elem_by eq y xs = nubBy' ys xs | otherwise = y : nubBy' ys (y:xs)
-- Not exported: -- Note that we keep the call to `eq` with arguments in the -- same order as in the reference implementation -- 'xs' is the list of things we've seen so far, -- 'y' is the potential new element elem_by :: (a -> a -> Bool) -> a -> [a] -> Bool elem_by _ _ [] = False elem_by eq y (x:xs) = y `eq` x || elem_by eq y xs #endif
I see that the USE_REPORT_PRELUDE version corresponds to your proposal, but the actual implementation (based on elem_by) behaves differently despite the "same order" comment!
Therefore I support your proposal to change "y `eq` x" in elem_by (and possibly improve the documentation).
Cheers Christian
Am 08.09.2011 02:07, schrieb Cale Gibbard:
I just tried this in ghci-7.0.3:
ghci> nubBy (>=) [1,2,3,4] [1]
Think about what this is doing: it is excluding 2 from the list because 2>= 1, rather than including it because 1>= 2 fails.
I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list.
If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4]
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just came up again for me, and I decided that I was sufficiently irritated by it to post again.
Another thing perhaps worth pointing out is that the parameters to mapAccumR have always been backwards (compare it with foldr). Few enough people use this function that I'm fairly sure we could just change it without harm.
- Cale

If this is a _proposal_ to change ghc's non-Report-compatible Data.List implementation to match the behaviour of the Report implementation, then count me as a +1.
I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list.
This seems like an entirely reasonable principle.
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
I notice that the Haskell'98 Report gives a sample implementation, but the Haskell'2010 Report does not. I wonder if this is a regression, since in days of yore, the Report implementation was treated as a gold standard reference for questions about function semantics, strictness and so forth. Regards, Malcolm

On Thursday 08 September 2011, 02:07:39, Cale Gibbard wrote:
If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4]
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations,
I find that not so strange, another obvious reimplementation is the one you get if you compile Data.List with -DUSE_REPORT_PRELUDE, nubBy eq [] = [] nubBy eq (x:xs) = x : nubBy eq (filter (\ y -> not (eq x y)) xs) If the relation defined by the predicate argument is not transitive, you get a different result than with your definition above. Of course, the report could make a choice and say that behaviour is authoritative, but while the expected behaviour is clear for an equivalence relation (if you don't distinguish different bottoms), it's not so obvious otherwise.
but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
Yes, although I don't think nubBy is a perfect name for that function - and, it's also only useful for transitive relations, as far as I can see.
I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just
Nobody made a formal proposal, so it withered and died.
came up again for me, and I decided that I was sufficiently irritated by it to post again.
Sufficiently irritated to make a formal proposal? If nobody does that, it probably won't change (and I'm not interested enough in the matter to make one, but would support).
Another thing perhaps worth pointing out is that the parameters to mapAccumR have always been backwards (compare it with foldr). Few enough people use this function that I'm fairly sure we could just change it without harm.
Supposing that the last assessment is not wildly incorrect, +1.

On Wed, Sep 7, 2011 at 8:07 PM, Cale Gibbard
I just tried this in ghci-7.0.3:
ghci> nubBy (>=) [1,2,3,4] [1]
Think about what this is doing: it is excluding 2 from the list because 2 >= 1, rather than including it because 1 >= 2 fails.
I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list.
If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4]
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just came up again for me, and I decided that I was sufficiently irritated by it to post again.
I would suggest you rephrase this as a formal proposal, then I can happily vote +1. I'd also suggest rephrasing rhe mapAccumR as a formal proposal. I'm not sure yet of whether or not I'd be behind that one, but make both proposals separately, so they can pass individually. A short email with the proposed change, and the "Discussion period: 2 weeks" will either make this happen or not. This informal discussion, as it isn't backing an actual proposal is just going to burn everyone out on the topic and leave us where we started. -Edward

Am 20.09.2011 20:21, schrieb Edward Kmett: [...]
I would suggest you rephrase this as a formal proposal, then I can happily vote +1.
Seeing the "wonderful" interrelation between elem, nub, nubBy and i.e. unionBy eq xs ys = xs ++ foldl (flip (deleteBy eq)) (nubBy eq ys) xs intersectBy eq xs ys = [x | x <- xs, any (eq x) ys] (note that "any (eq x)" could be "elemBy eq") I see hardly a chance to make a sensible proposal. I think, it is wrong to change the implementation of "elem" and "notElem" since I expect the "key" to be the first argument of the eq-comparison (in contrast to the REPORT_PRELUDE!). But this all would not matter if the eq-function are always symmetric, which may be not the case in practise. So a change could break existing code.
I'd also suggest rephrasing rhe mapAccumR as a formal proposal. I'm not sure yet of whether or not I'd be behind that one, but make both proposals separately, so they can pass individually.
I also don't see a relation to mapAccumR, so why don't you make such a separate proposal? C.

In case you further want to discuss this, I've re-opened http://hackage.haskell.org/trac/ghc/ticket/2528#comment:10 So, I'm against your proposal, Cale, but suggest that you revert the order in your example (if you want to exploit this behavior). Cheers Christian Am 08.09.2011 02:07, schrieb Cale Gibbard:
I just tried this in ghci-7.0.3:
ghci> nubBy (>=) [1,2,3,4] [1]
Think about what this is doing: it is excluding 2 from the list because 2>= 1, rather than including it because 1>= 2 fails.
I think an important convention when it comes to higher order functions on lists is that to the extent which is possible, the function parameters take elements from the list (or things computed from those) in the order in which they occur in the original list.
If we reimplement it in the obvious way: ghci> let nubBy f [] = []; nubBy f (x:xs) = x : filter (not . f x) (nubBy f xs) ghci> nubBy (>=) [1,2,3,4] [1,2,3,4]
I'm aware that the Report (strangely!) explicitly leaves the behaviour of nubBy unspecified for functions which are not equivalence relations, but the behaviour given by the Report implementation (the opposite of the current behaviour in GHC) is useful and desirable nonetheless.
I'm sure I've written about this before. I'm not entirely sure what happened to the previous thread of discussion about this, but it just came up again for me, and I decided that I was sufficiently irritated by it to post again.
Another thing perhaps worth pointing out is that the parameters to mapAccumR have always been backwards (compare it with foldr). Few enough people use this function that I'm fairly sure we could just change it without harm.
- Cale
participants (5)
-
Cale Gibbard
-
Christian Maeder
-
Daniel Fischer
-
Edward Kmett
-
Malcolm Wallace