Re: Proposal: add ordNub somewhere in containers

17 Oct 2017


      ...
nubOn :: Ord b => (a -> b) -> [a] -> [a],
or the Foldable version, seem just a bit safer, but less convenient.
I actually find the `...On` variants more convenient. I will often want to
`nubBy ((==) `on` fst` (or `groupBy`), but rarely `nubBy (<=)`.
...
On Oct 17, 2017 2:09 AM, "Gershom B"  wrote:
...
Good point on the by variants. (And indeed this is a public forum).
Those can be written without any fancy machinery, just keeping the
`by` data in the set, in the straightforward way. I agree it makes
sense to add them.
-g
On October 17, 2017 at 1:44:30 AM, Alex Rozenshteyn
(rpglover64@gmail.com) wrote:
...
(Forgive me if I shouldn't be posting on libraries@. I did a little
searching and couldn't determine if this list is supposed to be a public
forum or not)
...
[D]o you have anything else you think should be stuck in the new
module?
As a user, I would expect to find the `...By` variants in the same
location, but that precludes reusing `Set` without relying on
complicated
machinery like `reflection`, doesn't it?
On Mon, Oct 16, 2017 at 3:45 PM David Feuer wrote:
...
I would imagine
ordNub :: (Foldable t, Ord a)
=> t a -> [a]
ordNub xs = foldr go (const []) xs Set.empty where
go x r s
| x `Set.member` s = r s
| otherwise = x : r (Set.insert x s)
which would suggest also
ordNubR :: (Foldable t, Ord a)
=> t a -> [a]
ordNubR xs = foldl go (const []) xs Set.empty where
go r x s
| x `Set.member` s = r s
| otherwise = x : r (Set.insert x s)
For containers biased the other way.
Another question: do you have anything else you think should be stuck
in
the new module?
On Oct 16, 2017 6:18 PM, "Gershom B" wrote:
There have been many discussions over the years about adding an
efficient order preserving nub somewhere to our core libraries. It
always comes down to the same issue: an efficient nub wants to be
backed by an efficient `Set`, but the API of the `nub` itself doesn't
make reference to any other data structures besides lists. So it feels
a bit conceptually strange to put an efficient nub anywhere besides
`Data.List` even though it can't go there without inverting our
dependency tree in a weird way or inlining an efficient set
implementation into the middle of it.
Nonetheless, the convenience of having a good `nub` lying around in a
core library is undeniable, and after writing the "usual" one in my
code for the zillionth time, I decided to raise an issue about it:
https://github.com/haskell/containers/issues/439
I was promptly directed here to make a proper proposal.
So, here:
1) I propose two new functions,
`ordNub` and `intNub`
with the standard implementation (from
https://github.com/nh2/haskell-ordnub):
import qualified Data.Set as Set
ordNub :: (Ord a) => [a] -> [a]
ordNub l = go Set.empty l
where
go _ [] = []
go s (x:xs) = if x `Set.member` s then go s xs
else x : go (Set.insert x s) xs
and the same implementation, but specialized to `Int` and using
`IntSet`s.
The rationale for the names is that the former has a long history of
use in folklore, and the latter is the obvious specialization of it.
2) I propose these functions be added to a new module in the
`containers` library: `Data.Containers.ListUtils`. This can also
potentially in the future add efficient list intersection, etc. as
documented on the above reference link.
The rationale for the new module is that it can provide a meaningful
home for such functions which operate on lists, but require other data
structures to be implemented efficiently...
Discussion period: 2 weeks.
--Gershom
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries