
Vis a vis NonEmpty, base provides a partial `fromList`, much as I
dislike it. Hence, the proper "roll your own" is `fromList . ordNub .
toList`.
--Gershom
On Wed, Oct 18, 2017 at 5:40 PM, David Feuer
The trouble with NonEmpty is that I don't like the idea of users having to roll their own. They'd end up looking like
case ordNub (toList xs) of [] -> error "can't happen!" x : xs -> x :| xs
I hate "can't happen" errors!
The case for sequences is less compelling with fusion, for sure, but it seems a bit strange to leave them out.
On Oct 18, 2017 5:28 PM, "Gershom B"
wrote: I agree that `ordNubOn` suffices and we don't need `ordNubBy` since there's nothing lawful you can do with the latter that you can't do with the former. I'm indifferent on the NonEmpty and Seq case as I don't suspect that they will yield much more efficient implementations than going via lists, especially if we setup (and we should!) the fusion rules correctly. I have no objection to adding them for completeness however.
If we do add them, then the proposed module name `Data.Containers.ListUtils` becomes slightly less appropriate, but I think still fine, since these are "morally" all lists of various sorts.
-g
On Wed, Oct 18, 2017 at 1:49 PM, David Feuer
wrote: I am convinced that we should add
ordNub :: Ord a => [a] -> [a] ordNubOn :: Ord b => (a -> b) -> [a] -> [b] intNub :: [Int] -> [Int] intNubOn :: (a -> Int) -> [a] -> [a]
And because nub preserves non-emptiness, I believe we should also offer
ordNub1 :: Ord a => NonEmpty a -> NonEmpty a ordNubOn1 :: Ord b => (a -> b) -> NonEmpty a -> NonEmpty a intNub1 :: NonEmpty Int -> NonEmpty Int intNubOn1 :: (a -> Int) -> NonEmpty a -> NonEmpty a
I imagine we should also add these operations for Data.Sequence.Seq.
I'm not yet convinced that we should add
ordNubBy :: (a -> a -> Ordering) -> [a] -> [a]
but I'm open to further discussion of that question. My main concern is that the properties of the comparison argument require careful documentation. In its favor, using it improperly cannot *expose* a broken Set to later operations.
I would very much like to hear further bikeshedding around names and namespaces.
On Oct 16, 2017 6:18 PM, "Gershom B"
wrote: There have been many discussions over the years about adding an efficient order preserving nub somewhere to our core libraries. It always comes down to the same issue: an efficient nub wants to be backed by an efficient `Set`, but the API of the `nub` itself doesn't make reference to any other data structures besides lists. So it feels a bit conceptually strange to put an efficient nub anywhere besides `Data.List` even though it can't go there without inverting our dependency tree in a weird way or inlining an efficient set implementation into the middle of it.
Nonetheless, the convenience of having a good `nub` lying around in a core library is undeniable, and after writing the "usual" one in my code for the zillionth time, I decided to raise an issue about it:
https://github.com/haskell/containers/issues/439
I was promptly directed here to make a proper proposal.
So, here:
1) I propose two new functions,
`ordNub` and `intNub`
with the standard implementation (from https://github.com/nh2/haskell-ordnub):
import qualified Data.Set as Set
ordNub :: (Ord a) => [a] -> [a] ordNub l = go Set.empty l where go _ [] = [] go s (x:xs) = if x `Set.member` s then go s xs else x : go (Set.insert x s) xs
and the same implementation, but specialized to `Int` and using `IntSet`s.
The rationale for the names is that the former has a long history of use in folklore, and the latter is the obvious specialization of it.
2) I propose these functions be added to a new module in the `containers` library: `Data.Containers.ListUtils`. This can also potentially in the future add efficient list intersection, etc. as documented on the above reference link.
The rationale for the new module is that it can provide a meaningful home for such functions which operate on lists, but require other data structures to be implemented efficiently...
Discussion period: 2 weeks.
--Gershom _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries