
On Mon, Mar 19, 2007 at 11:22:47AM +1100, Duncan Coutts wrote:
On Mon, 2007-03-19 at 11:02 +1100, Donald Bruce Stewart wrote:
I propose we *do not* change the api to add the special case:
sortNub = sort . nub = map head . group . sort
and instead add a rewrite rule to Data.List to provide this optimisation.
{-# RULES "sort/nub" sort . nub = map head . group . sort #-}
Though of course we'd want to use the more efficient implementation than that. One can write a sort that directly eliminates duplicates, and I think you'd want a rule for nub . sort as well as sort . nub.
And while we're at it we can add a rule to rewrite all the places people have used map head . group . sort (or the goupBy f . sortBy f equivalents) to this equals-eliminating sort implementation.
note, this is most likely not what you want at all:
{-# RULES "sort/nub" sort . nub = map head . group . sort #-}
desugars to
{-# RULES "sort/nub" (.) sort nub = map head . group . sort #-}
meaning you are actually adding a RULE to (.), not sort or nub. this hasa couple of unfortunate consequences. * people will have to use 'sort . nub' rather than something like sort $ nub xs * (.) will no longer be inlined until much later in the optimization passes, which perhaps could have disasterous consequences that said, I do not believe this is an appropriate use of rules to begin with, it changes the asymptotic complexity, so should be available as an explicit option, and is just a generally useful routine. not that rules can't also exist, but attached to the right thing. (sort (nub xs)) = nubSort xs (nub (sort xs)) = nubSort xs now, I would also like to see (and it might turn out to be more useful), fastNub, which is a nub that uses Ord and sets to be just as fast, but works on infinite lists. accidentally evaluating a whole list in memory is the quickest way to a space leak and 'sort' is the most common culprit. this is why 'nub' can often be _faster_ than sortNub when you only partially evaluate a list, with ordNub, you get the benefits of both! John -- John Meacham - ⑆repetae.net⑆john⑈