Time consumption nub

newer
Is this haskelly enough? -- errm...

Arie Groeneveld

18 Jul 2007 18 Jul '07

11:38 a.m.

Hi, Wondering about time space consuming: 'nub' vs 'map head.group.sort' Consider: ry = [1..10000] ++ replicate 13 5 ++ replicate 21 34 *Main> length . nub $ ry 10000 (5.18 secs, 1050000 bytes) *Main> length . map head . group . sort $ ry 10000 (0.03 secs, 6293384 bytes) Time space nub --- + fnub +++ - + is better ;-) Thanks @@i=arie

Show replies by date

Miguel Mitrofanov

18 Jul 18 Jul

12:19 p.m.

AG> Wondering about time space consuming: 'nub' vs 'map AG> head.group.sort' Prelude> :t Data.List.nub Data.List.nub :: (Eq a) => [a] -> [a] Prelude> :t Data.List.sort Data.List.sort :: (Ord a) => [a] -> [a] nub uses less information than sort, so it MUST be slower.

haskell＠list.mightyreason.com

12:43 p.m.

Arie Groeneveld wrote:

...

Hi,

Wondering about time space consuming: 'nub' vs 'map head.group.sort'

Consider:

ry = [1..10000] ++ replicate 13 5 ++ replicate 21 34

*Main> length . nub $ ry 10000 (5.18 secs, 1050000 bytes)

*Main> length . map head . group . sort $ ry 10000 (0.03 secs, 6293384 bytes)

Time space nub --- + fnub +++ -

+ is better ;-)

Thanks

@@i=arie

nub is working on unsorted input. If you want (nub.sort) then the best thing to use is a merge sort that discards duplicates as it works. Copying and modifying GHC's Data.List.sort code:

...

-- stolen from http://darcs.haskell.org/packages/base/Data/List.hs -- with 'merge' changed to discard duplicates.

...

nsort l = mergesort compare l nsortBy cmp l = mergesort compare l

mergesort :: (a -> a -> Ordering) -> [a] -> [a] mergesort cmp = mergesort' cmp . map wrap

mergesort' :: (a -> a -> Ordering) -> [[a]] -> [a] mergesort' cmp [] = [] mergesort' cmp [xs] = xs mergesort' cmp xss = mergesort' cmp (merge_pairs cmp xss)

merge_pairs :: (a -> a -> Ordering) -> [[a]] -> [[a]] merge_pairs cmp [] = [] merge_pairs cmp [xs] = [xs] merge_pairs cmp (xs:ys:xss) = merge cmp xs ys : merge_pairs cmp xss

merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a] merge cmp xs [] = xs merge cmp [] ys = ys merge cmp (x:xs) (y:ys) = case x `cmp` y of GT -> y : merge cmp (x:xs) ys LT -> x : merge cmp xs (y:ys) EQ -> x : merge cmp xs ys

wrap :: a -> [a] wrap x = [x]

Then you can use nsort or nsortBy, which benchmark (with -O2) as slightly faster than (map head . group . sort) Cheers, Chris

Arie Groeneveld

19 Jul 19 Jul

6:50 p.m.

I tried this version, however....

...

...
nsortBy cmp l = mergesort compare l should be:

nsortBy cmp l = mergesort cmp l Thanks

Arie Groeneveld

7:12 p.m.

I tried this version,

...

however....

...
...
nsortBy cmp l = mergesort compare l should be:

nsortBy cmp l = mergesort cmp l

Thanks Sorry, with this version I meant:

...

nsort l = mergesort compare l nsortBy cmp l = mergesort compare l

mergesort :: (a -> a -> Ordering) -> [a] -> [a] mergesort cmp = mergesort' cmp . map wrap

mergesort' :: (a -> a -> Ordering) -> [[a]] -> [a] mergesort' cmp [] = [] mergesort' cmp [xs] = xs mergesort' cmp xss = mergesort' cmp (merge_pairs cmp xss)

merge_pairs :: (a -> a -> Ordering) -> [[a]] -> [[a]] merge_pairs cmp [] = [] merge_pairs cmp [xs] = [xs] merge_pairs cmp (xs:ys:xss) = merge cmp xs ys : merge_pairs cmp xss

merge :: (a -> a -> Ordering) -> [a] -> [a] -> [a] merge cmp xs [] = xs merge cmp [] ys = ys merge cmp (x:xs) (y:ys) = case x `cmp` y of GT -> y : merge cmp (x:xs) ys LT -> x : merge cmp xs (y:ys) EQ -> x : merge cmp xs ys

wrap :: a -> [a] wrap x = [x]

6575

Age (days ago)

6576

Last active (days ago)

List overview

Download

4 comments

3 participants

participants (3)

Arie Groeneveld
haskell＠list.mightyreason.com
Miguel Mitrofanov