
I am looking for a function f::Eq a => [a]->[a] that takes a list and returns the longest initial segment of the list for which all the elements are distinct. For example f [2,3,6,4,3,5] = [2,3,6,4]. I didn't see anything that matched using Hoogle, but I thought this might be a common enough operation that this function might exists somewhere in the standard packages.

On Wed, Dec 02, 2009 at 06:47:54PM -0600, I. J. Kennedy wrote:
I am looking for a function f::Eq a => [a]->[a] that takes a list and returns the longest initial segment of the list for which all the elements are distinct.
For example f [2,3,6,4,3,5] = [2,3,6,4].
I didn't see anything that matched using Hoogle, but I thought this might be a common enough operation that this function might exists somewhere in the standard packages.
Not that I know of. Here's how I would implement it (although you may enjoy trying to implement it yourself): import qualified Data.Set as S f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums) where cums = scanl (flip S.insert) S.empty xs It works by incrementally building up a list of sets of the elements found in prefixes of the list (with scanl), then goes down the list (takeWhile) checking that each element isn't already in the corresponding set of elements. -Brent

Thanks for the response and thanks for the implementation.
f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums)
where cums = scanl (flip S.insert) S.empty xs
I absolutely love (what I know so far) Haskell, but I must say
when I see this kind of function, or write it myself, I am strongly
reminded of programming in Forth thirty years ago.
On Wed, Dec 2, 2009 at 8:46 PM, Brent Yorgey
On Wed, Dec 02, 2009 at 06:47:54PM -0600, I. J. Kennedy wrote:
I am looking for a function f::Eq a => [a]->[a] that takes a list and returns the longest initial segment of the list for which all the elements are distinct.
For example f [2,3,6,4,3,5] = [2,3,6,4].
I didn't see anything that matched using Hoogle, but I thought this might be a common enough operation that this function might exists somewhere in the standard packages.
Not that I know of. Here's how I would implement it (although you may enjoy trying to implement it yourself):
import qualified Data.Set as S
f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums) where cums = scanl (flip S.insert) S.empty xs
It works by incrementally building up a list of sets of the elements found in prefixes of the list (with scanl), then goes down the list (takeWhile) checking that each element isn't already in the corresponding set of elements.
-Brent _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

On Thu, Dec 3, 2009 at 11:38 AM, I. J. Kennedy
Thanks for the response and thanks for the implementation.
f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums) where cums = scanl (flip S.insert) S.empty xs
When I first saw this function I thought it looked complicated, and in a naive attempt to simplify it I came up with this: import Data.List f :: (Eq a) => [a] -> [a] f = last . takeWhile (\l -> nub l == l) . inits This worked well for short lists, but started to drag for large lists, especially if the result was long, i.e. for input like ([1 .. 1000] ++ [1]). Brent's version seems very fast no matter the list length. Maybe someone can provide a O() analysis? Patrick
I absolutely love (what I know so far) Haskell, but I must say when I see this kind of function, or write it myself, I am strongly reminded of programming in Forth thirty years ago.
On Wed, Dec 2, 2009 at 8:46 PM, Brent Yorgey
wrote: On Wed, Dec 02, 2009 at 06:47:54PM -0600, I. J. Kennedy wrote:
I am looking for a function f::Eq a => [a]->[a] that takes a list and returns the longest initial segment of the list for which all the elements are distinct.
For example f [2,3,6,4,3,5] = [2,3,6,4].
I didn't see anything that matched using Hoogle, but I thought this might be a common enough operation that this function might exists somewhere in the standard packages.
Not that I know of. Here's how I would implement it (although you may enjoy trying to implement it yourself):
import qualified Data.Set as S
f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums) where cums = scanl (flip S.insert) S.empty xs
It works by incrementally building up a list of sets of the elements found in prefixes of the list (with scanl), then goes down the list (takeWhile) checking that each element isn't already in the corresponding set of elements.
-Brent _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- ===================== Patrick LeBoutillier Rosemère, Québec, Canada

Am Donnerstag 03 Dezember 2009 19:16:56 schrieb Patrick LeBoutillier:
On Thu, Dec 3, 2009 at 11:38 AM, I. J. Kennedy
wrote: Thanks for the response and thanks for the implementation.
f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums) where cums = scanl (flip S.insert) S.empty xs
When I first saw this function I thought it looked complicated, and in a naive attempt to simplify it I came up with this:
import Data.List
f :: (Eq a) => [a] -> [a] f = last . takeWhile (\l -> nub l == l) . inits
This worked well for short lists, but started to drag for large lists, especially if the result was long, i.e. for input like ([1 .. 1000] ++ [1]).
Brent's version seems very fast no matter the list length. Maybe someone can provide a O() analysis?
nub is O((length l)^2), thus nubbing all initial segments of l up to length k is O(k^3). Data.Set.insert is O(log size), so calculating the first k elements of scanl (flip insert) empty l is O(log (k!)) = O(k log k). The membership test is again O(log size), zipping is O(k), so altogether O(k*(log k)^2). You can remove one factor (log k) by checking the size instead of membership: f xs = map fst . takeWhile snd . zip xs . zipWith ((. S.size) . (==)) [1 .. ] $ cums where cums = tail $ scanl (flip S.insert) S.empty xs I think for *short* nubbed prefixes, Brent's version is faster, but I've no idea yet when short stops (10, 100, 1000?). A quadratic (I think) version if the type of elements doesn't belong to Ord: f (x:xs) = x:f (takeWhile (/= x) xs) f [] = []
Patrick
I absolutely love (what I know so far) Haskell, but I must say when I see this kind of function, or write it myself, I am strongly reminded of programming in Forth thirty years ago.
On Wed, Dec 2, 2009 at 8:46 PM, Brent Yorgey
wrote: On Wed, Dec 02, 2009 at 06:47:54PM -0600, I. J. Kennedy wrote:
I am looking for a function f::Eq a => [a]->[a] that takes a list and returns the longest initial segment of the list for which all the elements are distinct.
For example f [2,3,6,4,3,5] = [2,3,6,4].
I didn't see anything that matched using Hoogle, but I thought this might be a common enough operation that this function might exists somewhere in the standard packages.
Not that I know of. Here's how I would implement it (although you may enjoy trying to implement it yourself):
import qualified Data.Set as S
f xs = map fst $ takeWhile (uncurry S.notMember) (zip xs cums) where cums = scanl (flip S.insert) S.empty xs
It works by incrementally building up a list of sets of the elements found in prefixes of the list (with scanl), then goes down the list (takeWhile) checking that each element isn't already in the corresponding set of elements.
-Brent

Am Donnerstag 03 Dezember 2009 20:00:00 schrieb Daniel Fischer:
You can remove one factor (log k) by checking the size instead of membership:
f xs = map fst . takeWhile snd . zip xs . zipWith ((. S.size) . (==)) [1 .. ] $ cums where cums = tail $ scanl (flip S.insert) S.empty xs
I think for *short* nubbed prefixes, Brent's version is faster, but I've no idea yet when short stops (10, 100, 1000?).
Couldn't measure a difference for short lists, starts to become faster than Brent's between 10^5 and 10^6. The manual loop f :: Ord a => [a] -> [a] f xs = go 1 S.empty xs where go i s (y:ys) | i == S.size s' = y:go (i+1) s' ys | otherwise = [] where s' = S.insert y s go _ _ _ = [] is ~15% faster.
participants (4)
-
Brent Yorgey
-
Daniel Fischer
-
I. J. Kennedy
-
Patrick LeBoutillier