select and selectSplit

Cale Gibbard

15 Feb 2008 15 Feb '08

3:15 a.m.

I know Bulat will be terribly disappointed by my suggestion to make an addition to Data.List ;) but I mentioned the following couple of functions on #haskell and got some positive response that they were things other people ended up writing all the time as well, so I decided I'd propose them here as additions to Data.List and see what kind of reaction I got: -- | The 'select' function takes a list and produces a list of pairs -- consisting of an element of the list together with a list of the -- remaining elements. select :: [a] -> [(a,[a])] select [] = [] select (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- select xs] -- | The 'selectSplit' function takes a list and produces a list of -- triples consisting of a prefix of the list, the element after it, -- and the remainder of the list. selectSplit :: [a] -> [([a],a,[a])] selectSplit [] = [] selectSplit (x:xs) = ([],x,xs) : [(x:lys,y,rys) | (lys,y,rys) <- selectSplit xs] These tend to be very handy in the list monad. The names are off the top of my head. Logan Capaldo also suggested 'pick' for the first, which is a name that I've used as well. Faxathisia mentioned that it's called 'select' in Prolog as well. As a side note, the state transformer makes it relatively easy to pick n elements using this: pick n = runStateT . replicateM n . StateT $ select (Showing that select is secretly an operation in a state transformed list monad, where the state is a list of elements.) If the order of the pairs in the MTL is fixed in the future in order to better reflect the available instances of Functor/etc., at that point we may want to consider flipping the pairs in the result of select to match. - Cale

Show replies by date

Neil Mitchell

15 Feb 15 Feb

10:02 a.m.

...

select :: [a] -> [(a,[a])] selectSplit :: [a] -> [([a],a,[a])]

Not entirely certain about the names, but the functions are good. Perhaps the reason I am dubious is the select/poll conventions. Is there not some word in combinatorics for extracting any single element out of a set? It sounds like Maths should have beaten us to this naming, as it typically does for Data.List. I have selectSplit defined in a few of my projects, and have called it "allItems" on at least one occasion - which is an even worse name. (+0.8) as it currently stands, but (+1) if it gets a better (more Mathsy) name. I don't support the flipping of the select arguments, ever, as currently it returns a very logical pair of ("this item","everything else") - flipping the order goes against the intution of the order of cons. Thanks Neil

Aaron Denney

10:14 a.m.

On 2008-02-15, Cale Gibbard wrote:

...

-- | The 'select' function takes a list and produces a list of pairs -- consisting of an element of the list together with a list of the -- remaining elements. select :: [a] -> [(a,[a])] select [] = [] select (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- select xs]

A couple tounge-in-cheek suggestions: "banishments" -- all possible ways a society can divide itself up into one banished person, and the rest of society. Or perhaps "hermits".

...

-- | The 'selectSplit' function takes a list and produces a list of -- triples consisting of a prefix of the list, the element after it, -- and the remainder of the list. selectSplit :: [a] -> [([a],a,[a])] selectSplit [] = [] selectSplit (x:xs) = ([],x,xs) : [(x:lys,y,rys) | (lys,y,rys) <- selectSplit xs]

I kind of want to call this "positionalPivots", because of the "pivoting element" used in dividing everything up into elements less than, and greater than a particular one. This is all possible choices, but it's using positional order, not comparison order to do this partioning. But I actually don't really see much utility in either of these. Do you have some examples of use, and how they're a commonly abstracted pattern that deserves using up namespace because of the utility of a common vocabulary? -- Aaron Denney -><-

Chris Kuklewicz

11:15 a.m.

Aaron Denney wrote:

...

But I actually don't really see much utility in either of these. Do you have some examples of use, and how they're a commonly abstracted pattern that deserves using up namespace because of the utility of a common vocabulary?

As mentioned in another reply, the select treats the source list like an unordered (perhaps multi) set. I was using something quite similar in a scrabble dictionary lookup to get all the ways to 'select' a tile from the current hand, returning [ (Tile,Remaining Hand)].

Johannes Waldmann

11:44 a.m.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Kuklewicz wrote:

...

As mentioned in another reply, the select treats the source list like an unordered (perhaps multi) set.

See - the proposed function is in the "wrong" module; it uses a concrete data type (List) which normally represents an abstract Sequence type but here some abstract data type (Bag) is intended. (my "favourite Haskell code smell": list obsession :-) Why then, are there so many of these functions? My theory is - because "lists are there" (since LISP) and there is (in Haskell) no agreement on those abstract types, and in fact implementations (Data.Set) change rather frequently, so programmers are hesitant to commit to them. But of course we'd only need an agreement on interfaces, then you'd not commit to one implementation. Again, Java did this right, they have interfaces for Set, Map. They even treat List as interface. Sure, they had fewer design choices as their type system has less freedom, just some special form of one-parameter type classes. But then, their syntax encourages the use of interface types - they are as easy to write as a class type. I'm not actually advocating to introduce more syntactic sugar to Haskell (for existential types), I think proper (refactoring) tool support would help a great deal already. Best regards, Johannes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4-svn0 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHtXsV3ZnXZuOVyMIRAkl2AJ9CIxJoWM7d3UIl0s79E9d480dLVwCfeR0X gGdkZc/ioA80rwcTbTL8Pds= =X/pa -----END PGP SIGNATURE-----

Chris Kuklewicz

12:07 p.m.

Johannes Waldmann wrote:

...

Chris Kuklewicz wrote:

...
As mentioned in another reply, the select treats the source list like an unordered (perhaps multi) set.

See - the proposed function is in the "wrong" module; it uses a concrete data type (List) which normally represents an abstract Sequence type but here some abstract data type (Bag) is intended. (my "favourite Haskell code smell": list obsession :-)

Why then, are there so many of these functions?

Lists are the right thing for any small collection (scrabble hands only have 7 tiles at most). Sorted lists make great implementations of sets if you do not need the O(log n) member testing: union, intersection, difference are all trivial in O(n+m) time (and even findMin/deleteMin,minView; use Data.Seq instead of Data.List if you also need *max*). A list of [(item,count)] is then great model of multisets; this is what my scrabble program uses. I never need member testing -- I do need the 'select' operation, which got called pullTile in my code. Data.Set|Map would have been O(log 7) instead of O(7) in building the "remaining hand", but this savings is swamped by the constant factor when n=7.

...

My theory is - because "lists are there" (since LISP) and there is (in Haskell) no agreement on those abstract types, and in fact implementations (Data.Set) change rather frequently, so programmers are hesitant to commit to them.

...

But of course we'd only need an agreement on interfaces, then you'd not commit to one implementation. Again, Java did this right, they have interfaces for Set, Map. They even treat List as interface.

There are several third party libraries (e.g. Edison) that provide such for Haskell.

...

Sure, they had fewer design choices as their type system has less freedom, just some special form of one-parameter type classes.

The original collections were one-parameter type classes that stored "Object". Java then had to reinvent Collections using generics. And these are NOT simply one-parameter type classes. The first type parameter is always the first argument "this" in Java, which is the collection type (e.g. HashSet). The second type parameter is the type of the item being stored and retrieved (which acts sort-of-like a newtype--the storage is the same as "Object" to be compatible with old libraries).

...

But then, their syntax encourages the use of interface types - they are as easy to write as a class type.

Haskell needs MultiParamterTypeClasses to do a great job with a collections interface (type of collection and type of item). This is not Haskell98. These need FunctionalDependencies to be convenient -- which were never fully and happily embraced, even as an extensions. The Associated* extensions which are not yet stable will hopefully solve this.

...

I'm not actually advocating to introduce more syntactic sugar to Haskell (for existential types), I think proper (refactoring) tool support would help a great deal already.

Best regards, Johannes.

Conor McBride

12:32 p.m.

Hi On 15 Feb 2008, at 12:07, Chris Kuklewicz wrote:

...

Johannes Waldmann wrote:

...
Chris Kuklewicz wrote:

...
As mentioned in another reply, the select treats the source list like an unordered (perhaps multi) set. See - the proposed function is in the "wrong" module; it uses a concrete data type (List) which normally represents an abstract Sequence type but here some abstract data type (Bag) is intended.

[..]

...

Lists are the right thing for any small collection (scrabble hands only have 7 tiles at most).

I'm inclined to agree with you if by "thing" you mean "representation", but disagree if by "thing" you mean "type". If lists are being used to represent finite multisets, it's probably a good thing to pack them in a newtype, whether or not it's being treated as an abstract datatype. We should use types as signs of structure, not just as descriptions of data layout. For example, we should expect to have instance Eq x => Eq (Bag x) but not with the same semantics as equality on lists. What morals one can afford in terms of enforcing representation hiding is a trickier question, but it is at least relatively cheap to make type distinctions which are healthily suggestive. All the best Conor

Cale Gibbard

1:11 p.m.

On 15/02/2008, Johannes Waldmann wrote:

...

See - the proposed function is in the "wrong" module; it uses a concrete data type (List) which normally represents an abstract Sequence type but here some abstract data type (Bag) is intended. (my "favourite Haskell code smell": list obsession :-)

Why then, are there so many of these functions?

Lists are our loops. It has nothing to do with an aversion to using other sequence types. Lists reify linear recursion directly. So the real question I suppose is why loops are so popular. - Cale

Johannes Waldmann

18 Feb 18 Feb

1:15 p.m.

Cale Gibbard wrote:

...

Lists are our loops. It has nothing to do with an aversion to using other sequence types. Lists reify linear recursion directly. So the real question I suppose is why loops are so popular.

Well, loops are certainly popular in imperative programming, because there you have only one world, and it goes through a linear sequence of states. But we don't need that in functional programming - and if we're using "State" (or sequencing), then it's perhaps a (premature) optimization (towards the von-Neumann execution model). Anyway, I don't care much about library functions I'm probably not going to use; but maintainability is an issue, and I agree with Bulat there. Best regards, Johannes.

Cale Gibbard

4:03 p.m.

On 18/02/2008, Johannes Waldmann wrote:

...

Cale Gibbard wrote:

...
Lists are our loops. It has nothing to do with an aversion to using other sequence types. Lists reify linear recursion directly. So the real question I suppose is why loops are so popular.

Well, loops are certainly popular in imperative programming, because there you have only one world, and it goes through a linear sequence of states.

But we don't need that in functional programming - and if we're using "State" (or sequencing), then it's perhaps a (premature) optimization (towards the von-Neumann execution model).

I don't see this as the issue. State really has nothing to do with why loops are popular, at least in my mind. They come up a lot because they are the simplest possible kind of recursion, involving only one recursive call. You could think of the parameters to that recursive call as being the state if you like, but that seems artificial to me.

Isaac Dupree

15 Feb 15 Feb

10:11 p.m.

Aaron Denney wrote:

...

On 2008-02-15, Cale Gibbard wrote:

...
-- | The 'select' function takes a list and produces a list of pairs -- consisting of an element of the list together with a list of the -- remaining elements. select :: [a] -> [(a,[a])] select [] = [] select (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- select xs]

A couple tounge-in-cheek suggestions:

"banishments" -- all possible ways a society can divide itself up into one banished person, and the rest of society.

thank you for that name!!! It finally helped me figure out what the function was supposed to do. now, documentation that included examples would be nearly as good... select [1,2,3,4] = [(1,[2,3,4]), (2,[1,3,4]), (3,[1,2,4]), (4,[1,2,3])] --right? or we could call it "howToAbductEachElement" ;-) but that belongs in the Haskell list library even less than "banishments"

Neil Mitchell

16 Feb 16 Feb

11:50 a.m.

...

now, documentation that included examples would be nearly as good... select [1,2,3,4] = [(1,[2,3,4]), (2,[1,3,4]), (3,[1,2,4]), (4,[1,2,3])]

For list examples, its often clearer to use type Char select "1234" = [('1',"234"), ... Thanks Neil

Conor McBride

15 Feb 15 Feb

10:47 a.m.

Hi [statutory math warning: calculus, comonads]

...

On 15 Feb 2008, at 03:15, Cale Gibbard wrote:

...
I know Bulat will be terribly disappointed by my suggestion to make an addition to Data.List ;) but I mentioned the following couple of functions on #haskell and got some positive response that they were things other people ended up writing all the time as well, so I decided I'd propose them here as additions to Data.List and see what kind of reaction I got:

These selectors ring a bell with me. In Nottingham, Morningside, and a few other places, they're known as instances of "Hancock's cursor down operator" as documented here: http://sneezy.cs.nott.ac.uk/containers/blog/?p=20 More Haskellily than that blog article, the general form is (more or less, and therein lies another story...) down :: Deriv f f' => f x -> f (x, f' x) where Deriv is the usual differential operator calculating from a container its type of one-hole contexts. The effect of down is to decorate each x-element of an (f x)-structure with its own context, thus calculating the collection of ways in which one can visit an element, zipper style. The name makes sense in the context where you're navigating some tree structure given by the fixpoint of f, but "select" is probably a less loaded choice. This generalisation may be too far away from what looks like (and is) a handy pair of list functions, so I wouldn't blame anyone from adding them to the library largely as is. I just thought I'd mention the wedge of which they are the thin end. Second things first:

...

-- | The 'selectSplit' function takes a list and produces a list of -- triples consisting of a prefix of the list, the element after it, -- and the remainder of the list. selectSplit :: [a] -> [([a],a,[a])] selectSplit [] = [] selectSplit (x:xs) = ([],x,xs) : [(x:lys,y,rys) | (lys,y,rys) <- selectSplit xs]

This guy corresponds to instance Deriv [] (Prod [] []) where newtype Prod f g x = Prod (f x, g x) That is, a one-hole context in a list is a pair of lists---the elements before the hole, the elements after the hole.

...

-- | The 'select' function takes a list and produces a list of pairs -- consisting of an element of the list together with a list of the -- remaining elements. select :: [a] -> [(a,[a])] select [] = [] select (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- select xs]

This one, as confirmed by its example usage

...

As a side note, the state transformer makes it relatively easy to pick n elements using this: pick n = runStateT . replicateM n . StateT $ select

is somehow seeing lists as *bags*, ie, finite multisets, or lists under arbitrary permutation. newtype Bag x = Bag [x] The power series expansion of Bag is an old pal: Bag x = 1 + x + x^2/2! + x^3/3! + ... representing the choice of an n-tuple, but not distingushing the n! ways in which it can be ordered. Correspondingly instance Deriv Bag Bag a one-hole context in a jumbled collection of elements is a jumbled collection of the remaining elements.

...

If the order of the pairs in the MTL is fixed in the future in order to better reflect the available instances of Functor/etc., at that point we may want to consider flipping the pairs in the result of select to match.

It's tempting to suggest something like data Deriv f f' => Selection f f' x = x :@ f' x and then select :: Deriv f f' => f x -> f (Selection f f' x) but it may be too grotty for normal use. My motivation is to get my hands on the fact that Deriv f f' => Comonad (Selection f f') where the comonad operations make sense like this: counit :: Selection f f' x -> x returns the selected element cojoin :: Selection f f' x -> Selection f f' (Selection f f' x) decorates each element in the selection with the context in which *it* would be selected, showing you your possible sideways moves (including staying put, ie, decorating the selected element with its existing context). So, the original plug :: Selection f f' x -> f x is "up", cojoin is "sideways" and select is "down". Er, um, applications of these things? I have some. You can find one in a draft I must get around to putting words in: http://strictlypositive.org/Holes.pdf http://strictlypositive.org/Holes.lhs It's a bit enigmatic at the moment, but basically Lucas Dixon suggested to me that (as people who do a lot of substitution) we might benefit from making it quicker to jump over large closed chunks of terms: from the root of a term, he wanted, in constant time to get one of: (0) a guarantee the term is closed (1) the position of its only variable (2) the node where paths to at least two variables diverge The derivative allows us to build a representation of terms where the root is connected to other interesting points by "tubes": lists of one-hole contexts through closed stuff. The compression algorithm which makes an ordinary term into a tube network makes key usage of Hancock's "down" operator: we're looking for the "all subterms closed but one" pattern, so we want to search the possible decompositions of each node. Perhaps "differentiate" is the right name? Oops. I seem to have got a bit carried away. Apologies Conor

Henning Thielemann

12:53 p.m.

On Thu, 14 Feb 2008, Cale Gibbard wrote:

...

I know Bulat will be terribly disappointed by my suggestion to make an addition to Data.List ;) but I mentioned the following couple of functions on #haskell and got some positive response that they were things other people ended up writing all the time as well, so I decided I'd propose them here as additions to Data.List and see what kind of reaction I got:

-- | The 'select' function takes a list and produces a list of pairs -- consisting of an element of the list together with a list of the -- remaining elements. select :: [a] -> [(a,[a])] select [] = [] select (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- select xs]

-- | The 'selectSplit' function takes a list and produces a list of -- triples consisting of a prefix of the list, the element after it, -- and the remainder of the list. selectSplit :: [a] -> [([a],a,[a])] selectSplit [] = [] selectSplit (x:xs) = ([],x,xs) : [(x:lys,y,rys) | (lys,y,rys) <- selectSplit xs]

I use these functions regularly, too. select c = init $ zipWith (\xs (y:ys) -> (y, xs++ys)) (List.inits c) (List.tails c)

6354

Age (days ago)

6357

Last active (days ago)

List overview

Download

13 comments

8 participants

participants (8)

Aaron Denney
Cale Gibbard
Chris Kuklewicz
Conor McBride
Henning Thielemann
Isaac Dupree
Johannes Waldmann
Neil Mitchell