Proposal: Add filterM or filterA to Data.Sequence

This can be given exactly the same implementation as the one for lists: filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a) filterM p = foldr go (pure empty) where go x r = f <$> p x <*> r where f flg ys = if flg then x <| ys else ys Bikeshed all you like over the name.

+1 On 28.12.2014 06:22, David Feuer wrote:
This can be given exactly the same implementation as the one for lists:
filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a) filterM p = foldr go (pure empty) where go x r = f <$> p x <*> r where f flg ys = if flg then x <| ys else ys
Bikeshed all you like over the name.
-- Andreas Abel <>< Du bist der geliebte Mensch. Department of Computer Science and Engineering Chalmers and Gothenburg University, Sweden andreas.abel@gu.se http://www2.tcs.ifi.lmu.de/~abel/

+1 to just generalizing filterM in Data.Sequence
On Sun, Dec 28, 2014 at 12:22 AM, David Feuer
This can be given exactly the same implementation as the one for lists:
filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a) filterM p = foldr go (pure empty) where go x r = f <$> p x <*> r where f flg ys = if flg then x <| ys else ys
Bikeshed all you like over the name. _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

We should check that `filterM . toList` isn't as fast. That was the case
for bytestring and we rejected adding filterM there for that reason.
On Mon, Dec 29, 2014 at 6:13 AM, Edward Kmett
+1 to just generalizing filterM in Data.Sequence
On Sun, Dec 28, 2014 at 12:22 AM, David Feuer
wrote: This can be given exactly the same implementation as the one for lists:
filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a) filterM p = foldr go (pure empty) where go x r = f <$> p x <*> r where f flg ys = if flg then x <| ys else ys
Bikeshed all you like over the name. _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

I’m not sure I understand. Do you mean that |filterM| shouldn’t exist for data structures for which |filterM . toList| is as fast? If this is the case, I wish that it was at least specified in the documentation that e.g. “this function doesn’t exist because the naive composition is guaranteed to be optimised away / the faster version is actually impossible to write / etc.”. I find myself wondering way too often whether some piece of code I’ve written is a potential candidate for optimisation, and knowing in advance that the naive version is the “recommended” approach lets me not waste my time on benchmarking code which was already benchmarked by others. On 12/30/2014 08:05 PM, Johan Tibell wrote:
We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.
On Mon, Dec 29, 2014 at 6:13 AM, Edward Kmett
mailto:ekmett@gmail.com> wrote: +1 to just generalizing filterM in Data.Sequence
On Sun, Dec 28, 2014 at 12:22 AM, David Feuer
mailto:david.feuer@gmail.com> wrote: This can be given exactly the same implementation as the one for lists:
filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a) filterM p = foldr go (pure empty) where go x r = f <0.3927809241601645gt; p x <*> r where f flg ys = if flg then x <| ys else ys
Bikeshed all you like over the name. _______________________________________________ Libraries mailing list Libraries@haskell.org mailto:Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org mailto:Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

The larger point is this: should we have applicative functions for every
higher order function, for every data type? For example, one could make an
argument for having Map.alterA in addition to Map.alter, because without
alterA one would have to use a combination of lookup and insert, which
could be slower, to achieve the same effect.
The obvious downside is the explosion of functions in the API, which is
even worse due to their already being lazy and strict versions of most
higher-order function (i.e. now we'd have to have 2*2 versions of every
function). This seems like a failure in composability and abstraction.
Until we've figured out a way to deal with this general issue that doesn't
involve duplicating tons of code and swelling the API, I've been pushing
back on changes like this.
In this particular case, having to go via lists might hurt performance a
bit, but might still be better than the other alternative.
As one of the maintainers of the containers package these are the kind of
issues I have to consider*.
* This is by the way one of the reasons that it's important for packages to
have dedicated maintainers, so make sure proposed changes are considered in
the larger context of the health and evolution of the package as a whole.
On Tue, Dec 30, 2014 at 12:52 PM, Artyom
I’m not sure I understand. Do you mean that |filterM| shouldn’t exist for data structures for which |filterM . toList| is as fast?
If this is the case, I wish that it was at least specified in the documentation that e.g. “this function doesn’t exist because the naive composition is guaranteed to be optimised away / the faster version is actually impossible to write / etc.”. I find myself wondering way too often whether some piece of code I’ve written is a potential candidate for optimisation, and knowing in advance that the naive version is the “recommended” approach lets me not waste my time on benchmarking code which was already benchmarked by others.
On 12/30/2014 08:05 PM, Johan Tibell wrote:
We should check that `filterM . toList` isn't as fast. That was the case
for bytestring and we rejected adding filterM there for that reason.
On Mon, Dec 29, 2014 at 6:13 AM, Edward Kmett
> wrote: +1 to just generalizing filterM in Data.Sequence
On Sun, Dec 28, 2014 at 12:22 AM, David Feuer
mailto:david.feuer@gmail.com> wrote: This can be given exactly the same implementation as the one for lists:
filterM :: (Applicative f) => (a -> f Bool) -> Seq a -> f (Seq a) filterM p = foldr go (pure empty) where go x r = f <0.3927809241601645gt; p x <*> r where f flg ys = if flg then x <| ys else ys
Bikeshed all you like over the name. _______________________________________________ Libraries mailing list Libraries@haskell.org mailto:Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org mailto:Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Tue, Dec 30, 2014 at 1:05 PM, Johan Tibell
The obvious downside is the explosion of functions in the API, which is even worse due to their already being lazy and strict versions of most higher-order function (i.e. now we'd have to have 2*2 versions of every function). This seems like a failure in composability and abstraction. Until we've figured out a way to deal with this general issue that doesn't involve duplicating tons of code and swelling the API, I've been pushing back on changes like this.
1. I think it's better to duplicate code in a library than to make its users duplicate the code themselves. 2. One option would be to make Data.Sequence.Private, exposing all the types and various internals as well as defining all the necessary instances, and then to add Data.Sequence.This, Data.Sequence.That, etc.

On Tue, Dec 30, 2014 at 1:21 PM, David Feuer
On Tue, Dec 30, 2014 at 1:05 PM, Johan Tibell
wrote: The obvious downside is the explosion of functions in the API, which is even worse due to their already being lazy and strict versions of most higher-order function (i.e. now we'd have to have 2*2 versions of every function). This seems like a failure in composability and abstraction. Until we've figured out a way to deal with this general issue that doesn't involve duplicating tons of code and swelling the API, I've been pushing back on changes like this.
1. I think it's better to duplicate code in a library than to make its users duplicate the code themselves.
I tend to disagree, especially if we're not talking about duplicating filterMSeq f = fmap fromList . filterM f . toList but the whole implementation of Seq.filter (which would need to if you want an optimal implementation of filterMSeq).
2. One option would be to make Data.Sequence.Private, exposing all the types and various internals as well as defining all the necessary instances, and then to add Data.Sequence.This, Data.Sequence.That, etc.
I don't want to expose the internals (we don't do that for other containers types) because we wouldn't, in practice, be able to change the implementation if we wanted to.

On Wed, Dec 31, 2014 at 1:05 AM, Johan Tibell
As one of the maintainers of the containers package these are the kind of issues I have to consider*.
I share these concerns too with Johan, especially about code duplication and package maintenance. Kudos to him for stewarding it so successfully all along for I and countless others appreciate his work. And I think it's unfair to push work around like this. What happens is open source fatigue and you find your best contributors leaving. Traditionally, if you found some missing function, you would roll your own package and upload it for the benefit of everyone. The proper resolution of this proposal is Data.MissingSeq.filterM. -- Kim-Ee

On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell
We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.
Let me expand that definition to what it really looks like: filterMSeq f = fmap fromList . filterM f . toList Yes, you can do this. Yes, it's probably even pretty efficient. But it's a bit painful to have to expand things out like this just to switch a bit of code from lists to sequences.

Hi all,
-----Original message----- From: David Feuer
Sent: 30 Dec 2014, 13:15 On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell
wrote: We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.
Let me expand that definition to what it really looks like:
filterMSeq f = fmap fromList . filterM f . toList
Yes, you can do this. Yes, it's probably even pretty efficient. But it's a bit painful to have to expand things out like this just to switch a bit of code from lists to sequences.
the problem with this is that there are really many functions that "would be useful", but we do not really want to add them all. A general rule I have been using is "can this be implemented straightforwardly using existing methods without hurting performance much"? So I would like to see the numbers, please. Also, filterM is not yet settled in base, so we should wait some time until it is. Cheers, Milan PS: It is a shame we cannot reuse Foldable and/or Traversable for filtering. Maybe we should add a new Filterable class?

There won't be any such numbers in this case, because access to the
details of the sequence structure does not seem to help implement this
function. The same is true of the following, all of which already
exist:
replicate, replicateM, iterateN -- all are implemented using replicateA
unfoldr, unfoldl --implemented using <| and |>
null -- can be written just as well using null xs = case viewl xs of
{EmptyL -> True; _ -> False}
scanl, scanr, scanl1, scanr1 -- defined ultimately in terms of traverse
spanl, spanr, breakl, breakr, takeWhile, dropWhile -- defined in terms
of splitAt
findIndicesL, findIndicesR -- defined in terms of foldlWithIndex, foldrWithIndex
take, drop -- previously defined using splitAt; currently defined
using splitAt', but those two functions are likely to merge
filter -- uses foldl
elemIndexL, elemIndicesL, elemIndexR, elemIndicesR, findIndexL,
findIndexR -- all defined using findIndicesL and findIndicesR
update -- uses adjust
zip, zipWith3, zip3, zipWith4, zip4 -- All defined from zipWith
I think almost all of these functions are good to have around
(although the elemIndex family seems a bit silly). Adding a few more
to round things out seems reasonable to me as well.
On Tue, Dec 30, 2014 at 5:41 PM, Milan Straka
Hi all,
-----Original message----- From: David Feuer
Sent: 30 Dec 2014, 13:15 On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell
wrote: We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.
Let me expand that definition to what it really looks like:
filterMSeq f = fmap fromList . filterM f . toList
Yes, you can do this. Yes, it's probably even pretty efficient. But it's a bit painful to have to expand things out like this just to switch a bit of code from lists to sequences.
the problem with this is that there are really many functions that "would be useful", but we do not really want to add them all.
A general rule I have been using is "can this be implemented straightforwardly using existing methods without hurting performance much"? So I would like to see the numbers, please.
Also, filterM is not yet settled in base, so we should wait some time until it is.
Cheers, Milan
PS: It is a shame we cannot reuse Foldable and/or Traversable for filtering. Maybe we should add a new Filterable class?

Hi all,
-----Original message----- From: David Feuer
Sent: 30 Dec 2014, 18:03 There won't be any such numbers in this case, because access to the details of the sequence structure does not seem to help implement this function.
sorry for not making clear -- I was interested in difference between the mentioned | filterMSeq f = fmap fromList . filterM f . toList and your foldr-based implementation. I can imagine that one can use filterMSeq in place instead of filterM, so that is why I am interested between filterMSeq and the specialized implementation (which is just a plain foldr + <|).
The same is true of the following, all of which already exist:
replicate, replicateM, iterateN -- all are implemented using replicateA unfoldr, unfoldl --implemented using <| and |> null -- can be written just as well using null xs = case viewl xs of {EmptyL -> True; _ -> False} scanl, scanr, scanl1, scanr1 -- defined ultimately in terms of traverse spanl, spanr, breakl, breakr, takeWhile, dropWhile -- defined in terms of splitAt findIndicesL, findIndicesR -- defined in terms of foldlWithIndex, foldrWithIndex take, drop -- previously defined using splitAt; currently defined using splitAt', but those two functions are likely to merge filter -- uses foldl elemIndexL, elemIndicesL, elemIndexR, elemIndicesR, findIndexL, findIndexR -- all defined using findIndicesL and findIndicesR update -- uses adjust zip, zipWith3, zip3, zipWith4, zip4 -- All defined from zipWith
I think almost all of these functions are good to have around (although the elemIndex family seems a bit silly). Adding a few more to round things out seems reasonable to me as well.
true, that is why it is a general rule (to not add functions which can be implemented using others), not an absolute rule. The question is whether filterM is as "basic" method used frequently enough (for example, no one obviously questions zip). Note that it is not obvious what type does filterM has (unlike e.g. zip), now that we have Applicative and also filterM is changing in base. From this point of view, it makes sense to use fmap fromList . filterM f . toList because then we do not have inconsistent filterM-s. Cheers, Milan
On Tue, Dec 30, 2014 at 5:41 PM, Milan Straka
wrote: Hi all,
-----Original message----- From: David Feuer
Sent: 30 Dec 2014, 13:15 On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell
wrote: We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.
Let me expand that definition to what it really looks like:
filterMSeq f = fmap fromList . filterM f . toList
Yes, you can do this. Yes, it's probably even pretty efficient. But it's a bit painful to have to expand things out like this just to switch a bit of code from lists to sequences.
the problem with this is that there are really many functions that "would be useful", but we do not really want to add them all.
A general rule I have been using is "can this be implemented straightforwardly using existing methods without hurting performance much"? So I would like to see the numbers, please.
Also, filterM is not yet settled in base, so we should wait some time until it is.
Cheers, Milan
PS: It is a shame we cannot reuse Foldable and/or Traversable for filtering. Maybe we should add a new Filterable class?

On Tue, Dec 30, 2014 at 6:22 PM, Milan Straka
Hi all,
-----Original message----- From: David Feuer
Sent: 30 Dec 2014, 18:03 There won't be any such numbers in this case, because access to the details of the sequence structure does not seem to help implement this function.
sorry for not making clear -- I was interested in difference between the mentioned | filterMSeq f = fmap fromList . filterM f . toList and your foldr-based implementation. I can imagine that one can use filterMSeq in place instead of filterM, so that is why I am interested between filterMSeq and the specialized implementation (which is just a plain foldr + <|).
Oh, I really don't know which is better (and that's true also for unfoldr and unfoldl--is the current implementation better or worse than using fromList?).
Note that it is not obvious what type does filterM has (unlike e.g. zip), now that we have Applicative and also filterM is changing in base. From this point of view, it makes sense to use fmap fromList . filterM f . toList because then we do not have inconsistent filterM-s.
I have no problem with having filterM in Data.Sequence be more general than filterM in Control.Monad for old versions. I think both unfold(r/l) and filterM actually make a potential point in my favor—if there are multiple implementations that seem reasonable, having whichever turns out to be fastest in the library will save users time benchmarking them. David

On 2014-12-30 23:41, Milan Straka wrote:
Hi all,
-----Original message----- From: David Feuer
Sent: 30 Dec 2014, 13:15 On Tue, Dec 30, 2014 at 12:05 PM, Johan Tibell
wrote: We should check that `filterM . toList` isn't as fast. That was the case for bytestring and we rejected adding filterM there for that reason.
Let me expand that definition to what it really looks like:
filterMSeq f = fmap fromList . filterM f . toList
Yes, you can do this. Yes, it's probably even pretty efficient. But it's a bit painful to have to expand things out like this just to switch a bit of code from lists to sequences.
the problem with this is that there are really many functions that "would be useful", but we do not really want to add them all.
A general rule I have been using is "can this be implemented straightforwardly using existing methods without hurting performance much"? So I would like to see the numbers, please.
Well, there's an opposing point called "usability". If there are already "obvious" functions to use in other data structure modules then they should *probably* also be represented in the other data structure modules. (I don't claim that this is necessarily a "trump", but I think it bears consideration.) Regards,
participants (8)
-
Andreas Abel
-
Artyom
-
Bardur Arantsson
-
David Feuer
-
Edward Kmett
-
Johan Tibell
-
Kim-Ee Yeoh
-
Milan Straka