Re: [GHC] #971: Add intercalate and split to Data.List

= Split = Bulat Ziganshin suggested also adding a function split which is a right inverse of intercalate. Donald Steward seemed to agree with this decision. No one else raised their voice about this so it seems uncontroversial.
I also agree to add a split function. The problem was to agree which variant should be called split (wrt. to treatment of multiple and final separators as well as to "Eq" context or predicate argument). http://article.gmane.org/gmane.comp.lang.haskell.libraries/4862 http://article.gmane.org/gmane.comp.lang.haskell.libraries/4869

On 10/30/06, Christian Maeder
= Split = Bulat Ziganshin suggested also adding a function split which is a right inverse of intercalate. Donald Steward seemed to agree with this decision. No one else raised their voice about this so it seems uncontroversial.
I also agree to add a split function. The problem was to agree which variant should be called split (wrt. to treatment of multiple and final separators as well as to "Eq" context or predicate argument).
http://article.gmane.org/gmane.comp.lang.haskell.libraries/4862 http://article.gmane.org/gmane.comp.lang.haskell.libraries/4869
Right. I had totally forgotten about that discussion. Was there any consensus about the names? Josef

"Josef Svenningsson"
On 10/30/06, Christian Maeder
wrote: I also agree to add a split function. The problem was to agree which variant should be called split (wrt. to treatment of multiple and final separators as well as to "Eq" context or predicate argument).
http://article.gmane.org/gmane.comp.lang.haskell.libraries/4862 http://article.gmane.org/gmane.comp.lang.haskell.libraries/4869
Right. I had totally forgotten about that discussion. Was there any consensus about the names?
I'm not sure there was yet consensus about the functions -- it seems I tried to have the bikeshed demolished, rebuilt as a pagoda and painted red: http://article.gmane.org/gmane.comp.lang.haskell.cafe/13807 while there's some merit in ending discussion and deciding on /something/, I reckon it's worth trying to think about laws and such for a while yet. It strikes me that the discussion suggests that there are several distinct design cases: does a separator at the end/do two consecutive separators result in empty elements? Do the separators form part of the output? To sloganise for a moment, libraries should provide components, not finished products. So what we should aim for is a collection of functions that facilitate the construction of all the design cases. One of the components /might/ be:
spanMb p [] = Nothing spanMb p l = Just $ span p l
As to nomenclature, since we already have span, I'd suggest spans p l should return a list of all the contiguous segments of l with members satisfying p (and discard the rest). Here's a definition:
spans p = unfoldr (fmap (id >< dropWhile (not . p)) . spanMb p) (f >< g) (a,b) = (f a, g b)
We can write “groupBy (\a b -> p a && p b)” for another of the design cases, though I think some discussion of groupBy belongs in here: why does it require an equivalence relation? Shouldn't we at least have a version that works for any binary predicate? Here's an off the top of my head ugly version:
runsBy = unfoldr . splitRun
where splitRun p [] = Nothing splitRun p [x] = Just ([x],[]) splitRun p (a:rest@(b:_)) | p a b = fmap ((a:) >< id) (splitRun p rest) | otherwise = Just ([a], rest)
(maybe splitRun is generally useful?) -- J�n Fairbairn Jon.Fairbairn@cl.cam.ac.uk

Jón Fairbairn
spans p = unfoldr (fmap (id >< dropWhile (not . p)) . spanMb p) (f >< g) (a,b) = (f a, g b)
We can write “groupBy (\a b -> p a && p b)” for another of the design cases, though I think some discussion of groupBy belongs in here: why does it require an equivalence relation? Shouldn't we at least have a version that works for any binary predicate?
I just noticed that Ross Paterson said much the same thing on Haskell Café this morning. And that I stopped before I'd indicated why runsBy is useful here...
runsBy = unfoldr . splitRun [...]
... so we get runsBy (const (/=',')) ",foo,,bar,baz" ==> [",foo",",",",bar",",baz"] and
spans p = map (dropWhile (not . p)) . runsBy (const p)
-- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

OK. This suggests that there is a lot more to think about and discuss
when it comes to split. I'm going to remove split from my patch. It
should have its own proposal and I'll leave it for someone else to get
that ball rolling.
Thanks for the information.
Josef
On 30 Oct 2006 18:25:43 +0000, Jón Fairbairn
"Josef Svenningsson"
writes: On 10/30/06, Christian Maeder
wrote: I also agree to add a split function. The problem was to agree which variant should be called split (wrt. to treatment of multiple and final separators as well as to "Eq" context or predicate argument).
http://article.gmane.org/gmane.comp.lang.haskell.libraries/4862 http://article.gmane.org/gmane.comp.lang.haskell.libraries/4869
Right. I had totally forgotten about that discussion. Was there any consensus about the names?
I'm not sure there was yet consensus about the functions -- it seems I tried to have the bikeshed demolished, rebuilt as a pagoda and painted red:
http://article.gmane.org/gmane.comp.lang.haskell.cafe/13807
while there's some merit in ending discussion and deciding on /something/, I reckon it's worth trying to think about laws and such for a while yet.
It strikes me that the discussion suggests that there are several distinct design cases: does a separator at the end/do two consecutive separators result in empty elements? Do the separators form part of the output?
To sloganise for a moment, libraries should provide components, not finished products. So what we should aim for is a collection of functions that facilitate the construction of all the design cases.
One of the components /might/ be:
spanMb p [] = Nothing spanMb p l = Just $ span p l
As to nomenclature, since we already have span, I'd suggest spans p l should return a list of all the contiguous segments of l with members satisfying p (and discard the rest).
Here's a definition:
spans p = unfoldr (fmap (id >< dropWhile (not . p)) . spanMb p) (f >< g) (a,b) = (f a, g b)
We can write "groupBy (\a b -> p a && p b)" for another of the design cases, though I think some discussion of groupBy belongs in here: why does it require an equivalence relation? Shouldn't we at least have a version that works for any binary predicate?
Here's an off the top of my head ugly version:
runsBy = unfoldr . splitRun
where splitRun p [] = Nothing splitRun p [x] = Just ([x],[]) splitRun p (a:rest@(b:_)) | p a b = fmap ((a:) >< id) (splitRun p rest) | otherwise = Just ([a], rest)
(maybe splitRun is generally useful?)
-- J�n Fairbairn Jon.Fairbairn@cl.cam.ac.uk
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Mon, 30 Oct 2006, Jón Fairbairn
We can write “groupBy (\a b -> p a && p b)” [...]
You mean “groupBy ((&&) `on` p)”, right? ;) Do you consider on to be above or below the Fairbairn threshold, by the way? I think it is above the threshold since * we get rid of two lambdas, * we get rid of the duplication of p, * and, most importantly, it is easier at a glance to tell what the function does (assuming one knows about on). -- /NAD

On 2006-11-02 at 15:28+0100 Nils Anders Danielsson wrote:
On Mon, 30 Oct 2006, Jón Fairbairn
wrote: We can write “groupBy (\a b -> p a && p b)” [...]
You mean “groupBy ((&&) `on` p)”, right? ;)
Something very much like that!
Do you consider on to be above or below the Fairbairn threshold, by the way?
Above. It's hard to articulate, but the reasons you give for `on` seem about right, so long as on is defined at a general enough type (and it's hard to beat forall a b c. (a -> a -> b) -> (c -> a) -> c -> t -> b for generality!) To explore this a bit further with a different example, I would have been in favour of introducing (f >< g) (a, b) = (f a, f b) but not mapFst f (a, b) = (f a, b) because the latter is simply (f >< id), which has a nice "parallel" look to it; it seems easier to me to see that (f >< id) . (id >< g) = (f >< g) than that mapFst f . mapSnd g = (f >< g) Now, though, there's &&& in Control.Arrow, and that subsumes
<, so I'm nolonger enthusiastic about (><) (though I might define it locally in circumstances where I think my readers would be unfamiliar with &&&).
One of the considerations is that once you put a name into a library, you'll have the devil's own job getting it out again if you decide it was a mistake. Jón -- Jón Fairbairn Jon.Fairbairn at cl.cam.ac.uk

On Thu, 02 Nov 2006, Nils Anders Danielsson
Do you consider on to be above or below the Fairbairn threshold, by the way? I think it is above the threshold since * we get rid of two lambdas, * we get rid of the duplication of p, * and, most importantly, it is easier at a glance to tell what the function does (assuming one knows about on).
And now Ulf is in the process of proving that flip on is a functor from C to C^op for any CCC C, with flip on defined as follows: flip_on_A(B) = (A^B)^B flip_on_A(f) = curry (curry (eval ∘ id × f) ∘ eval ∘ id × f) Ulf might follow up with a proof. :) -- /NAD

On Thu, Nov 02, 2006 at 05:35:22PM +0100, Nils Anders Danielsson wrote:
And now Ulf is in the process of proving that flip on is a functor from C to C^op for any CCC C, with flip on defined as follows:
flip_on_A(B) = (A^B)^B flip_on_A(f) = curry (curry (eval ??? id × f) ??? eval ??? id × f)
It's the continuation-passing version of (^2) : C -> C.
participants (6)
-
Christian Maeder
-
Jon Fairbairn
-
Josef Svenningsson
-
Jón Fairbairn
-
Nils Anders Danielsson
-
Ross Paterson