Re: [GHC] #971: Add intercalate and split to Data.List

30 Oct 2006

      "Josef Svenningsson"  writes:
...
On 10/30/06, Christian Maeder  wrote:
...
I also agree to add a split function. The problem was to agree which
variant should be called split (with respect to the treatment of multiple and final
separators as well as to "Eq" context or predicate argument).
http://article.gmane.org/gmane.comp.lang.haskell.libraries/4862
http://article.gmane.org/gmane.comp.lang.haskell.libraries/4869
Right. I had totally forgotten about that discussion. Was there any
consensus about the names?
I'm not sure there was yet consensus about the functions --
it seems I tried to have the bikeshed demolished, rebuilt
as a pagoda and painted red:

http://article.gmane.org/gmane.comp.lang.haskell.cafe/13807

While there's some merit in ending discussion and deciding
on /something/, I reckon it's worth trying to think about
laws and such for a while yet.

It strikes me that the discussion suggests that there are
several distinct design cases: does a separator at the end
result in an empty element? Do two consecutive separators?
Do the separators form part of the output?
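To make these design cases concrete, here is a small sketch using only the standard Prelude functions words and lines, which already make different choices. The names collapsing and preserving are hypothetical, chosen only for this illustration.

```haskell
-- words treats runs of whitespace as a single separator and
-- ignores a trailing separator: no empty elements are produced.
collapsing :: [String]
collapsing = words "a  b "
-- ["a","b"]

-- lines yields an empty element for two consecutive newlines,
-- but a single final newline does not add a trailing empty element.
preserving :: [String]
preserving = lines "a\n\nb\n"
-- ["a","","b"]
```

Neither function keeps the separators in its output, so the third question is a separate axis again.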

To sloganise for a moment, libraries should provide
components, not finished products.  So what we should aim
for is a collection of functions that facilitate the
construction of all the design cases.

One of the components /might/ be:
...
spanMb p [] = Nothing
spanMb p l = Just $ span p l
As to nomenclature, since we already have span, I'd suggest
spans p l should return a list of all the contiguous
segments of l with members satisfying p (and discard the
rest).

Here's a definition:
...
spans p = unfoldr (fmap (id >< dropWhile (not . p)) . spanMb p)
(f >< g) (a,b) = (f a, g b)
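For anyone wanting to try this, the pieces above can be put together into something loadable roughly as follows. The type signatures are my assumption (they are not given above), and this is a sketch rather than a proposed final form.

```haskell
import Data.List (unfoldr)

-- Apply one function to each component of a pair.
-- (Signature assumed; only the defining equation appears above.)
(><) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f >< g) (a, b) = (f a, g b)

-- Like span, but Nothing on empty input, so that unfoldr can stop.
spanMb :: (a -> Bool) -> [a] -> Maybe ([a], [a])
spanMb _ [] = Nothing
spanMb p l  = Just (span p l)

-- Contiguous segments whose members satisfy p; the rest is discarded.
spans :: (a -> Bool) -> [a] -> [[a]]
spans p = unfoldr (fmap (id >< dropWhile (not . p)) . spanMb p)
```

With this definition, spans (/= ',') "a,b,,c" gives ["a","b","c"], so consecutive separators and a trailing separator produce no empty elements. One quirk of the definition as written: if the input starts with an element that fails p, the first segment is empty, so spans (/= ',') ",a" gives ["","a"].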
We can write “groupBy (\a b -> p a && p b)” for another of
the design cases, though I think some discussion of groupBy
belongs in here: why does it require an equivalence
relation? Shouldn't we at least have a version that works
for any binary predicate?
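A small sketch of both points, assuming the groupBy currently in Data.List, which compares each element against the first element of the group rather than against its neighbour. The names chunksKeeping, example and firstVsNeighbour are hypothetical.

```haskell
import Data.Char (isAlpha)
import Data.List (groupBy)

-- Maximal runs of elements satisfying p stay together; every
-- element failing p ends up in a singleton list of its own,
-- so the separators are kept in the output.
chunksKeeping :: (a -> Bool) -> [a] -> [[a]]
chunksKeeping p = groupBy (\a b -> p a && p b)

example :: [String]
example = chunksKeeping isAlpha "ab  cd"
-- ["ab"," "," ","cd"]

-- With a predicate that is not an equivalence relation the result
-- can surprise: every later element is compared with 1, the head
-- of the group, so the drop from 3 to 2 is not noticed.
firstVsNeighbour :: [[Int]]
firstVsNeighbour = groupBy (<=) [1, 2, 3, 2, 5]
-- [[1,2,3,2,5]]
```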

Here's an off the top of my head ugly version:
...
runsBy = unfoldr . splitRun
...
    where splitRun p [] = Nothing
          splitRun p [x] = Just ([x],[])
          splitRun p (a:rest@(b:_))
              | p a b = fmap ((a:) >< id) (splitRun p rest)
              | otherwise = Just ([a], rest)
(maybe splitRun is generally useful?)
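Written out so that it loads on its own, with splitRun lifted to the top level, the idea looks roughly like this. The type signatures are assumed, and this is only a sketch of the off-the-cuff version above.

```haskell
import Data.List (unfoldr)

-- Apply one function to each component of a pair (signature assumed).
(><) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f >< g) (a, b) = (f a, g b)

-- Split off the longest prefix in which every adjacent pair
-- satisfies p; Nothing on empty input.
splitRun :: (a -> a -> Bool) -> [a] -> Maybe ([a], [a])
splitRun _ []  = Nothing
splitRun _ [x] = Just ([x], [])
splitRun p (a : rest@(b : _))
  | p a b     = fmap ((a :) >< id) (splitRun p rest)
  | otherwise = Just ([a], rest)

-- Break a list into runs, comparing neighbours rather than
-- comparing against the first element of each run.
runsBy :: (a -> a -> Bool) -> [a] -> [[a]]
runsBy = unfoldr . splitRun
```

For example, runsBy (<=) [1,2,3,2,5] gives [[1,2,3],[2,5]], the ascending runs, because each element is compared with its immediate predecessor.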

-- 
Jón Fairbairn                                 Jon.Fairbairn@cl.cam.ac.uk
