Re: Proposal for Data.List.splitBy

18 Jan 2009

      Duncan Coutts wrote:
...
On Sun, 2009-01-18 at 12:02 +0100, Marcus D. Gabriel wrote:
...
Brent Yorgey wrote:
...
...
P2. There should be no information loss, that is, keep the delimiters,
keep the separators, keep the parts of the original list xs that satisfy
a predicate p, do not lose information about the beginning and the end
of the list relative to the first and last elements of the list
respectively. The user of the function decides what to discard.
P3. A split list should be unsplittable so as to recover the original
list xs. (I made up the word unsplittable.) (P2 implies P3, but let us
state this anyway.)
I'm not sure I agree with this.
Thanks for stating this.  Dropping P3 would change my
thinking about this topic, that is, if we drop P3, then
I would prefer that no splitter functions are added to
Data.List and that it is left as is.
...
The problem is that much (most?) of
the time, people looking for a split function want to discard
delimiters; for example, if you have a string like "foo;bar;baz" and
you want to split it into ["foo","bar","baz"].
I agree with this comment when thinking about strings and what
I would do most of the time and from a pragmatic point of view.
Indeed, the existing Data.List.words is certainly lossy and deliberately
so. It's also useful and widely used.
On the other hand it is a widely held view that Data.List.lines should
not be lossy, i.e. that Data.List.unlines . Data.List.lines should be the
identity. In the current implementation of unlines . lines this is not the
case because of the way lines handles a trailing newline.
Duncan
An argument for not placing any fundamental splitter functions
in Data.List that are lossy if I ever read one.
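
To make the lossless idea concrete, here is a minimal sketch of a
splitter that keeps the delimiters and leaves the discarding to the
caller. The names splitKeep, unsplit and dropDelims are hypothetical
and are not part of Data.List or of any proposal in this thread; the
Either-based result type is just one possible encoding.

```haskell
import Data.Either (rights)

-- Hypothetical splitter (not in Data.List): nothing is discarded.
-- Left d   is an element that satisfied the predicate (a delimiter).
-- Right ys is a possibly empty run of elements that did not.
splitKeep :: (a -> Bool) -> [a] -> [Either a [a]]
splitKeep p xs = case break p xs of
  (chunk, [])       -> [Right chunk]
  (chunk, d : rest) -> Right chunk : Left d : splitKeep p rest

-- Since splitKeep loses nothing, concatenating the pieces
-- in order recovers the original list.
unsplit :: [Either a [a]] -> [a]
unsplit = concatMap (either (: []) id)

-- The caller explicitly chooses to lose the delimiters.
dropDelims :: [Either a [a]] -> [[a]]
dropDelims = rights
```

With these definitions, dropDelims (splitKeep (== ';') "foo;bar;baz")
gives ["foo","bar","baz"], while unsplit (splitKeep p xs) gives back xs
for any p and xs, including lists that begin or end with a delimiter.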

The user of these functions should explicitly choose to lose
information.  Then the documentation in the Haskell 98 report
might have stated instead something like

  unlines . lines == id iff xs ends with '\n'

which would at least be up front.
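
As a rough check of that statement, the following sketch tests the
round trip for a single string. The helper names are made up for
illustration. One detail the one-line law above glosses over: the
empty string also survives the round trip, since lines "" is [] and
unlines [] is "".

```haskell
-- Hypothetical helper: does this string survive unlines . lines?
roundTrips :: String -> Bool
roundTrips xs = unlines (lines xs) == xs

-- Condition under which the round trip is expected to succeed:
-- the string is empty or its last character is a newline.
endsWithNewlineOrEmpty :: String -> Bool
endsWithNewlineOrEmpty xs = null xs || last xs == '\n'
```

For example, roundTrips "a\nb\n" is True, whereas roundTrips "a\nb" is
False because unlines appends the newline that the input lacked.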

Cheers,
- Marcus