
Duncan Coutts wrote:
On Sun, 2009-01-04 at 23:25 +0100, Henning Thielemann wrote:
On Sun, 4 Jan 2009, Marcus D. Gabriel wrote:
We seem to have not added any splitters such as splitBy or split to Data.List despite the discussions. Here is yet another perspective with the hope that it will make it.
I assumed that the discussion was resolved by http://hackage.haskell.org/cgi-bin/hackage-scripts/package/split
We should say that the split package is a way of continuing the discussion. The hope of some around here is that we can find a consensus on a small set of common split functions and add them to the Data.List module. I'm not sure if that goal is achievable but I think it's worth trying.
Duncan
From the e-mail threads, I would say the above are the absolute minimum
Let me see if I can explain what I was trying to bring to the discussion. Forget the code examples that I gave for now. You should realize that I am someone who was reading the discussion threads and felt that the group had the answer or answers, that there was no fighting going on as someone wrote somewhere, and that the key properties that the splitter functions should posses is contained in the e-mail threads but not grouped together. In other words, it was a nice discussion to read. So, let me formulate this from my point of view of how this comes together as I have read it. 1/ It's worth having some splitter functions in the Data.List module, so this is the goal. Otherwise, one leaves Data.List alone and the discussion should be whether or not one should simply use a Data.List.Split package to handle a set of useful idioms somewhere between Data.List and Text.ParserCombinators.Parsec in complexity. Possibly one would flush out Data.List.Split and even write it with a pedantic theme for new comers to Haskell for example. 2/ Some properties, in a loose sense of the term, are needed to constrain the choices and from the e-mail threads, I would say that this sums up as the following. P1. The number of splitter functions should be few in number, otherwise, make a package. P2. There should be no information loss, that is, keep the delimiters, keep the separators, keep the parts of the original list xs that satisfy a predicate p, do not lose information about the beginning and the end of the list relative to the first and last elements of the list respectively. The user of the function decides what to discard. P3. A split list should be unsplittable so as to recover the original list xs. (I made up the word unsplittable.) (P2 implies P3, but let us state this anyway.) properties that the solution should have. If there is no agreement here, this would be interesting and should be discussed. I would go on to write that the e-mail threads and the properties above strongly imply in my mind the follow property: P4. A split list xs' should have the same relative order as the original list xs. (I am sure everyone finds this obvious, and in the e-mail threads I do not remember an exception, so let us state this clearly.) The next properties are not exactly what I read from the e-mail threads and are really more my personal view, but they are not in contradiction with what I read either. P5. The splitter functions should fit within the spirit of the Data.List module and even the original Haskell 98 List module in terms of type signature and complexity of implementation. P6. There should be at least one splitter function that takes a predicate p, a "by" word such as groupBy or nubBy. If there is agreement about P1 to P6, then this is a way to frame the discussion. If not, then I would say that this needs to be determined first, otherwise the goal stated above will be difficult to reach. The function intercalate was probably easy to add because it grasps a nice idiom, but with splitter functions there is the problem of adding more and more words and moving closer and closer to the power and complexity of Parsec for example. This next property is very much my point of view: P7. The splitter functions should take off where break and scan left off. I add this one because when I was first learning Haskell, I was using the List module quite a bit. I needed a splitBy word, and I thought that break and scan were for what I was looking, but they were not. So I wrote
splitBy :: (a -> Bool) -> [a] -> [([a], [a])] splitBy p xs = let (hd1, tl1) = break p xs (hd2, tl2) = span p tl1 in (hd1, hd2):(if null tl2 then [] else splitBy p tl2)
which I eventually came to not like. To sum up, if we can agree that a set of splitter properties should fit within the context of Properties P1 to P7, then maybe forward movement can be made. If we cannot agree on these properties or another set, then splitter function idioms are too idiomatic, and a larger set of words are needed, so the investment should be in Data.List.Split. Cheers, - Marcus -- Marcus D. Gabriel, Ph.D. Saint Louis, FRANCE http://www.marcus.gabriel.name mailto:marcus@gabriel.name Tel: +33.3.89.69.05.06 Portable: +33.6.34.56.07.75