
Marcus D. Gabriel wrote:
If I were to write
organizeBy :: ([a] -> Bool) -> [a] -> [([a], [a])]
I quite like your idea, but I think the input predicate "([a] -> Bool)" is too ambitious, although it would nicely unify Brent's data Delimiter a where DelimEltPred :: (a -> Bool) -> Delimiter a DelimSublist :: Eq a => [a] -> Delimiter a With the predicate ([a] -> Bool) you have to check all inits of your input to detect a delimiter and only if all inits are no delimiter you know the head element is not part of a delimiter and repeat checking the inits of the tail. I think this is too inefficient in general. In order to keep things simple I would vote for a split function that only takes the simple predicate (a -> Bool) and leaves the more complicated splitting to _additional_ functions (following http://haskell.org/haskellwiki/Simple_to_complex) For instance replace :: [a] -> a -> [a] -> [a] could replace a sublist (first argument) with a single element (second argument) This would help in simple cases, but leaves it to the user to choose a suitable delimiter element. But a general splitting on sublists could be implemented via splitBy isNothing (replace (map Just sl) Nothing (map Just l)) (For a fixed sublist sl, "Nothing" is enough to represent the delimiter) For a simple predicate "(a -> Bool)" it remains to discuss the output of splitBy. You've proposed "[([a], [a])]", but for the simpler case "[(a, [a])]" or "[([a], a)]" may do, but in order to capture the full information of lists before and after delimiters, something like "([a], [(a, [a])])" is needed. Since a tuple "(a, [a])" can be viewed as non-empty list of type "[a]", "([a], [(a, [a])])" collapses to a non-empty list of type "[[a]]" with a _tail_ of non-empty element lists. Therefore I propose the following splitBy as a "work horse". splitBy :: (a -> Bool) -> [a] -> [[a]] The implementation is simply using "break". The basic property is: concat (splitBy p l) == l (This is almost in the spirit of Data.List, since groupBy or words also produces non-empty element lists.) Getting rid of a final delimiter or all of them can be implemented in O(n) (see below). And finally: wordsBy p = filter (not . null) . dropDelims . splitBy p linesBy p = dropDelims . dropFinalDelim . splitBy p with: words s == wordsBy isSpace s lines s == linesBy (== '\n') s Surely, one might prefer a more type-safe version wrt non-empty lists, but the main idea is to use function composition rather than composing a (non-haskell98) splitter input data type as Brent does in his split package. Cheers Christian import Data.Char (isSpace) splitBy :: (a -> Bool) -> [a] -> [[a]] splitBy p l = let (fr, rt) = break p l in case rt of [] -> [fr] d : tl -> let hd : tll = splitBy p tl in fr : (d : hd) : tll dropFinalDelim :: [[a]] -> [[a]] dropFinalDelim ll@(l : ls) = if null ls then if null l then [] else ll else if null (tail (last ls)) then init ll else ll dropDelims :: [[a]] -> [[a]] dropDelims ll = case ll of [] -> [] l : ls -> l : map tail ls wordsBy :: (a -> Bool) -> [a] -> [[a]] wordsBy p = filter (not . null) . dropDelims . splitBy p linesBy :: (a -> Bool) -> [a] -> [[a]] linesBy p = dropDelims . dropFinalDelim . splitBy p prop_wordsBy :: String -> Bool prop_wordsBy s = words s == wordsBy isSpace s prop_linesBy :: String -> Bool prop_linesBy s = lines s == linesBy (== '\n') s replace :: Eq a => [a] -> a -> [a] -> [a] replace sl@(_ : _) r l = case l of [] -> l x : xs -> case stripPrefix sl l of Nothing -> x : replace sl r xs Just rt -> r : replace sl r rt subListSplit :: Eq a => [a] -> [a] -> [[Maybe a]] subListSplit sl@(_ : _) l = splitBy isNothing (replace (map Just sl) Nothing (map Just l)) unintercalate :: Eq a => [a] -> [a] -> [[a]] unintercalate sl@(_ : _) = map (map fromJust) . dropDelims . subListSplit sl