
Ketil Malde wrote:
On Sun, 2007-07-01 at 12:45 +0200, apfelmus wrote:
Here's an (admittedly crazy) approach
Why is it so crazy? The orthogonality issues with the different ways of breaking up lists (split/break/span/take/drop), and the multitude of possible predicates (either too complicated, or too specific) has always been an annoyance to me. I thought your solution was quite nice!
One problem is that you have to use drop (first 2) instead of drop 2 now. This can be remedied with some type-class hackery. Another problem is that performance will suffer a bit with the general approach. So, the specialized versions are likely to be kept around anyway. (Type classes can help with specialization, too.) Other than that, the general approach to drop & friends is not so crazy. But I don't like my implementation, so let's build a better one: One problem of the implementation is that I think it doesn't handle nicely the different semantics of dropPrefix compared to drop or dropWhile : whereas the latter don't fail on a premature end of the list, the dropPrefix version should fail. (The question whether it should fail with an error or with Nothing can be delegated by providing different variants of drop). The solution comes automatically when pondering what a Dropper really is: it's a *parser*. In other words, drop & friends are just functions that parse the beginning of a string and return how much has been parsed. Put differently, their feature is to ignore the "AST" resulting from a parse. type Dropper a = Parser a () -- token type a, result type () Here, I don't mean the usual (s -> (a,s)) parsers, but an implementation that fits the stream-like nature of our dropper: either a determinstic data Parser c r = Get (c -> Dropper c r) | Result r | Fail or a non-deterministic parser data Parser c a = Get (c -> Dropper c r) | Result r (Dropper c r) | Fail The latter are, of course, Koen Classen's parallel parsing processes (http://www.cs.chalmers.se/~koen/pubs/jfp04-parser.ps). Now, which ones to choose? With deterministic parsers, we loose the normal behavior of drop and dropWhile to accept lists that are too small. Thus, we choose non-deterministic parsers and implement drop with a "maximum munch" behavior -- drop as much as we can parse, but not more drop :: Dropper a -> [a] -> [a] drop p xs = case drop' p xs of Nothing -> error "drop: parse failed" Just xs -> xs where drop' Fail _ = Nothing drop' (Result _ p) xs = drop' p xs `mplus` Just xs drop' (Get f) (x:xs) = drop' (f x) xs drop' (Get _) [] = Nothing Here, the second equation of drop tries to drop more but jumps back via Maybe's `mplus` if that fails. With the usual Monad and MonadPlus instances for Parser c a, we can now write -- take while the condition is satisfied while :: (a -> Bool) -> Dropper a while = many' . satisfy where many' p = return () `mplus` p >> many' -- accept the first n characters or less first :: Int -> Dropper a first 0 = return () first n = return () `mplus` (get >> first (n-1)) -- parse a given String prefix :: Eq a => [a] -> Dropper a prefix [] = eaten prefix (x:xs) = get >>= \c -> if c == x then prefix xs else mzero By returning successes early, while and first accept an unexpected end of input. An alternative version of first that complains when not enough characters are available to drop would be exactly :: Int -> Dropper a exactly 0 = return () exactly n = get >> exactly (n-1) or exactly n = sequence_ (replicate n get) Regards, apfelmus