
Marco Vassena wrote:
The bigger picture is that I am trying to figure out what are the core constructs needed to define a parser, therefore I want to have a rather abstract interface. In my set of core constructs there are: <$> : (a -> b) -> f a -> f b <*> : f (a -> b) -> f a -> f b <|> : f a -> f a -> f a -- (symmetric choice) pure : a -> f a fail : f a pToken : f Char
Is it possible to define a parser that applies the longest matching rule using these constructs only?
No, these are not sufficient. Longest-matching rule (aka maximal munch) is defined as applying a parser for as long as it succeeds. Therefore, we need a way to determine if a parser succeeded: we need what is called a (monadic) reflection. In simpler words, we need an operation like the soft-cut in Prolog. It can be defined in terms of a primitive called `msplit', described in the LogicT paper: http://okmij.org/ftp/Computation/monads.html#LogicT The following article talks about maximal munch in more detail, in terms of logic programming: http://okmij.org/ftp/kakuritu/logic-programming.html#many