
On Sun, Jan 24, 2016 at 10:56:31PM +0100, Simon Jakobi wrote:
if I understood your specification correctly, there would be multiple ways to parse the string "AAA":
- 3 'x' elements ("A", "A", "A") - 2 'x' elements ("AA", "A") - 2 'x' elements again (first one shorter) ("A", "AA") - 1 'x' element ("AAA")
There would be even more ways because 'y', too, can represent one or more 'A's.
[...]
Is this due to attoparsec not being able to "backtrack" (not sure if this is the right term)? Is backtracking something that parsers generally are incapable of?
Ah, indeed you are right. attoparsec, parsec and friends handle failure with `try`. From Attoparsec documentation: Attempt a parse, but do not consume any input if the parse fails. One way to deal with cases like yours is for every parser to compute a "list of successes". Crude example: import Text.Parsec import Text.Parsec.String foo :: Parser [String] foo = anyChar >>= \h -> (foo <|> e) >>= \t -> return ([""] ++ map (h:) t) where e = return [""] -- λ> parseTest foo "bar" -- ["","b","ba","bar"] Then you can chain those with `try`/`choice` and compute your result(s) (I guess using the list monad to handle the mechanism could do). Ambiguous grammars are an age old problem, and some searching [1] leads me to believe there are already viable solution in Haskell. [1] http://stackoverflow.com/questions/13279087/parser-library-that-can-handle-a...