
Stephane Bortzmeyer writes:

> On Tue, Sep 05, 2006 at 04:17:41PM +0200, Stephane Bortzmeyer
> wrote a message of 25 lines which said:
>
> > I'm trying to use Parsec for a language which has identifiers
> > where the '-' character is allowed only inside identifiers, not
> > at the start or the end.
I'm not really familiar with Parsec (I wrote my own limited backtrack parser years ago, and haven't quite got round to updating my brain), and while (judging by threads like this one) it seems to be harder to use than one would hope, this particular problem doesn't look as hard to me as all that.
> [My grammar was underspecified, I also want to disallow two
> consecutive dashes.]
>
> [...]
>
> Here is my final version (rewritten in my style, errors are mine
> and not Udo's), thanks again:
>
> inner_minus = do
>     char '-'
>     lookAhead alphaNum
>     return '-'
>
> identifier = do
>     start <- letter
>     rest <- many (alphaNum <|> try inner_minus)
>     return (start:rest)
>   <?> "identifier"
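For anyone who wants to try the quoted version, here is a minimal self-contained sketch of it. It assumes the Parsec 3 module names (`Text.Parsec`, `Text.Parsec.String`) rather than whatever was imported in the original message. The names `innerMinus` and `parseIdent`, and the trailing `eof`, are additions for illustration and are not part of the quoted code.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A dash that must be followed by a letter or digit. The dash is
-- consumed; the following character is only peeked at by lookAhead.
innerMinus :: Parser Char
innerMinus = do
    _ <- char '-'
    _ <- lookAhead alphaNum
    return '-'

-- A letter, then any mix of alphanumerics and "inner" dashes.
-- 'try' undoes the consumed dash when the lookAhead fails, so 'many'
-- simply stops in front of a trailing or doubled dash.
identifier :: Parser String
identifier = (do
    start <- letter
    rest  <- many (alphaNum <|> try innerMinus)
    return (start : rest)) <?> "identifier"

-- Hypothetical helper: require the identifier to span the whole input,
-- so that leftover dashes show up as a parse error.
parseIdent :: String -> Either ParseError String
parseIdent = parse (identifier <* eof) "(input)"
```

With this sketch, `parseIdent "foo-bar"` succeeds, while `"foo-"`, `"foo--bar"` and `"-foo"` are rejected. Without the `eof`, `identifier` on `"foo-"` would succeed with `"foo"` and leave the dash unconsumed, which may or may not be what the surrounding grammar wants.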
I'd have thought something like the following was the 'obvious' way
of doing it:

   chThen c r = do a <- c; as <- r; return (a:as)

   identifier = do start <- letter `chThen` many alphaNum
                   rest <- many (char '-' `chThen` many1 alphaNum)
                   return (start++concat rest)
                <?> "identifier"

ie, your identifiers are an initial sequence of non-minuses
beginning with a letter, and then an optional sequence of
non-minuses preceded by a minus. Or have I lost the plot somewhere?

Aside: Is there already a name for `chThen`? ie (liftM2 (:)); I had a
feeling we were avoiding liftM & friends for some reason.

-- 
Jón Fairbairn                              Jon.Fairbairn@cl.cam.ac.uk
http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html (updated 2006-07-14)