
On Wednesday 29 September 2010 19:10:22, Ben Franksen wrote:
Note the last line mentions only '}'. I would rather like to see
expecting "}" or digit
since the parser could very well accept another digit here.
parsec2 did that, I don't know whether that change is intentional or accidental.
(1) What is the reason for this behaviour? (2) Is there another combinator that behaves as I would like? (3) Otherwise, how do I write one myself?
I just saw that Christian Maeder answered a similar question recently. I
tried his suggestion of using manyTill and bingo:
{-# LANGUAGE NoMonomorphismRestriction #-} import Control.Applicative ((*>),(<*)) import Text.Parsec block p = char '{' *> p <* char '}' parser = block (manyTill digit (char '}')) main = parseTest parser "{123a}"
gives
parse error at (line 1, column 5): unexpected "a" expecting "}" or digit
So far so good. I wonder whether this parser is as efficient as the original one.
manyTill p end = scan where scan = do{ end; return [] } <|> do{ x <- p; xs <- scan; return (x:xs) } I'm not sure, but I suspect it's less efficient. Perhaps manyTill' p end = scan [] where scan acc = do { end; return (reverse acc) } <|> do { x <- p; scan (x:acc) } is more efficient (depends on Parsec's bind which is more efficient), you could test.
Also, this style is less modular, as I have to mention the terminator in two places.
That's not the main problem. `manyTill' consumes the ending token, so block (manyTill whatever (char '}')) needs two '}' to succeed. You would need block (manyTill digit (lookAhead (char '}')) to replicate the behaviour of block (many digit).
Is there a non-greedy variant of 'many' so that modularity gets restored and efficiency is not lost?
Cheers Ben