Associative prefix operators in Parsec

Consider a simple language of logical expressions:

import Control.Applicative
import Text.Parsec hiding ((<|>), many)
import Text.Parsec.String
import Text.Parsec.Expr

data Expr = Truth
          | Falsity
          | And Expr Expr
          | Or Expr Expr
          | Not Expr
            deriving (Show, Eq)

I define a simple expression parser using Parsec:

expr :: Parser Expr
expr = buildExpressionParser table (lexeme term)
       <?> "expression"

term :: Parser Expr
term = between (lexeme (char '(')) (lexeme (char ')')) expr
       <|> bool
       <?> "simple expression"

bool :: Parser Expr
bool = lexeme (string "true" *> pure Truth)
       <|> lexeme (string "false" *> pure Falsity)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

table = [ [prefix "!" Not ]
        , [binary "&" And AssocLeft ]
        , [binary "|" Or AssocLeft ]
        ]

binary name fun assoc = Infix (do{ lexeme (string name); return fun }) assoc
prefix name fun       = Prefix (do{ lexeme (string name); return fun })

Now this doesn't work:
test1 = parseTest expr "!!true"
But this does:
test2 = parseTest expr "!(!true)"
I have studied the code for buildExpressionParser, and I know why this happens (prefix operators are treated as nonassociative), but it seems like one would often want right-associative prefix operators (so test1 would work). Is there a common workaround or solution for this problem? I assume the nonassociativity in Parsec is by design and not a bug.

-- 
\  Troels
/\ Henriksen

(In reply to Troels Henriksen's message of 07.03.2012 13:08.)

The simplest solution is to parse the prefix operators yourself and not put them into the table. (Doing the infix operators "&" and "|" by hand is no big deal either, and possibly easier than figuring out the capabilities of buildExpressionParser.)

Cheers C.
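
A minimal sketch of that suggestion, assuming the Expr type and lexeme helper from the original post: the infix operators stay in the table, while "!" is handled by a small recursive parser. The name factor and the explicit type signatures on table and binary are illustrative additions, not part of the original code.

```haskell
import Data.Functor.Identity (Identity)
import Text.Parsec
import Text.Parsec.Expr
import Text.Parsec.String (Parser)

data Expr = Truth | Falsity | And Expr Expr | Or Expr Expr | Not Expr
  deriving (Show, Eq)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Only the infix operators are handed to buildExpressionParser.
expr :: Parser Expr
expr = buildExpressionParser table factor <?> "expression"

-- "!" is parsed by hand (hypothetical helper). The recursive call is what
-- allows any number of stacked negations, such as "!!true".
factor :: Parser Expr
factor = (Not <$> (lexeme (char '!') *> factor)) <|> term

term :: Parser Expr
term = between (lexeme (char '(')) (lexeme (char ')')) expr
   <|> bool
   <?> "simple expression"

bool :: Parser Expr
bool = lexeme (Truth <$ string "true")
   <|> lexeme (Falsity <$ string "false")

-- The prefix row is gone; "&" still binds tighter than "|".
table :: [[Operator String () Identity Expr]]
table = [ [binary "&" And AssocLeft]
        , [binary "|" Or AssocLeft]
        ]

binary :: String -> (Expr -> Expr -> Expr) -> Assoc
       -> Operator String () Identity Expr
binary name fun assoc = Infix (fun <$ lexeme (string name)) assoc
```

Because factor is the term parser passed to buildExpressionParser, "!" still binds more tightly than both infix operators, as it did in the original table.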

Christian Maeder writes:

> The simplest solution is to parse the prefixes yourself and do not put
> it into the table.
>
> (Doing the infixes "&" and "|" by hand is no big deal, too, and possibly
> easier than figuring out the capabilities of buildExpressionParser)

Is there another solution? My post was a simplified example to showcase the problem; in general I would prefer to use a function to build the expression parser. I could just write my own that does not have this problem, and in fact I already have; I just wanted to know whether Parsec could be wrangled into shape.

-- 
\  Troels
/\ Henriksen

On 08.03.2012 17:16, Troels Henriksen wrote:
> I just wanted to know whether Parsec could be wrangled into shape.

Yes, it certainly could do better. The code for prefix and postfix currently looks like:

termP = do{ pre  <- prefixP
          ; x    <- term
          ; post <- postfixP
          ; return (post (pre x))
          }

This supports (only) one prefix or postfix (or both), where the prefix binds more strongly than the postfix (although they have equal precedence).

Problem 1: "- - 5" is not supported.

Another problem is prefix or postfix operators that bind more weakly than infix operators, like infix "^" and prefix "-".

Problem 2: "1 ^ -2" is rejected, although no other parse is possible.

(The same would apply to a weakly binding postfix operator following the left argument: "4+ ^ 5". This would even need some lookahead to find out whether "+" is not an infix and "^" a prefix operator.)

(Haskell has these problems, too.)

Maybe the special case of repeated prefixes could be solved by putting the prefix entry into the table twice for the current buildExpressionParser, but handling symbols that may be shared between prefix, postfix, and infix operators seems quite difficult, although only an ambiguity needs to be reported.

Cheers C.
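
Two ways of staying with buildExpressionParser can be sketched as follows, reusing the Expr type from the original post. The first table lists the "!" entry twice, as suggested above, which admits at most two stacked prefixes. The second uses a single entry that consumes a whole run of "!" and composes the results; this is a workaround not taken from the thread. The names prefixChain, tableTwice, tableChain, exprTwice, exprChain and mkExpr are hypothetical, and the explicit type signatures are additions for illustration.

```haskell
import Data.Functor.Identity (Identity)
import Text.Parsec
import Text.Parsec.Expr
import Text.Parsec.String (Parser)

data Expr = Truth | Falsity | And Expr Expr | Or Expr Expr | Not Expr
  deriving (Show, Eq)

type Op = Operator String () Identity Expr

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

binary :: String -> (Expr -> Expr -> Expr) -> Assoc -> Op
binary name fun assoc = Infix (fun <$ lexeme (string name)) assoc

-- The single-prefix entry from the original post.
prefix :: String -> (Expr -> Expr) -> Op
prefix name fun = Prefix (fun <$ lexeme (string name))

-- Hypothetical helper: parse one or more occurrences of the operator and
-- compose them, so the table entry covers the whole run of prefixes.
prefixChain :: String -> (Expr -> Expr) -> Op
prefixChain name fun = Prefix (foldr (.) id <$> many1 op)
  where op = fun <$ lexeme (string name)

-- Variant 1: the prefix entry appears in two rows, one prefix per row.
tableTwice :: [[Op]]
tableTwice = [ [prefix "!" Not]
             , [prefix "!" Not]
             , [binary "&" And AssocLeft]
             , [binary "|" Or AssocLeft]
             ]

-- Variant 2: one row whose prefix parser accepts arbitrarily many "!".
tableChain :: [[Op]]
tableChain = [ [prefixChain "!" Not]
             , [binary "&" And AssocLeft]
             , [binary "|" Or AssocLeft]
             ]

-- Build an expression parser from a table; parentheses recurse into it.
mkExpr :: [[Op]] -> Parser Expr
mkExpr tbl = self
  where
    self = buildExpressionParser tbl term <?> "expression"
    term = between (lexeme (char '(')) (lexeme (char ')')) self
       <|> lexeme (Truth <$ string "true")
       <|> lexeme (Falsity <$ string "false")
       <?> "simple expression"

exprTwice, exprChain :: Parser Expr
exprTwice = mkExpr tableTwice
exprChain = mkExpr tableChain
```

With these definitions, exprTwice accepts "!!true" but still rejects "!!!true", whereas exprChain accepts both. Neither variant addresses Problem 2, where a prefix operator binds more weakly than the infix operator to its left.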