Parsec expressions with alphaNum operators

Hi, I'm using buildExpressionParser, and I'd like to use alphanumeric operator characters. I get an (unexpected "a") error though. With a test string like "-a" if "a" is used in any of the "reservedOpNames". I'm aiming for the Fortran operators like ".and.". The listing below may be helpful. It's taken from the Haskell wiki's "Parsing expressions and statements" article (minus the statement part).I've added an ":a:" operator. The article uses "~" as a unary operator (I'm heading for +/-). It can be tested with: $ parseTest exprparser "~a" -- code begins module Main where import Control.Monad(liftM) import Text.ParserCombinators.Parsec import Text.ParserCombinators.Parsec.Expr import Text.ParserCombinators.Parsec.Token import Text.ParserCombinators.Parsec.Language data Expr = Var String | Con Bool | Uno Unop Expr | Duo Duop Expr Expr deriving Show data Unop = Not deriving Show data Duop = And | Iff deriving Show data Stmt = Nop | String := Expr | If Expr Stmt Stmt | While Expr Stmt | Seq [Stmt] deriving Show def = emptyDef{ commentStart = "{-" , commentEnd = "-}" , identStart = letter , identLetter = alphaNum , opStart = oneOf "~&=:" , opLetter = oneOf "~&=:a" , reservedOpNames = ["~", "&", "=", ":=", ":a:"] , reservedNames = ["true", "false", "nop", "if", "then", "else", "fi", "while", "do", "od"] } TokenParser{ parens = m_parens , identifier = m_identifier , reservedOp = m_reservedOp , reserved = m_reserved , semiSep1 = m_semiSep1 , whiteSpace = m_whiteSpace } = makeTokenParser def exprparser :: Parser Expr exprparser = buildExpressionParser table term > "expression" table = [ [Prefix (m_reservedOp "~" >> return (Uno Not))] , [Infix (m_reservedOp "&" >> return (Duo And)) AssocLeft] , [Infix (m_reservedOp "=" >> return (Duo Iff)) AssocLeft] , [Infix (m_reservedOp ":a:" >> return (Duo Iff)) AssocLeft] ] term = m_parens exprparser <|> liftM Var m_identifier <|> (m_reserved "true" >> return (Con True)) <|> (m_reserved "false" >> return (Con False)) play :: String -> IO () play inp = case parse exprparser "" inp of { Left err -> print err ; Right ans -> print ans } -- code ends Cheers, Paul Keir Research Student University of Glasgow Department of Computing Science pkeir@dcs.gla.ac.uk

Hi,
2008/4/7 Paul Keir
I'm using buildExpressionParser, and I'd like to use alphanumeric operator characters. I get an (unexpected "a") error though. With a test string like "-a" if "a" is used in any of the "reservedOpNames". I'm aiming for the Fortran operators like ".and.".
...
reservedOp checks to make sure that whatever it parses isn't immediately followed by another valid operator character (the idea being, I think, that then whoever wrote the string being parsed probably meant for that other longer operator to be one token). This causes your error, since it can't parse the reservedOp "~", and "~a" itself is not anything. You'll notice that, for example, "~ a" parses just fine. I don't seen any perfect solution. If you want to use the token and expression modules in this way, you'll have to redefine your operators so as not to intersect with your valid identifiers. Alternatively, you can roll your own token parsers (though parsec does what it does for good reasons, so you might see some unexpected behavior). Or you could just mandate that operators must be followed by a space. Good luck! --Chris Casinghino

Thanks Chris, When I looked at the Fortran alphaNum operators (.and. .or. etc.) I had hoped that supplying Parsec's "opStart" with a dot would have been hitting the nail on the head. Oh well. I have noticed something interesting though. If I simply omit the "a" from "opLetter", the problem is gone. In fact, leaving "opStart", "opLetter", and "reservedOpNames" all empty works fine too. Perhaps I should aim to leave aside the "Language" part of Parsec altogether. Paul

On Tue, Apr 8, 2008 at 4:46 AM, Paul Keir
I have noticed something interesting though. If I simply omit the "a" from "opLetter", the problem is gone. In fact, leaving "opStart", "opLetter", and "reservedOpNames" all empty works fine too.
That makes sense. opLetter is there to support "user-defined" operators, as many languages allow two distinct classes of identifiers - one which looks like variables and one which looks like symbolic operators. If you don't want to allow user-defined operators which contain "a", so that ":a:", a reserved keyword, is the only valid identifier which contains both "a" and another member of opLetter, you should be OK leaving "a" off of the list of opLetters. Indeed, you shouldn't actually need to specify ":a:" in reservedOpNames either, since "reservedOp" doesn't check that list, and the "operator" parser will already reject ":a:" because it contains a non-opLetter. Though, for sanity you might leave it there anyway. At least, I think so. --Chris
participants (2)
-
Chris Casinghino
-
Paul Keir