User-defined operators and compound expressions using Happy

Hi there folks, once again, I've got a question related to Happy (I've got version 1.13 at the moment). Maybe, it's even more a question on formal languages, but well... How can I write a grammar that can cope with user-defined operators (of different precedences/associativities) and compound expression like function calls, if-then-else- and case-statements and the like. I tried to write it down straight forwardly, but failed terribly (alas, I didn't keep it, so I can't show you - if someone of you is versed in this issue, I can try to explain the language's constructs). Thank you so much, Frank-Andre Riess

On Mon, 2004-11-22 at 17:48 +0100, Frank-Andre Riess wrote:
Hi there folks,
once again, I've got a question related to Happy (I've got version 1.13 at the moment). Maybe, it's even more a question on formal languages, but well... How can I write a grammar that can cope with user-defined operators (of different precedences/associativities
One standard solution is to parse user defined operators as if they were all one precedence/associativity and then re-associate them later once you know what the precedence and associativity of each operator is. That way the parser grammar does not need to be adjusted on the fly. So you wold parse 1+2*3 as [LiteralInt 1, Op '+', LiteralInt 2, Op '*', LiteralInt 3] and then later turn that into BinOp '+' (LiteralInt 1) (BinOp '*' (LiteralInt 2) (LiteralInt 3)) using your mapping of operators to precedence/associativity. Duncan

On Mon, 2004-11-22 at 17:48 +0100, Frank-Andre Riess wrote:
Hi there folks,
once again, I've got a question related to Happy (I've got version 1.13 at the moment). Maybe, it's even more a question on formal languages, but well... How can I write a grammar that can cope with user-defined operators (of different precedences/associativities
One standard solution is to parse user defined operators as if they were all one precedence/associativity and then re-associate them later once you know what the precedence and associativity of each operator is.
That way the parser grammar does not need to be adjusted on the fly.
So you wold parse 1+2*3 as [LiteralInt 1, Op '+', LiteralInt 2, Op '*', LiteralInt 3] and then later turn that into BinOp '+' (LiteralInt 1) (BinOp '*' (LiteralInt 2) (LiteralInt 3)) using your mapping of operators to precedence/associativity.
Duncan
Thank you very much. What I did by now is more or less along the lines of your suggestion, but it doesn't work to my satisfication yet. Might be due to overloaded tokens, though (e.g. '|' is both a set union operator and the seperator in case statements). Maybe we'll need to redesign the syntax.

On 2004 nov 22, at 17:48, Frank-Andre Riess wrote:
Hi there folks,
once again, I've got a question related to Happy (I've got version 1.13 at the moment). Maybe, it's even more a question on formal languages, but well... How can I write a grammar that can cope with user-defined operators (of different precedences/associativities) and compound expression like function calls, if-then-else- and case-statements and the like. I tried to write it down straight forwardly, but failed terribly (alas, I didn't keep it, so I can't show you - if someone of you is versed in this issue, I can try to explain the language's constructs).
One way of doing this using combinator based parsing (where you can generate parsers dynamically) is to read the fixity declarations, and to use the result of this to build the precedence parser. This idea has been sketched in: S. D. Swierstra and P. R. Azero Alcocer. Fast, Error Correcting Parser Combinators: a Short Tutorial. In J. Pavelka, G. Tel, and M. Bartosek, editors, SOFSEM'99 Theory and Practice of Informatics, 26th Seminar on Current Trends in Theory and Practice of Informatics, volume 1725 of LNCS, pages 111--129, November 1999. If you do not have access to this I will be happy to send it to you, Doaitse Swierstra == some text (created from the pdf) from this paper ================ As an example of what can be done we will now show how to construct parsers dynamically by writing a parser for an expression language with infix operators. An example input is: (L+R*)a+b*(c+d) and the code we want to generate is: abcd+*+ which is the reversed Polish notation of the input expressions. The text (L+R*) indicates that + is left (L) associative and has lower priority than *, which is right (R) associative. In this way an unlimited number of operators may be specified, with relative priorities depending on their position in this list. We start by defining a function that parses a single character identifier and returns as its result that identifier in the form of a string: pVar = (\c -> [c]) <$> pAnyOf ['a'..'z'] . The next step is to define a function that, given the name of an operator, recognizes that operator and as a result returns a function that will concatenate the two arguments of that operator and postfix it with the name of the operator, thus building the reversed Polish notation: pOp name = (\ left right -> left++right++[name]) <$ pSym name Note that, by using the operator <$ we indicate that we are not interested in the recognized operator; we already know what this is since it was passed as a parameter. Next we de ne the function compile. For this we introduce a new combinator <@>, that takes as its left hand side operand a parser constructor f and as its right hand side operand a parser g. The results v of parsing a pre x of the input with g, are used in calling f; these calls, in their turn, result in new parsers which are applied to the rest of the input: (f <@> g) input = [ f v rest | (v, rest) <- g input ] Since our input consists of two parts, the priority declarations and the expression itself , we postulate that the function compile reads: compile = pRoot <@> pPrios First we focus on the function pRoot, that should take as argument the result of recognizing the priorities. Here we will assume that this result is a function that, given how to parse an operand, parses an expression constructed out of operands and the de ned operators: pRoot prios = let pExpr = prios (pVar <|> pParens pExpr) in pExpr There is a difference between an operator that occurs in the declaration part of the input and one in the expression part: the former may be any operator, whereas the latter can only be an operator that has been declared before. For the priority declaration part we thus introduce a new parser that recognizes any operator, and returns a parser that compiles the just recognized operator using the function pOp defined before: pAnyOp = pOp <$> pAnyOf "+*/-^" just some possible operators Now suppose we have recognized a left and a right associative operator resulting in operator compilers pLeft and pRight. Out of these we can construct a function that, given the operand parser, parses infix expressions containing pLeft and pRight occurrences: pLR factor = (pChainl pLeft . pChainr pRight) factor. Generalizing this pattern to an unlimited number of operators we now deduce the definition: pPrios = pParens $ pFoldr ((.), id) (( pChainl <$ pSym 'L' <|> pChainr <$ pSym 'R' ) <*> pAnyOp ) Let us now compare once more this approach with the situation where we would have used a special parser generator. In the combinator approach we can freely introduce all kinds of abbreviations by defining new combinators in terms of existing ones; furthermore we may de ne higher order combinators that take arguments and return values that may be parsers. This is a property we get for free here, and is absent in most tools, where the syntax of the input is xed and at most some form of macro processing is available as an abstraction mechanism. Another important consequence from embedding our parser construction in an existing language is that type checking and error reporting can directly be done at the program level, and not at the level of some generated program. ======================================================= Fast, Error Correcting Parser Combinators: A Short Tutorial 5 module ExtendedCombinators where import BasicCombinators infixl 4 <$>, <$, <*, *>, <**>, <??> infixl 2 `opt` 5 pAnyOf :: Eq s => [s] -> Parser s s opt :: Eq s => Parser s a -> a -> Parser s a (<$>) :: Eq s => (b -> a) -> Parser s b -> Parser s a (<$ ) :: Eq s => a -> Parser s b -> Parser s a 10 (<* ) :: Eq s => Parser s a -> Parser s b -> Parser s a ( *>) :: Eq s => Parser s a -> Parser s b -> Parser s b (<**>) :: Eq s => Parser s b -> Parser s (b->a) -> Parser s a (<??>) :: Eq s => Parser s b -> Parser s (b->b) -> Parser s b 15 pAnyOf = foldr (<|>) pFail . map pSym p `opt` v = p <|> pSucceed v f <$> p = pSucceed f <*> p f <$ p = const f <$> p p <* q = (\ x _ -> x) <$> p <*> q 20 p *> q = (\ _ x -> x) <$> p <*> q p <**> q = (\ x f -> f x) <$> p <*> q p ?> q = p <**> (q `opt` id) pFoldr alg@(op,e) p 25 = pfm where pfm = (op <$> p <*> pfm) `opt` e pFoldrSep alg@(op,e) sep p = (op <$> p <*> pFoldr alg (sep *> p)) `opt` e pFoldrPrefixed alg@(op,e) c p = pFoldr alg (c *> p) 30 pList p = pFoldr ((:), []) p pListSep s p = pFoldrSep ((:), []) s p pListPrefixed c p = pFoldrPrefixed ((:), []) c p pSome p = (:) <$> p <*> pList p 35 pChainr op x = r where r = x <**> (flip <$> op <*> r `opt` id) pChainl op x = f <$> x <*> pList (flip <$> op <*> x) where f x [] = x f x (func:rest) = f (func x) rest 40 pPacked l r x = l *> x <* r some ad hoc extensions pOParen = pSym '(' 45 pCParen = pSym ')' pParens = pPacked pOParen pCParen Listing 3: ExtendedCombinators
Thank you so much, Frank-Andre Riess
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On 2004 nov 22, at 17:48, Frank-Andre Riess wrote:
Hi there folks,
once again, I've got a question related to Happy (I've got version 1.13 at the moment). Maybe, it's even more a question on formal languages, but well... How can I write a grammar that can cope with user-defined operators (of different precedences/associativities) and compound expression like function calls, if-then-else- and case-statements and the like. I tried to write it down straight forwardly, but failed terribly (alas, I didn't keep it, so I can't show you - if someone of you is versed in this issue, I can try to explain the language's constructs).
One way of doing this using combinator based parsing (where you can generate parsers dynamically) is to read the fixity declarations, and to use the result of this to build the precedence parser. This idea has been sketched in:
S. D. Swierstra and P. R. Azero Alcocer. Fast, Error Correcting Parser Combinators: a Short Tutorial. In J. Pavelka, G. Tel, and M. Bartosek, editors, SOFSEM'99 Theory and Practice of Informatics, 26th Seminar on Current Trends in Theory and Practice of Informatics, volume 1725 of LNCS, pages 111--129, November 1999.
If you do not have access to this I will be happy to send it to you,
Doaitse Swierstra
Thank you very much. I don't know much about combinator based parsing, however (or maybe just therefor) I'm interested in having a closer look at it. I'd be obliged if you sent me your paper.
participants (3)
-
Doaitse Swierstra
-
Duncan Coutts
-
Frank-Andre Riess