
This problem of dynamically transforming grammars and building parsers out of them is addressed in:

@inproceedings{1411296,
  author    = {Viera, Marcos and Swierstra, S. Doaitse and Lempsink, Eelco},
  title     = {Haskell, do you read me?: constructing and composing efficient top-down parsers at runtime},
  booktitle = {Haskell '08: Proceedings of the first ACM SIGPLAN symposium on Haskell},
  year      = {2008},
  isbn      = {978-1-60558-064-7},
  pages     = {63--74},
  location  = {Victoria, BC, Canada},
  doi       = {http://doi.acm.org/10.1145/1411286.1411296},
  publisher = {ACM},
  address   = {New York, NY, USA},
}

The code can be found on Hackage under the name ChristmasTree. The left-factorisation is explained in a paper we presented at the last LDTA, which will appear in ENTCS. Since we have signed a copyright form I do not think I can attach it here, but if you send me a mail, I can definitely send you the paper.

Doaitse

On 11 Oct 2009, at 21:54, Ben Franksen wrote:
Ben Franksen wrote:
Next thing I'll try is to transform such a grammar into an actual parser...
Which I also managed to get working. However, this exposed yet another problem I am not sure how to solve.
The problem manifests itself with non-left-factored rules like
Number ::= Digit Number | Digit
Translating such a grammar directly into a Parsec parser leads to parse errors, because Parsec's choice operator is predictive: if a production fails after having consumed input, the whole choice fails, even if an alternative production would have succeeded:
*Main> P.parseTest (parseGrammar myGrm) "2"
parse error at (line 1, column 2):
unexpected end of input
expecting Number
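For what it's worth, the failure can be reproduced directly in Parsec, without going through my Grammar type (a minimal hand-written sketch, names made up):

import Text.Parsec
import Text.Parsec.String (Parser)

-- Direct translation of: Number ::= Digit Number | Digit
number :: Parser ()
number = (digit >> number) <|> (digit >> return ())

-- parseTest number "2" gives:
--   parse error at (line 1, column 2):
--   unexpected end of input
--   expecting digit
main :: IO ()
main = parseTest number "2"

The first alternative consumes the '2' and then fails on end of input; because input has been consumed, the second alternative is never tried.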
Of course, one solution is to apply Parsec's try combinator to all choices in a rule. But this rather defeats the purpose of using a (by default) predictive parser in the first place, which is to avoid unnecessary backtracking.
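For comparison, a hand-left-factored version of the rule, Number ::= Digit (Number | epsilon), parses predictively without any try (again a sketch with made-up names):

import Text.Parsec
import Text.Parsec.String (Parser)

-- Left-factored by hand: Number ::= Digit (Number | epsilon)
number :: Parser ()
number = digit >> (number <|> return ())

-- parseTest number "2" now succeeds: when the recursive call fails, it
-- has consumed no input, so the epsilon alternative is still tried.
main :: IO ()
main = parseTest number "2"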
So, a better solution is to left-factor the grammar before translating to Parsec. Since we have a data representation of the grammar that we can readily analyse and transform, this should be possible given some suitable algorithm. But how is this transformation to be typed?
My first naive attempt was to define (recap: nt :: * -> * is the type of nonterminals, t :: * is the type of terminals a.k.a. tokens, and a is the result type):
leftFactor :: Grammar nt t a -> Grammar nt t a
Of course, this is wrong: left-factoring is expected to introduce new nonterminals, so on the right-hand side of the type we should not have the same 'nt' as on the left. Instead we should have some other type that is "'nt' extended with new constructors". Moreover, we cannot statically know how many new nonterminals are added, so we cannot simply apply a type function to nt. Is this solvable at all in Haskell, or do I need proper dependent types to express this?
I have very vague ideas that revolve around setting up some recursive type function that on each level adds one constructor, define a common interface with a (multiparam) type class and then use existential quantification in the result type to hide the resulting type of nonterminals.
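To make that idea a bit more concrete, here is a rough sketch of what I have in mind (the Grammar type is only a stub standing in for the real representation, and all names are invented):

{-# LANGUAGE GADTs, ExistentialQuantification, KindSignatures #-}

-- Stub for the real grammar representation, where nt :: * -> *
-- indexes nonterminals by their result type.
data Grammar (nt :: * -> *) t a = Grammar

-- 'Ext nt b' is 'nt' extended with one fresh nonterminal whose
-- result type is 'b'.
data Ext nt b a where
  Old :: nt a -> Ext nt b a   -- the original nonterminals
  New :: Ext nt b b           -- the freshly introduced nonterminal

-- Since we cannot know statically how many nonterminals get added,
-- hide the final nonterminal type existentially in the result:
data SomeGrammar t a = forall nt. SomeGrammar (Grammar nt t a)

leftFactor :: Grammar nt t a -> SomeGrammar t a
leftFactor g = SomeGrammar g   -- the actual transformation is omitted

Each factoring step would wrap the nonterminal type in another layer of Ext; the existential hides how many layers were added.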
The next question is: Even if this turns out to be possible, isn't it overkill? Maybe it is better to use an infinite type for the nonterminals in the first place and let the grammar be a partial function? OTOH, the formulation of the grammar as a function that pattern matches on the nonterminals is very elegant.
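By an "infinite type" I mean something like the following (again just a sketch):

{-# LANGUAGE GADTs #-}

-- An a priori infinite supply of nonterminals: left-factoring never
-- needs to change the nonterminal type, because fresh names always
-- exist.  The price: 'Fresh' says nothing about its result type, and
-- the grammar becomes a partial function on nonterminals.
data NT a where
  Number :: NT Int
  Digit  :: NT Int
  Fresh  :: Int -> NT a

With such a type, a pattern match on Fresh can no longer tell us anything about the result type, which is exactly the information that makes the current formulation elegant.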
Cheers
Ben