On 22 Oct 2009, at 15:56, Robert Atkey wrote:

....

Previously parsed input /can/ determine what the parser will accept in
the future (as pointed out by Peter Ljunglöf in his licentiate thesis).
Consider the following grammar for the context-sensitive language
{aⁿbⁿcⁿ | n ∈ ℕ}:

Yes, sorry, I was sloppy in what I said there. Do you know of a
characterisation of which languages you get by allowing a possibly
infinite number of nonterminals? Is it all context-sensitive languages,
or a subset?

The answer is: all context-sensitive languages. This is a very old insight which has come back in various forms in computer science. Its earliest appearance in CS terms is the affix grammar, in which the infinite number of nonterminals is generated by parameterising nonterminals by trees. Affix grammars were invented by Kees Koster and Lambert Meertens (who applied them to generate music: http://en.wikipedia.org/wiki/index.html?curid=5314967) in the early 1960s. There is a long line of follow-up work on this idea, of which the two best-known versions are the so-called two-level grammars, which were used in the Algol68 report to define the full language, and the attribute grammar formalism first described by Knuth. Key publications/starting points if you want to learn more about these are:

- the Algol68 report: http://burks.brighton.ac.uk/burks/language/other/a68rr/rrtoc.htm

- the Wikipedia article on affix grammars: http://en.wikipedia.org/wiki/Affix_grammar

- a nice book about the basics of two-level grammars is Cleaveland & Uzgalis, "Grammars for Programming Languages", which may be hard to get, but there is hope: http://www.amazon.com/Grammars-Programming-Languages-languages/dp/0444001875

- http://www.agfl.cs.ru.nl/papers/agpl.ps

- http://comjnl.oxfordjournals.org/cgi/content/abstract/32/1/36
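
To make the idea of parameterised nonterminals concrete, here is a minimal sketch (not taken from any of the references above) of a recogniser for {aⁿbⁿcⁿ | n ∈ ℕ}. It assumes the ReadP combinators from base as the parsing library; the names rep, s and accepts are made up for this illustration. The family S(0), S(1), ... plays the role of an infinite set of nonterminals indexed by a counter, in the spirit of an affix:

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, count, eof, readP_to_S, (+++))

-- B(n) and C(n): exactly n copies of a single terminal.
-- (rep is a hypothetical helper, not part of ReadP.)
rep :: Int -> Char -> ReadP ()
rep n c = () <$ count n (char c)

-- One nonterminal per natural number:
--   S(n) ::= 'a' S(n+1) | B(n) C(n)
-- The parameter records how many 'a's have been read so far.
s :: Int -> ReadP ()
s n = (char 'a' *> s (n + 1)) +++ (rep n 'b' *> rep n 'c')

-- Accept a string iff S(0) derives all of it.
accepts :: String -> Bool
accepts = not . null . readP_to_S (s 0 <* eof)
```

With these definitions, accepts "aabbcc" is True and accepts "aabbc" is False. Each individual S(n) is an ordinary context-free rule; the context-sensitivity comes only from having infinitely many of them.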

Doaitse Swierstra

And a general definition for parsing single-digit numbers. This works
for any set of non-terminals, so it is a reusable component that works
for any grammar:

Things become more complicated if the reusable component is defined
using non-terminals which take rules (defined using an arbitrary
non-terminal type) as arguments. Exercise: Define a reusable variant of
the Kleene star, without using grammars of infinite depth.

I see that you have an answer in the paper you linked to above. Another
possible answer is to consider open sets of rules in a grammar:

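-- Needs ExistentialQuantification, RankNTypes and TypeOperators.
-- 'hidden' is existentially quantified, so it is invisible to clients.
data OpenRuleSet inp exp =
  forall hidden. OpenRuleSet (forall a. (exp :+: hidden) a ->
                                  Rule (exp :+: hidden :+: inp) a)

-- Sum of two families of nonterminals. With the default (left)
-- associativity, exp :+: hidden :+: inp is (exp :+: hidden) :+: inp.
data (f :+: g) a = Left2 (f a) | Right2 (g a)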

So OpenRuleSet inp exp exports definitions of the nonterminals in
'exp', imports definitions of the nonterminals in 'inp', and has a
collection of hidden nonterminals.

It is then possible to combine them with a function of type:

combineG :: (inp1 :=> exp1 :+: inp) ->
           (inp2 :=> exp2 :+: inp) ->
           OpenRuleSet inp1 exp1 ->
           OpenRuleSet inp2 exp2 ->
           OpenRuleSet inp (exp1 :+: exp2)

One can then give a reusable Kleene star by stating it as an open rule
set:

star :: forall a nt. Rule nt a -> OpenRuleSet nt (Equal [a])

where Equal is the usual equality GADT.
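
As a rough sketch of how star could be implemented, here is one possible version. Everything beyond OpenRuleSet, :+: and the type of star is an assumption made for this illustration: the Rule type below is a cut-down stand-in for the one used earlier in this thread, and None, mapNT, run, closeG and runStar are hypothetical helpers.

```haskell
{-# LANGUAGE EmptyCase, ExistentialQuantification, GADTs, RankNTypes,
             ScopedTypeVariables, TypeOperators #-}

data (f :+: g) a = Left2 (f a) | Right2 (g a)

-- Assumed stand-in for Rule: results, single characters, references
-- to nonterminals, sequencing and choice.
data Rule nt a where
  Done :: a -> Rule nt a
  Tok  :: Char -> Rule nt Char
  NT   :: nt a -> Rule nt a
  App  :: Rule nt (b -> a) -> Rule nt b -> Rule nt a
  Alt  :: Rule nt a -> Rule nt a -> Rule nt a

data OpenRuleSet inp exp =
  forall hidden. OpenRuleSet (forall a. (exp :+: hidden) a ->
                                  Rule (exp :+: hidden :+: inp) a)

data Equal a b where
  Refl :: Equal a a

-- The empty family: no nonterminals at all.
data None a

-- Rename the nonterminals mentioned in a rule.
mapNT :: (forall x. f x -> g x) -> Rule f a -> Rule g a
mapNT _ (Done a)  = Done a
mapNT _ (Tok c)   = Tok c
mapNT h (NT n)    = NT (h n)
mapNT h (App p q) = App (mapNT h p) (mapNT h q)
mapNT h (Alt p q) = Alt (mapNT h p) (mapNT h q)

-- One exported nonterminal xs, no hidden ones:  xs ::= <empty> | r xs
star :: forall nt a. Rule nt a -> OpenRuleSet nt (Equal [a])
star r = OpenRuleSet rules
  where
    rules :: forall b. (Equal [a] :+: None) b
          -> Rule (Equal [a] :+: None :+: nt) b
    rules (Left2 Refl) =
      Alt (Done [])
          (App (App (Done (:)) (mapNT Right2 r))
               (NT (Left2 (Left2 Refl))))
    rules (Right2 n) = case n of {}

-- Naive backtracking interpreter; loops on left-recursive grammars.
run :: (forall x. nt x -> Rule nt x) -> Rule nt a -> String -> [(a, String)]
run _ (Done a) s = [(a, s)]
run _ (Tok c) (d : s) | c == d = [(c, s)]
run _ (Tok _) _ = []
run g (NT n) s = run g (g n) s
run g (App p q) s = [ (f x, s2) | (f, s1) <- run g p s, (x, s2) <- run g q s1 ]
run g (Alt p q) s = run g p s ++ run g q s

-- Close a rule set that imports nothing.
closeG :: (forall x. p x -> Rule (p :+: None) x)
       -> (p :+: None) y -> Rule (p :+: None) y
closeG rules (Left2 n)  = rules n
closeG _     (Right2 n) = case n of {}

-- Run the starred version of a rule that mentions no nonterminals.
runStar :: Rule None a -> String -> [([a], String)]
runStar r = case star r of
  OpenRuleSet rules -> run (closeG rules) (NT (Left2 (Left2 Refl)))
```

For example, runStar (Tok 'x') "xxy" returns every way of consuming a prefix of 'x's, namely [("", "xxy"), ("x", "xy"), ("xx", "y")]. The grammar itself stays finite: the recursion goes through the nonterminal Left2 (Left2 Refl) rather than through an infinitely deep Rule value.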

Obviously, this would be a bit clunky to use in practice, but maybe more
specialised versions of combineG could be given.

Bob

