
Hello, I have two questions regarding a Cocke, Younger, Kasami parser. Consider this program: type NT = Char -- Nonterminal type T = Char -- Terminal -- a Chomsky production has either two nonterminals or one terminal on its right-hand side type ChomskyProd = (NT, Either T (NT, NT)) -- a grammar consists of a startsymbol, nonterminal symbols, terminal symbols and productions type Grammar = (NT, [NT], [T], [ChomskyProd]) parse::Grammar->[T]->Bool parse (s, nts, ts, prods) w = s `elem` gs n 1 where n = length w table = [[gs i j|j<-[1..n-i+1]]|i<-[1..n]] gs 1 j = [nt|p<-prods,termProd p, let (nt, Left t)=p, w!!(j-1)==t] gs i j = [nt|k<-[1..i-1],p<-prods, not (termProd p), let (nt, Right (a, b))=p, a `elem` table!!(k-1)!!(j-1), b `elem` table!!(i-k-1)!!(j+k-1)] The sets gs i j contain all nonterminal symbols from which the substring of w starting at index j of length i can be derived. Please have a look at the last line of the algorithm. In my first attempt I just referred to gs k j and gs (i-k) (j+k) what looks a lot more intuitive. However I noted that this way the sets gs i j are evaluated multiple times. Is there a better and more readable way to prevent multiple evaluation of these sets? The second question regards lazy evaluation. Consider the stupid grammar S->S S S->A A A->a that generates a^(2n). The performance of the algorithm drops very fast even for small n, probably because the gs i j are getting very large. Is there a trick to get lazy evaluation into play here? It is sufficient to find only one occurence of the start symbol in gs n 1. Best regards, Steffen