
On Tue, Jan 24, 2012 at 6:54 AM, Christopher Brown wrote:
Hi Everyone,
Thanks for everyone's kind responses: very helpful so far!
I fully appreciate and understand how difficult writing a C++ parser is. However, I may need one for our new Paraphrase project, where I may be targeting C++ for writing a refactoring tool. Obviously I don't want to start writing one myself, hence I was asking if anyone knew about an already existing implementation.
Rose looks interesting, I'll check that out, thanks!
I did some more digging after sending my email. I didn't learn about GLR parsers when I was in school, but they seem to be what the cool compilers use these days. Then I discovered that Happy supports GLR, which made me happy! Next I found that GLR supposedly makes C++ parsing much easier than LALR. In the words of the Elkhound author:

"The reason I wrote Elkhound is to be able to write a C++ parser. The parser is called Elsa, and is included in the distribution below."

The Elsa documentation should give you a flavor for what needs to be done when making sense of C++:
http://scottmcpeak.com/elkhound/sources/elsa/index.html

NB: I don't think it's been seriously worked on since 2005, so I assume it doesn't match the latest C++ spec.

The grammar that Elsa parses is here. One warning: it doesn't reject all invalid programs (that is, it errs on the side of accepting too much):
http://scottmcpeak.com/elkhound/sources/elsa/cc.gr

I think the path of least resistance is pure Rose without the Haskell support. Having said that, I think the most fun direction would be converting the Elsa grammar to Happy. It's just that you'll have a lot of work to do (read: testing, debugging, performance tuning, and then adding vendor features). One side benefit is that you'll know much more about the intricacies of C++ when you're done than if you use someone else's parser.

Jason