
Hi,
Is there a not-so-trivial parser implementation that uses Parselib? A parser for a C-like language would be good. I searched and found Haskell++:
http://www.cs.chalmers.se/~rjmh/Software/h++.html
However, I'd prefer to look at a parser for a C-like language.
--
Regards,
Kashyap

Hello
For non-trivial parsing, Parsec or UU-Parse are much better candidates. If you have Parsec installed from Hackage, I'd still recommend you get the manual and source distribution from:
http://legacy.cs.uu.nl/daan/parsec.html
The source distribution has some examples - Tiger, Mondrian, Henk - full, if small, languages.
C is quite a large language and its grammar is usually presented for LR parsing, so you are unlikely to find a parser for C, or even a subset of C, written with a combinator library, as parser combinators are LL. Converting an LR grammar to LL needs a lot of left factoring and wouldn't be fun, though I believe there is a C parser for the ANTLR system, which is LL(k).
Best wishes
Stephen
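To illustrate the left factoring mentioned above, here is a minimal sketch in Parsec. The two-statement grammar, the Stmt type and the parser names (name, number, stmtBacktrack, stmtFactored) are hypothetical and chosen only for illustration; they are not taken from any real C grammar. Both alternatives begin with a name, so a parser with one character of lookahead cannot choose between them unless it either backtracks or has the common prefix factored out.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical AST for two statement forms that share a prefix.
data Stmt
  = Assign String Integer   -- e.g. x=42
  | Call String             -- e.g. f()
  deriving (Eq, Show)

name :: Parser String
name = many1 letter

number :: Parser Integer
number = read <$> many1 digit

-- Direct transcription: both alternatives start with a name, so the
-- first one is wrapped in `try` to backtrack when no '=' follows.
stmtBacktrack :: Parser Stmt
stmtBacktrack = try assign <|> call
  where
    assign = Assign <$> name <* char '=' <*> number
    call   = Call <$> name <* string "()"

-- Left-factored: parse the shared name once, then decide on the
-- next character alone, with no backtracking.
stmtFactored :: Parser Stmt
stmtFactored = do
  n <- name
  (Assign n <$> (char '=' *> number)) <|> (Call n <$ string "()")
```

With these definitions, parse stmtFactored "" "f()" should give Right (Call "f"), and stmtBacktrack should accept the same inputs. A real grammar needs this kind of rewrite at many rules, which is where the effort goes.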

Thanks Stephen,
In Haskell, what would be the right tool for parsing C-like languages?
The Parsec literature seems to indicate that it can parse pretty much
anything.
The reason I asked for a sample in Parselib was so that I could see a
monadic parser in action. The last time I tried looking at Parsec in
Real World Haskell (RWH), I could not follow it too well.
Regards,
Kashyap
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi Kashyap
There's a C parser for Happy (LR) - a long while ago I converted this to Frown (also LR). Both Happy and Frown are parser generators that take a grammar description and generate a Haskell module that implements the parser. Personally I prefer Frown; I find the input syntax a bit nicer than Happy's, but unfortunately Frown has disappeared since its author Ralf Hinze moved to Oxford.
As I said, the C grammar is generally presented in LR form (e.g. in the K&R book, in Harbison & Steele and in the ISO spec). Other C-like languages, e.g. the GLSL shading language, tend to be in LR form as well - I have an untested Happy parser for GLSL in my source repository.
Using an LR grammar directly with Parsec or ParseLib would be fatal - the left-recursive rules typical of LR grammars would send your program into an infinite loop on all but the most trivial input. You would have to rewrite the grammar into LL form. Parsec (and ParseLib) have some helpers for this, and quite often you can write a Parsec grammar quite close to the LR one by using the many and many1 combinators for repetition, plus sometimes the chain combinators.
Unfortunately it's still quite a bit of effort to convert to LL - it took me about a day's work to write a Parsec parser for GHC-core from the LR Happy grammar, and GHC-core is much simpler than GLSL or C. Also, the Parsec parser was much slower.
Parsec is very powerful - it can handle large lookahead, whereas other LL parsers tend to be LL(1), i.e. limited to one token of lookahead. But while Parsec has the power, lookahead is costly, and large lookahead will make the parser slow and use a lot of memory.
Parsec can also handle context-sensitive parsing - this is often very useful when you are writing a parser for an ad-hoc file format rather than for a grammar. You can often do some tricks with context-sensitive parsing that would take a lot of extra work to do with context-free parsing.
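As a small sketch of what context-sensitive parsing can look like in Parsec, consider a length-prefixed field, where a number read earlier decides how much input the next step consumes. The format (a decimal length, a colon, then exactly that many characters) and the names field and fields are hypothetical, made up for this illustration only.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical ad-hoc format: <length>:<exactly that many characters>
-- The value parsed first (n) controls how the rest is parsed, which
-- a context-free grammar cannot express directly.
field :: Parser String
field = do
  n <- read <$> many1 digit   -- the length prefix, read as an Int
  _ <- char ':'
  count n anyChar             -- consume exactly n characters

-- Zero or more fields, then end of input.
fields :: Parser [String]
fields = many field <* eof
```

Here parse fields "" "3:abc5:hello" should give Right ["abc","hello"], while an input such as "3:ab" is rejected because the promised three characters are not there.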
Finally, it's commonly used as a scanner-less parser, where you use all of its features to parse at the character level rather than delegating to a separate scanner.
I've thought about writing an article for The Monad.Reader on moving from Graham Hutton's parsers to Parsec; if there's any interest I'll look into doing it. For the time being, the main difference is probably that Parsec parsers generally use the TokenParser module for some of the combinators that ParseLib provides directly. Using the TokenParser module requires a couple of tricks: import it qualified and re-define unqualified versions of the parsers you need (int, symbol, identifier, etc.).
Best wishes
Stephen
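The rewrite from a left-recursive LR rule to an LL-friendly one, described earlier in this message, can be sketched with Parsec's chainl1 combinator. The tiny expression grammar and the names term, addOp and expr are assumptions made for illustration; the parser works at the character level with no separate scanner and does not skip white space.

```haskell
import Text.ParserCombinators.Parsec

-- LR-style rule, left-recursive; transcribed directly it would call
-- itself before consuming any input and never terminate:
--   expr ::= expr '+' term | expr '-' term | term
--
-- LL-friendly reading: one term followed by zero or more
-- (operator, term) pairs, combined left-associatively by chainl1.

term :: Parser Integer
term = read <$> many1 digit

addOp :: Parser (Integer -> Integer -> Integer)
addOp = ((+) <$ char '+') <|> ((-) <$ char '-')

expr :: Parser Integer
expr = term `chainl1` addOp
```

Because chainl1 associates to the left, parse expr "" "10-3-2" should give Right 5, the same result the left-recursive rule describes.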

Stephen Tetley wrote:
I've thought about writing an article for The Monad.Reader - moving from Graham Hutton's parsers to Parsec, if there's any interest I'll look into doing it. For the time being the main difference is probably that Parsec parsers generally use the TokenParser module for some of the combinators that ParseLib provides directly. Using the TokenParser module requires a couple of tricks - import it qualified and re-define unqualified versions of the parsers you need - int, symbol, identifier - etc.
I'd like a comparison of how to do similar things in Parsec, uu-parsinglib, polyparse and whatever other parser combinator libraries you want to throw in. I imagine a sort of cookbook, so that once you have become comfortable with one library, you can consult the cookbook to figure out the similarities to and differences from another.
Also, here's something to add to the thread: I wrote a wrapper module for uu-parsinglib for the functional programming course at Utrecht. The goal was threefold:
1. Support the nice functionality of uu-parsinglib (e.g. error handling, efficiency) while simplifying the interface for beginner programmers,
2. Provide an interface very similar to ParseLib, which was covered in the course, and
3. Add documentation, which was sorely missing from uu-parsinglib.
The file (released into the public domain) is attached.
Regards,
Sean

On Jun 2, 2010, at 6:45 AM, Stephen Tetley wrote:
There's a C parser for Happy (LR) - a long while ago I converted this to Frown (also LR) - both Happy and Frown are parser generators that take a grammar description and generate a Haskell module that implements the parser. Personally I prefer Frown, I find the input syntax a bit nicer than Happy but unfortunately Frown has disappeared since its author Ralf Hinze moved to Oxford.
It hasn't _entirely_ disappeared:
https://launchpad.net/ubuntu/hardy/+source/frown/0.6.1-6

Thanks Stephen,
I'll look at the Parsec documentation ... I'd certainly be interested in the
article. But I thought Parsec was conceptually the same as Parselib ... is that
not the case?
Regards,
Kashyap

Hi Kashyap
They are very close - Parsec has most of the parsers in ParseLib in either the Parsec.Char or Parsec.Combinator modules. If you import the Parsec top level, you will get them, e.g.:
import Text.ParserCombinators.Parsec
ParseLib has - ident, nat, int - which have analogues in Parsec but are in the Parsec.Token module and need the qualified import trick:
import Text.ParserCombinators.Parsec.Language
import qualified Text.ParserCombinators.Parsec.Token as P

myLex :: P.TokenParser st
myLex = P.makeTokenParser emptyDef

integer :: Parser Integer
integer = P.integer myLex

ident :: Parser String
ident = P.identifier myLex
Handling trailing white-space is probably another difference that I haven't looked at yet (the variations - identifier, integer, natural - in ParseLib). Type signatures are slightly different, of course, as Parsec has more powerful "internal machinery".
Best wishes
Stephen
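On the Parsec side of the white-space question, the parsers in Parsec.Token are lexeme parsers: each one skips any white space that follows the token it reads, so only leading white space needs to be skipped explicitly. The sketch below repeats the definitions from this message so that it is self-contained; the assignment parser and its "name = number" input format are hypothetical additions for illustration.

```haskell
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Language (emptyDef)
import qualified Text.ParserCombinators.Parsec.Token as P

myLex :: P.TokenParser st
myLex = P.makeTokenParser emptyDef

integer :: Parser Integer
integer = P.integer myLex

ident :: Parser String
ident = P.identifier myLex

-- Hypothetical example: parse "name = number" with arbitrary spacing.
-- Leading white space is skipped once; after that each lexeme parser
-- consumes the white space trailing its own token.
assignment :: Parser (String, Integer)
assignment = do
  P.whiteSpace myLex
  v <- ident
  _ <- P.symbol myLex "="
  n <- integer
  eof
  return (v, n)
```

With this, parse assignment "" "  x  =  42  " should give Right ("x",42) even though the input has spaces before, between and after the tokens.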

If you want to use the easier, long-standing libraries from Utrecht, we can provide you with a parser for full Haskell, which you can find in the Utrecht Haskell Compiler (UHC) distribution.
In 2002 Alexey Rodriguez produced a C front end using the UUlibs combinators. I am attaching the file with the parser so you can take a look. If you want access to the full compiler, which has not been maintained, I can make the full code available on a website.
I think it is also instructive to start by looking at simpler parsers, e.g. for the BibTeX format, which is available from:
https://subversion.cs.uu.nl/repos/project.STEC.uulib/uulib/trunk/examples/
Doaitse

Hey Doaitse,
Could you please make the full code available?
participants (5)
- C K Kashyap
- Richard O'Keefe
- S. Doaitse Swierstra
- Sean Leather
- Stephen Tetley