
Hi Caballers,
now that .cabal files are getting a more complex syntax (due to configurations), I'd really like to use Parsec instead of lots of hand-written and hard-to-maintain parsing code. By getting rid of ReadP-based field parsing, we'd also get proper error messages for field values.
The Parsec homepage says that it's now included by default in GHC, Hugs and NHC, so it'll most likely be present on any sufficiently recent system.
What would be arguments against this proposal?
/ Thomas
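To illustrate the error-message point, here is a minimal sketch using only ReadP from base. The field parser `versionP` and the helper `runField` are hypothetical names invented for this example and are not taken from Cabal's code. When the input is malformed, ReadP simply yields no results, so the caller has neither a position nor a message to report.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical field parser: a dotted version number such as "0.1".
versionP :: ReadP [Int]
versionP = sepBy1 (read <$> munch1 isDigit) (char '.')

-- Run a parser over the whole input, keeping only complete parses.
runField :: ReadP a -> String -> Maybe a
runField p s = case [x | (x, "") <- readP_to_S (p <* eof) s] of
  (x:_) -> Just x
  []    -> Nothing   -- no position and no message, just "no parse"
```

With this sketch, a well-formed value gives `Just` a result, while a malformed one such as "0.x" gives `Nothing` with no hint about where parsing stopped.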

Thomas Schilling wrote:
Hi Caballers,
now that .cabal files are getting a more complex syntax (due to configurations), I'd really like to use Parsec instead of lots of hand-written and hard-to-maintain parsing code. By getting rid of ReadP-based field parsing, we'd also get proper error messages for field values.
The Parsec homepage says that it's now included by default in GHC, Hugs and NHC, so it'll most likely be present on any sufficiently recent system.
What would be arguments against this proposal?
I've wondered about this myself. ReadP is quite annoying; I believe the only reason it was used originally was that it was in the base package, and hence didn't add any new dependencies. Unfortunately the portable version of ReadP is quite a pain to use.
Adding parsec as a dependency is a big step - it would bring parsec into GHC's core packages, and force its distribution with other compilers too. I'm not completely against this, but we should explore alternatives first.
Maybe a better solution would be to bring in a simple monadic parser combinator library. Malcolm Wallace has offered to relicense his Text.ParserCombinators.Poly from LGPL to BSD3 for use in Cabal; this might be a good option for us.
Any other ideas?
Cheers,
Simon
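As a rough idea of how small a "simple monadic parser combinator library" can be, here is a sketch that depends only on base. It is not Text.ParserCombinators.Poly and not Parsec; every name in it (`Parser`, `satisfy`, `many0`, `many1`, `field`, `parseField`) is made up for illustration. It threads a character offset through the parse so that a failure can say where it happened.

```haskell
import Data.Char (isAlphaNum)

-- A parser takes the current offset and the remaining input; it either
-- fails with a message or returns a value, the new offset and the rest.
newtype Parser a = Parser
  { runParser :: Int -> String -> Either String (a, Int, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \n s -> case p n s of
    Left e            -> Left e
    Right (a, n', s') -> Right (f a, n', s')

instance Applicative Parser where
  pure a = Parser $ \n s -> Right (a, n, s)
  pf <*> pa = do { f <- pf; a <- pa; pure (f a) }

instance Monad Parser where
  Parser p >>= k = Parser $ \n s -> case p n s of
    Left e            -> Left e
    Right (a, n', s') -> runParser (k a) n' s'

-- Consume one character satisfying the predicate, or report what was expected.
satisfy :: String -> (Char -> Bool) -> Parser Char
satisfy what ok = Parser $ \n s -> case s of
  (c:cs) | ok c -> Right (c, n + 1, cs)
  _             -> Left ("offset " ++ show n ++ ": expected " ++ what)

-- Zero or more repetitions; on failure, resume from before the failed attempt.
many0 :: Parser a -> Parser [a]
many0 p = Parser $ \n s -> case runParser p n s of
  Left _            -> Right ([], n, s)
  Right (a, n', s') -> runParser ((a:) <$> many0 p) n' s'

-- One or more repetitions.
many1 :: Parser a -> Parser [a]
many1 p = do { x <- p; xs <- many0 p; pure (x:xs) }

-- Hypothetical field parser: "name: value" on a single line.
field :: Parser (String, String)
field = do
  name <- many1 (satisfy "field name" (\c -> isAlphaNum c || c == '-'))
  _    <- satisfy "':'" (== ':')
  _    <- many0 (satisfy "space" (== ' '))
  val  <- many0 (satisfy "value" (/= '\n'))
  pure (name, val)

parseField :: String -> Either String (String, String)
parseField s = case runParser field 0 s of
  Left e          -> Left e
  Right (r, _, _) -> Right r
```

In this sketch a missing colon is reported with the offset at which the colon was expected, which is the kind of message ReadP cannot produce.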

Hi
Adding parsec as a dependency is a big step - it would bring parsec into GHC's core packages, and force its distribution with other compilers too. I'm not completely against this, but we should explore alternatives first.
If you choose to use something that isn't the best/most well known, eventually this debate will come around again with Poly vs Parsec.
Couldn't you distribute Cabal by taking all required files from all packages (apart from base) and putting that in a tarball? I did this for Catch, and the distribution problems went from a severe headache to nothing.
Thanks
Neil

Adding parsec as a dependency is a big step - it would bring parsec into GHC's core packages, and force its distribution with other compilers too.
If this happens, may I propose having a look at my small parsec patch allowing arbitrary token types, not only chars? Would this cause a big speed penalty?
http://mawercer.de/marcweber/haskell/darcs/fparsec/
I've used this to create an experimental command line parser which works fine.
Marc Weber

On Mon, Jun 11, 2007 at 01:16:58PM +0200, Marc Weber wrote:
If this happens, may I propose having a look at my small parsec patch allowing arbitrary token types, not only chars?
I'm a bit confused as to what your patch does; parsec already supports arbitrary token types, doesn't it? e.g.
    parse :: GenParser tok () a -> SourceName -> [tok] -> Either ParseError a
works with any token type tok.
Thanks
Ian
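A small sketch of stock Parsec running over a non-Char token type, assuming the parsec package is installed. The `Arg` token type and all parser names are hypothetical and exist only for this example; this is not the fparsec patch. The second argument to `tokenPrim` computes the next source position, and here it just bumps the column by one per token, so the reported "column" is effectively an argument index.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical token type for command-line arguments.
data Arg = Flag String | Value String
  deriving (Eq, Show)

-- Classify raw arguments into tokens.
lexArg :: String -> Arg
lexArg ('-':'-':name) = Flag name
lexArg other          = Value other

-- Accept one token if the selector returns Just.
-- The position function advances the column by one per argument.
argToken :: (Arg -> Maybe a) -> GenParser Arg () a
argToken select = tokenPrim show (\pos _ _ -> incSourceColumn pos 1) select

flagP :: GenParser Arg () String
flagP = argToken (\t -> case t of { Flag n -> Just n; _ -> Nothing })

valueP :: GenParser Arg () String
valueP = argToken (\t -> case t of { Value v -> Just v; _ -> Nothing })

-- A flag followed by its value, e.g. ["--prefix", "/usr"].
optionP :: GenParser Arg () (String, String)
optionP = do
  n <- flagP
  v <- valueP
  return (n, v)

parseArgs :: [String] -> Either ParseError [(String, String)]
parseArgs = parse (many optionP <* eof) "<args>" . map lexArg
```

A flag with no following value makes `parseArgs` return a `Left` carrying a `ParseError`, whose position reflects the per-token column counting above.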

On Tue, Jun 12, 2007 at 02:46:07AM +0200, Marc Weber wrote:
works with any token type tok.
You can customize the error position depending on token type.
E.g. when parsing command line arguments, tracking a line number doesn't make sense.
Ah, OK. Parsec is maintained by Daan Leijen

Hi Thomas,
On Sun, Jun 10, 2007 at 03:44:14PM +0200, Thomas Schilling wrote:
now that .cabal files are getting a more complex syntax (due to configurations), I'd really like to use Parsec instead of lots of hand-written and hard-to-maintain parsing code. By getting rid of ReadP-based field parsing, we'd also get proper error messages for field values.
The Parsec homepage says that it's now included by default in GHC, Hugs and NHC,
As Simon said, unfortunately as of 6.8 it won't be included with GHC (as things stand, at least).
alex+happy is another alternative, although I don't know off-hand if hugs/nhc/... can build them currently.
Have you got a description of the grammar you want to parse?
I'd like it if we could say
    if flag(foo) {
        other-modules: Foo
                       Bar
                       Baz
        ghc-options: -wibble
    }
which needs some sort of layout rule for the module list. If we go for something like that then I wouldn't be surprised if this was easiest to parse with just ad-hoc code, which would avoid any dependencies and give good error messages.
The parser would only parse it down to something like
    Cond (Flag "foo") [Pair "other-modules:" "Foo\n Bar\n Baz\n", Pair "ghc-options:" "-wibble\n"]
and these would then be further parsed by their own little parsers.
Just looked at http://hackage.haskell.org/trac/hackage/wiki/CabalConfigurations
I'd also suggest putting the top block of your example on that page in
    General { ... }
(or some better name) and the next two in
    Flag: fps_in_base { ... }
    Flag: debug { ... }
Thanks
Ian
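The two-pass idea in this message could be sketched with types like the following. The constructor names `Cond`, `Flag` and `Pair` come from the message; the type names `Condition` and `Item` and the helpers `rawValues` and `moduleList` are assumptions made up for this example. The first pass keeps field values as raw strings, and the second pass hands each value to a field-specific parser, which for a module list is assumed here to be plain whitespace splitting.

```haskell
-- Hypothetical types for the coarse, first-pass structure.
data Condition = Flag String
  deriving (Eq, Show)

data Item
  = Pair String String      -- field name (with colon) and raw, unparsed value
  | Cond Condition [Item]   -- a conditional block and its contents
  deriving (Eq, Show)

-- The value from the message, written out as a Haskell expression.
example :: Item
example =
  Cond (Flag "foo")
    [ Pair "other-modules:" "Foo\n Bar\n Baz\n"
    , Pair "ghc-options:"   "-wibble\n"
    ]

-- Collect the raw values of a named field, ignoring conditions.
rawValues :: String -> Item -> [String]
rawValues name (Pair k v)
  | k == name = [v]
  | otherwise = []
rawValues name (Cond _ items) = concatMap (rawValues name) items

-- A "little parser" for module lists: split on whitespace.
moduleList :: String -> [String]
moduleList = words
```

Because the first pass never interprets field values, a layout rule for multi-line values only has to decide where a value ends, not what it means.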

On 6/11/07, Ian Lynagh wrote:
Hi Thomas,
On Sun, Jun 10, 2007 at 03:44:14PM +0200, Thomas Schilling wrote:
now that .cabal files are getting a more complex syntax (due to configurations), I'd really like to use Parsec instead of lots of hand-written and hard-to-maintain parsing code. By getting rid of ReadP-based field parsing, we'd also get proper error messages for field values.
The Parsec homepage says that it's now included by default in GHC, Hugs and NHC,
As Simon said, unfortunately as of 6.8 it won't be included with GHC (as things stand, at least).
alex+happy is another alternative, although I don't know off-hand if hugs/nhc/... can build them currently.
Have you got a description of the grammar you want to parse?
I'd like it if we could say
    if flag(foo) {
        other-modules: Foo
                       Bar
                       Baz
        ghc-options: -wibble
    }
which needs some sort of layout rule for the module list. If we go for something like that then I wouldn't be surprised if this was easiest to parse with just ad-hoc code, which would avoid any dependencies and give good error messages.
The parser would only parse it down to something like
    Cond (Flag "foo") [Pair "other-modules:" "Foo\n Bar\n Baz\n", Pair "ghc-options:" "-wibble\n"]
and these would then be further parsed by their own little parsers.
Just looked at http://hackage.haskell.org/trac/hackage/wiki/CabalConfigurations
I'd also suggest putting the top block of your example on that page in
    General { ... }
(or some better name) and the next two in
    Flag: fps_in_base { ... }
    Flag: debug { ... }
The current syntax looks like this:
    Name: Foo
    Version: 0.1
    ...
    Flag name {
      Description: ...
    }
    ...
    Library {
      Build-Depends: ...
      if os(freebsd) && flag(foo) {
        Build-Depends: ...
      }
      some-flag: ...
    }
    Executable name {
      ...
    }
    ...
I first parse it into the general syntactic constructs (field-description, section, or if-statement) and then transform it into something like an AST, which then gets parsed into the existing back-end structure, depending on the flags and available dependencies. The code, however, is rather ugly and ad hoc. But since we won't need the full Parsec anyway, maybe I can extract some combinators and will be fine.
/Thomas
--
"Remember! Everytime you say 'Web 2.0' God kills a startup!" - userfriendly.org, Jul 31, 2006
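The pipeline described in this message (generic constructs first, then resolution against flags) might look roughly like the sketch below. All type and function names are hypothetical, the package names "pkg-a" and "pkg-b" are placeholders for the values elided in the message, and this is not the actual Cabal code. Only `os(...)`, `flag(...)` and `&&` are modelled, since those are the condition forms shown in the example syntax.

```haskell
-- Conditions as they appear in "if" statements.
data Cond
  = OS String
  | FlagSet String
  | And Cond Cond
  deriving (Eq, Show)

-- The three generic syntactic constructs.
data Syntax
  = Field String String            -- field name and raw value
  | Section String String [Syntax] -- kind, name ("" if none), body
  | If Cond [Syntax]               -- condition and guarded body
  deriving (Eq, Show)

-- Hypothetical environment: the current OS and the chosen flag values.
data Env = Env { envOS :: String, envFlags :: [(String, Bool)] }

eval :: Env -> Cond -> Bool
eval env (OS name)   = envOS env == name
eval env (FlagSet f) = lookup f (envFlags env) == Just True
eval env (And a b)   = eval env a && eval env b

-- Flatten a section body to the fields that apply in a given environment.
resolve :: Env -> [Syntax] -> [(String, String)]
resolve env = concatMap go
  where
    go (Field k v)     = [(k, v)]
    go (If c body)
      | eval env c     = resolve env body
      | otherwise      = []
    go (Section _ _ _) = []   -- nested sections are ignored in this sketch

-- Body of a Library section, with placeholder dependency names.
library :: [Syntax]
library =
  [ Field "Build-Depends" "pkg-a"
  , If (OS "freebsd" `And` FlagSet "foo") [Field "Build-Depends" "pkg-b"]
  ]
```

Resolving `library` with the OS set to "freebsd" and flag "foo" enabled keeps both fields, while any other environment keeps only the unconditional one.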
participants (5)
- Ian Lynagh
- Marc Weber
- Neil Mitchell
- Simon Marlow
- Thomas Schilling