Haskell Platform decision: time to bless parsec 3?

Hey all, This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies. One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis. I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3. Does anyone have concerns with this? -- Don

I would like to see some changes before it becomes a blessed package. I'd love to hear your thoughts on the following ideas:
* Get rid of the user state type parameter u. If you want state, set m = StateT s.
* Text.Parsec.Prim currently exports its own version of <|> specialized to the ParsecT type constructor. Is there a good reason for this? It clashes when I also import Control.Applicative in my own modules.
* Most of the combinators in Text.Parsec.Combinator have types specialized to ParsecT (with a Stream class constraint as consequence) while they could be defined in terms of Applicative only. I think these should be rewritten in terms of Applicative (or Monad if absolutely necessary) whenever possible.
Groetjes,
Martijn.
On 11/6/10 16:18, Don Stewart wrote:
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?

On 6 November 2010 16:24, Martijn van Steenbergen
* Most of the combinators in Text.Parsec.Combinator have types specialized to ParsecT (with a Stream class constraint as consequence) while they could be defined in terms of Applicative only. I think these should be rewritten in terms of Applicative (or Monad if absolutely necessary) whenever possible.
Couldn't an independent package achieve this goal? If the combinators were defined in Parsec they would be under the Parsec namespace rather than Control.*, and Parsec having type-specialized versions might produce better error messages. Ross Paterson already has an alternative implementation of the Perm combinators on Hackage that is defined solely with Applicative (action-permutations).

Apologies for my last message, which was rather garbled due to squinting at tiny text. For information, here's a list of most of the combinators with Applicative / Alternative / Monad signatures; I've defined libraries of them myself a couple of times.
    choice :: Alternative f => [f a] -> f a
    count :: Applicative f => Int -> f a -> f [a]
    between :: Applicative f => f open -> f close -> f a -> f a
    option :: Alternative f => a -> f a -> f a
    optionMaybe :: Alternative f => f a -> f (Maybe a)
    optional :: Alternative f => f a -> f ()
    skipMany :: Alternative f => f a -> f ()
    skipMany1 :: Alternative f => f a -> f ()
    -- | 'many1' an alias for Control.Applicative 'some'.
    many1 :: Alternative f => f a -> f [a]
    sepBy :: Alternative f => f a -> f b -> f [a]
    sepBy1 :: Alternative f => f a -> f b -> f [a]
    sepEndBy :: Alternative f => f a -> f b -> f [a]
    sepEndBy1 :: Alternative f => f a -> f b -> f [a]
    manyTill :: Alternative f => f a -> f b -> f [a]
    manyTill1 :: Alternative f => f a -> f b -> f [a]
    chainl1 :: MonadPlus m => m a -> m (a -> a -> a) -> m a
    chainr1 :: MonadPlus m => m a -> m (a -> a -> a) -> m a
    chainl :: MonadPlus m => m a -> m (a -> a -> a) -> a -> m a
    chainr :: MonadPlus m => m a -> m (a -> a -> a) -> a -> m a
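To illustrate how a few of the signatures listed above could be implemented without reference to ParsecT, here is a minimal sketch. The definitions are illustrative guesses, not the code from Parsec or from any existing package. To stay within base, the example instantiates them at ReadP (from Text.ParserCombinators.ReadP) instead of Parsec; the helper names number, numberList and parseList are made up for this example.

```haskell
import Control.Applicative (Alternative (..))
import Text.ParserCombinators.ReadP (ReadP, char, eof, readP_to_S, satisfy)
import Data.Char (isDigit)

-- Generic versions of a few combinators from the list above.
-- They only need Applicative or Alternative, not a concrete parser type.
choice :: Alternative f => [f a] -> f a
choice = foldr (<|>) empty

between :: Applicative f => f open -> f close -> f a -> f a
between open close p = open *> p <* close

option :: Alternative f => a -> f a -> f a
option x p = p <|> pure x

sepBy1 :: Alternative f => f a -> f b -> f [a]
sepBy1 p sep = (:) <$> p <*> many (sep *> p)

sepBy :: Alternative f => f a -> f b -> f [a]
sepBy p sep = sepBy1 p sep <|> pure []

-- Example instantiation at ReadP (an Alternative instance shipped with base).
number :: ReadP Int
number = read <$> some (satisfy isDigit)

-- A bracketed, comma-separated list of numbers covering the whole input.
numberList :: ReadP [Int]
numberList = between (char '[') (char ']') (number `sepBy` char ',') <* eof

-- ReadP returns all parses; keep those that consumed the entire input.
parseList :: String -> [[Int]]
parseList s = [xs | (xs, "") <- readP_to_S numberList s]
```

In GHCi, parseList "[1,2,3]" should yield a single parse containing the three numbers, and parseList "[1,]" should yield no parse. Whether such generic definitions perform as well as Parsec's specialized ones is a separate question that this sketch does not address.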

On 06/11/2010 17:47, Stephen Tetley wrote:
Couldn't an independent package achieve this goal?
If the combinators were defined in Parsec they would be under the Parsec namespace rather than Control.* and Parsec having type specialized versions might produce better error messages.
Not just error messages, either - it's room for further optimisation possibilities any time a combinator might expand to an infinite grammar, for example. And Parsec already tells you at runtime when it has certain kinds of bad grammar that it can't spot statically. -- flippa@flippac.org

On Sat, Nov 6, 2010 at 5:24 PM, Martijn van Steenbergen
I would like to see some changes before it becomes a blessed package. I'd love to hear your thoughts on the following ideas:
* Get rid of the user state type parameter u. If you want state, set m = StateT s.
I'm not really for this but not strongly against. The only concrete argument I have is that the end-user would then have to think about the correct way to layer the monads so that the state backtracks at the same time as the parser. There also might be performance implications for those wishing to use the state (but I'm guessing that it would get better for those not using it, which is likely the common case).
* Text.Parsec.Prim currently exports its own version of <|> specialized to the ParsecT type constructor. Is there a good reason for this? It clashes when I also import Control.Applicative in my own modules.
I doubt there is a good reason.
* Most of the combinators in Text.Parsec.Combinator have types specialized to ParsecT (with a Stream class constraint as consequence) while they could be defined in terms of Applicative only. I think these should be rewritten in terms of Applicative (or Monad if absolutely necessary) whenever possible.
Assuming we move the Parsec 'many' implementation into the Alternative class definition, this should have no/little impact on performance for all the combinators mentioned by Stephen. Maybe they could be specialized to internal Parsec structures in the future, but at the moment they really aren't. My previous Parsec benchmarks have been somewhat ad-hoc, so I would like a better benchmark suite before doing this.
Groetjes,
Martijn.
On 11/6/10 16:18, Don Stewart wrote:
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
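The layering concern raised in this reply (whether the user state backtracks together with the parser) can be made concrete with a toy model. This is a minimal sketch and not Parsec code: the type synonyms Inside and Outside and every function name below are hypothetical. Inside threads the state through the successful result, so a failing branch loses its updates; Outside threads the state around the failure, so updates made by a failing branch survive.

```haskell
-- State carried inside the success case: failure discards state changes.
type Inside s a = s -> String -> Maybe (a, s, String)

-- State carried outside the failure: changes survive a failed branch.
type Outside s a = s -> String -> (Maybe (a, String), s)

-- Choice for the first layering: the alternative sees the ORIGINAL state.
orElseInside :: Inside s a -> Inside s a -> Inside s a
orElseInside p q s inp = case p s inp of
  Nothing -> q s inp
  ok      -> ok

-- Choice for the second layering: the alternative sees whatever state
-- the failed branch left behind.
orElseOutside :: Outside s a -> Outside s a -> Outside s a
orElseOutside p q s inp = case p s inp of
  (Nothing, s') -> q s' inp
  ok            -> ok

-- A parser that increments a counter and then fails.
bumpThenFailInside :: Inside Int a
bumpThenFailInside _ _ = Nothing -- the incremented counter is lost

bumpThenFailOutside :: Outside Int a
bumpThenFailOutside s _ = (Nothing, s + 1)

-- Parsers that succeed without touching state or input.
succeedInside :: a -> Inside s a
succeedInside x s inp = Just (x, s, inp)

succeedOutside :: a -> Outside s a
succeedOutside x s inp = (Just (x, inp), s)

-- Final counter after "bump then fail, or else succeed", starting from 0.
finalStateInside :: Maybe Int
finalStateInside =
  fmap (\(_, s, _) -> s)
       ((bumpThenFailInside `orElseInside` succeedInside ()) 0 "input")

finalStateOutside :: Int
finalStateOutside =
  snd ((bumpThenFailOutside `orElseOutside` succeedOutside ()) 0 "input")
```

Here finalStateInside evaluates to Just 0 while finalStateOutside evaluates to 1. Roughly speaking, wrapping a state transformer around the parser corresponds to the first layering, and running the parser over a state monad corresponds to the second, which is why the choice matters to end users.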

Hello,
For some alternative design choices you might also be interested in
Parsimony (http://hackage.haskell.org/package/parsimony).
It is similar to a trimmed-down Parsec 2, but it supports token
streams other than lists (e.g., it can parse directly from
byte-strings).
Parsers support user state but, unlike Parsec, they have only a
single type parameter, which looks much nicer.
The basic idea is simple: instead of parameterizing on the type of
tokens (like Parsec), Parsimony parameterizes on the type of token
streams instead.
The type of tokens is computed from the type of token streams
(details are in Parsimony.Stream).
Because parsers already pass around the token stream as a piece of
state, it is easy to add user state by just defining a new type of
token stream which has the user state as an extra component (for
details see Parsimony.UserState).
Hope that this helps,
-Iavor
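The design idea described in this message (one type parameter for the stream, with the token type computed from it and user state added as an extra stream component) might look roughly like the following. This is only a sketch using an associated type family; it is not Parsimony's actual API, and the names Stream, Token, nextToken, WithState, Parser, anyToken, modifyState and countedToken are all chosen here for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Hypothetical stream class: the token type is a function of the stream type.
class Stream s where
  type Token s
  nextToken :: s -> Maybe (Token s, s)

-- Lists are streams of their elements.
instance Stream [a] where
  type Token [a] = a
  nextToken []       = Nothing
  nextToken (x : xs) = Just (x, xs)

-- A stream paired with user state is again a stream with the same tokens.
data WithState u s = WithState { userState :: u, rest :: s }

instance Stream s => Stream (WithState u s) where
  type Token (WithState u s) = Token s
  nextToken (WithState u s) =
    fmap (\(t, s') -> (t, WithState u s')) (nextToken s)

-- A parser with a single type parameter for the stream.
newtype Parser s a = Parser { runParser :: s -> Maybe (a, s) }

-- Consume one token from any stream.
anyToken :: Stream s => Parser s (Token s)
anyToken = Parser nextToken

-- State operations are only available on state-carrying streams.
modifyState :: (u -> u) -> Parser (WithState u s) ()
modifyState f = Parser $ \st -> Just ((), st { userState = f (userState st) })

-- Consume one token and count it in the user state.
countedToken :: Stream s => Parser (WithState Int s) (Token s)
countedToken = Parser $ \st -> case runParser anyToken st of
  Nothing       -> Nothing
  Just (t, st') ->
    fmap (\(_, st'') -> (t, st'')) (runParser (modifyState (+ 1)) st')
```

Because the user state travels inside the stream value, a parser that fails simply returns no new stream, so state changes made by a failing branch are discarded in this sketch.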

On Sat, Nov 6, 2010 at 3:11 PM, Antoine Latter
Assuming we move the Parsec 'many' implementation into the Alternative class definition, this should have no/little impact on performance for all the combinators mentioned by Stephen.
I can't speak for Parsec, but I certainly have a local definition of "many" in attoparsec, because the Alternative version isn't inlined. Makes a biggish difference to performance, though I do not recall the magnitude (2x?).

On Sun, Nov 7, 2010 at 4:53 AM, Bryan O'Sullivan
On Sat, Nov 6, 2010 at 3:11 PM, Antoine Latter wrote:
Assuming we move the Parsec 'many' implementation into the Alternative class definition, this should have no/little impact on performance for all the combinators mentioned by Stephen.
I can't speak for Parsec, but I certainly have a local definition of "many" in attoparsec, because the Alternative version isn't inlined. Makes a biggish difference to performance, though I do not recall the magnitude (2x?).
Ah, drat. However it looks like Parsec.Prim.many is not inlined. I thought at the instance declaration site I could include an INLINE pragma. Does that not work as one would hope? Antoine

On Sun, Nov 7, 2010 at 2:35 AM, Antoine Latter
Ah, drat. However it looks like Parsec.Prim.many is not inlined.
I wouldn't even worry about this. I think that apart from the Church encoding transformation, Parsec3 has seen almost no performance work.
I thought at the instance declaration site I could include an INLINE pragma. Does that not work as one would hope?
It should. GHC 6.12 and earlier do squirrelly things with INLINE declarations on instances, though. 7 is better.

| Ah, drat. However it looks like Parsec.Prim.many is not inlined.
|
| I thought at the instance declaration site I could include an INLINE
| pragma. Does that not work as one would hope?
You can certainly write {-# INLINE #-} pragmas on the methods of an instance declaration. If you think that doing so does not work in GHC 7, please file a bug report. Thanks!
Simon

The Alternative class has 'many' as a method since at least base-4.2.0.1. I think Doaitse Swierstra requested the change as uu-parsing had an optimized version and could not otherwise use Applicative (even though it inspired the notation).
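As a rough illustration of what overriding 'many' as a class method looks like, together with INLINE pragmas on instance methods as discussed earlier in the thread, here is a minimal sketch. The Parser type below is a toy written for this example and is unrelated to the internals of Parsec, attoparsec or uu-parsing; no claim is made about how much the pragmas help in practice.

```haskell
import Control.Applicative (Alternative (..))

-- A toy parser: consume a prefix of the input or fail.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure x = Parser $ \s -> Just (x, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

instance Alternative Parser where
  empty = Parser (const Nothing)
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s
    ok      -> ok

  -- Override the class default with a direct accumulating loop.
  -- Like the default, it loops forever if p succeeds without consuming.
  {-# INLINE many #-}
  many (Parser p) = Parser (go id)
    where
      go acc s = case p s of
        Nothing      -> Just (acc [], s)
        Just (a, s') -> go (acc . (a :)) s'

  -- 'some' is one occurrence followed by the overridden 'many'.
  {-# INLINE some #-}
  some p = (:) <$> p <*> many p

-- Accept one character satisfying a predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c : cs) | ok c -> Just (c, cs)
  _               -> Nothing
```

With this instance, generic code written against Alternative picks up the specialized loop automatically, which is the property that makes the combinators-over-Alternative proposal compatible with parser-specific optimizations.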

On Nov 6, 2010, at 12:24 PM, Martijn van Steenbergen wrote:
I would like to see some changes before it becomes a blessed package. I'd love to hear your thoughts on the following ideas:
* Get rid of the user state type parameter u. If you want state, set m = StateT s. * Text.Parsec.Prim currently exports its own version of <|> specialized to the ParsecT type constructor. Is there a good reason for this? It clashes when I also import Control.Applicative in my own modules. * Most of the combinators in Text.Parsec.Combinator have types specialized to ParsecT (with a Stream class constraint as consequence) while they could be defined in terms of Applicative only. I think these should be rewritten in terms of Applicative (or Monad if absolutely necessary) whenever possible.
These are useful suggestions for Parsec. But the proposal is not to add Parsec to the platform, but just to upgrade to v. 3. The concerns apply equally well to v. 2, which is already in the platform. I'm all for upgrading to Parsec 3, which is more general and reportedly has comparable performance now, and pursuing further work on Parsec through working with the Parsec maintainer (i.e. outside of the libraries process.) Cheers, Sterl.

On 6 November 2010 15:18, Don Stewart
Hey all,
This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies.
One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis.
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?
Yes. I think that if a package has a significant discontinuity then it has to be reconsidered at least to some degree. In the case of parsec 2 and 3, initially parsec 3 was an experimental new version, by different authors. It was not initially clear if it would be an obvious replacement, if it was functionally correct and if the performance or documentation was up to scratch compared to version 2. Personally I would be satisfied if the current maintainer(s) would state that they believe the current parsec 3 release is up to standard and that they believe it should become the new version in the platform. Duncan

+1
It would let us finally start healing the rift in the libraries and with
Parsec 3.1 the performance difference is no longer a divisive concern.
-Edward
On Sat, Nov 6, 2010 at 11:18 AM, Don Stewart
Hey all,
This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies.
One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis.
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?
-- Don

As a further issue, could the examples that were included with Daan Leijen's original distribution be added to the package? I suspect that some people's dislike of Parsec's try is exacerbated by them not using the Token and Language modules to usefully handle white space and tokenizing. Daan's examples are very handy to copy good practice from.

On 6 November 2010 15:18, Don Stewart <dons at galois.com> wrote:
Hey all,
This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies.
One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis.
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?
Yes. I think that if a package has a significant discontinuity then it has to be reconsidered at least to some degree.
In the case of parsec 2 and 3, initially parsec 3 was an experimental new version, by different authors. It was not initially clear if it would be an obvious replacement, if it was functionally correct and if the performance or documentation was up to scratch compared to version 2.
Personally I would be satisfied if the current maintainer(s) would state that they believe the current parsec 3 release is up to standard and that they believe it should become the new version in the platform.
Duncan
I would say Parsec 3 is not less up to standard than Parsec 2.
The performance is comparable to Parsec 2, and there are still some things that can be done, as were mentioned in this thread alone. If nothing else, the main example for Parsec 3 being slow was rerun with Parsec 3.1 and was comparable to Parsec 2.
As far as correctness, the bug reports I've received have mostly been relatively minor and usually have applied to Parsec 2 as well. However, a good unit/regression test suite is definitely something that Parsec (any version) could use. There is, of course, no shortage of system tests of Parsec that would be quite conspicuous if they failed. I have not gotten a single report along the lines of "I've upgraded to Parsec 3 and now my parser doesn't work."
Parsec 2 has no documentation at all. The Parsec letter is for Parsec 1, but is still quite relevant to all versions of Parsec. The Haddock documentation was added in Parsec 3. Suffice it to say, Parsec 3 is far better than Parsec 2 in this regard. The new features of Parsec have (non-trivial) Haddock documentation, but a documented, significant example using the new features would be a good addition.
As far as Don's concerns with regard to stability: As suggested above, most of the emails I've received have been minor bugs that are often in Parsec 2 as well, or feature requests, and there haven't been that many of either. I have no big changes in mind. The next big change I foresee is dropping the compatibility layer at some point. Other than that, I can see the Stream class changing significantly, but I believe that can be done in a way that is mostly transparent to users. That is just speculation at this point.
While I agree that currently the monad transformer aspect of Parsec 3 is not a common requirement, the ability to parse ByteStrings and other types (like Text) is certainly something that would be beneficial to many. Parsec 3 runs Parsec 2 code virtually unchanged.
The only particularly good argument I've seen against having Parsec 3 (v. Parsec 2) in the Platform is the extensions issue. I'm less involved in the Haskell community than I was years ago and I expect that trend to continue a little bit more, and lately I haven't been the best maintainer. I wouldn't want the me right now to be the maintainer for a package in the Haskell Platform, so I've asked Antoine Latter to take up maintainership of Parsec 3 and he has agreed. He's certainly familiar with the code and has been involved in Parsec 3 almost from the beginning. I don't see any trouble during or after the transition. I'll push a minor version upgrade with some pending patches, and then after that it will be Antoine's show. He also doesn't have any big changes planned.

On 12.11.2010 01:32, Derek Elkins wrote:
I would say Parsec 3 is not less up to standard than Parsec 2.
I think the question should be: should we bless a (not yet existing) Parsec3 library (without the compatibility layer) in addition to the current parsec 2 library?
The performance is comparable to Parsec 2, and there are still some things that can be done, as were mentioned in this thread alone. If nothing else, the main example for Parsec 3 being slow, was rerun with Parsec 3.1 and was comparable to Parsec 2.
As far as correctness, the bug reports I've received have mostly been relatively minor and usually have applied to Parsec 2 as well. However, a good unit/regression test suite is definitely something that Parsec (any version) could use. There is, of course, no shortage of systems tests of Parsec that would be quite conspicuous if they failed. I have not gotten a single report along the lines of "I've upgraded to Parsec 3 and now my parser doesn't work."
The compatibility is fine, but mixing both interfaces (Text.Parsec.* and Text.ParserCombinators.Parsec.*) should not be encouraged (and if I don't mix, I do not need a compatibility layer).
Parsec 2 has no documentation at all. The Parsec letter is for Parsec 1, but is still quite relevant to all versions of Parsec. The Haddock documentation was added in Parsec 3. Suffice it to say, Parsec 3 is far better than Parsec 2 in this regard. The new features of Parsec have (non-trivial) Haddock documentation, but a documented, significant example using the new features would be a good addition.
The new parsec2 package seems to have the documentation taken from parsec 3.
As far as Don's concerns with regard to stability: As suggested above, most of the emails I've received have been minor bugs that are often in Parsec 2 as well, or feature requests, and there haven't been that many of either. I have no big changes in mind. The next big change I foresee is dropping the compatibility layer at some point. Other than
How should we drop the compatibility layer once it is part of the HP without breaking much code? (Omit the compatibility layer in the first place and ask my initial question!)
that, I can see the Stream class changing significantly, but I believe that can be done in a way that is mostly transparent to users. That is just speculation at this point.
While I agree that currently the monad transformer aspect of Parsec 3
I wonder if Parsec3 is well designed wrt user state, since a user state could be part of the underlying monad. (Could/would the user state be dropped if compatibility with parsec 2 were not an issue?)
is not a common requirement, the ability to parse ByteStrings and other types (like Text) is certainly something that would be beneficial to many. Parsec 3 runs Parsec 2 code virtually unchanged.
Still, I had a hard time changing my Parsec 2 code to use ByteStrings instead of Strings via Parsec 3.
The only particularly good argument I've seen against having Parsec 3 (v. Parsec 2) in the Platform is the extensions issue.
Unfortunately, most parsers use extensions (even Text.ParserCombinators.ReadP), which makes it difficult for other Haskell compilers to profit from the Haskell code base. In fact, I wish that the parsec2 modules requiring extensions were in an extra package. But it seems to be difficult to avoid extensions (as Parsimony also shows) just to support more general streams.
I'm less involved in the Haskell community than I was years ago and I expect that trend to continue a little bit more, and lately I haven't been the best maintainer. I wouldn't want the me right now to be the maintainer for a package in the Haskell Platform, so I've asked Antoine Latter to take up maintainership of Parsec 3 and he has agreed. He's certainly familiar with the code and has been involved in Parsec 3 almost from the beginning. I don't see any trouble during or after the transition. I'll push a minor version upgrade with some pending patches, and then after that it will be Antoine's show. He also doesn't have any big changes planned.
I'm strongly against putting parsec 3 as is into the HP. Creating it as a new major version of parsec caused enough trouble; moving it into the HP is no good remedy (in my eyes).
Christian

On 10-11-11 07:32 PM, Derek Elkins wrote:
On 6 November 2010 15:18, Don Stewart<dons at galois.com> wrote:
Hey all,
This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies.
One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis.
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?
Yes. I think that if a package has a significant discontinuity then it has to be reconsidered at least to some degree.
In the case of parsec 2 and 3, initially parsec 3 was an experimental new version, by different authors. It was not initially clear if it would be an obvious replacement, if it was functionally correct and if the performance or documentation was up to scratch compared to version 2.
Personally I would be satisfied if the current maintainer(s) would state that they believe the current parsec 3 release is up to standard and that they believe it should become the new version in the platform.
Duncan
I would say Parsec 3 is not less up to standard than Parsec 2.
I support the Parsec upgrade proposal. All my code that depends on Parsec has long ago been upgraded to Parsec 3, and I had no problems to report.

I still favor parsec 2 over parsec 3 because
a) parsec 3 is no longer haskell98 (as major parts of parsec 2 are)
b) I don't like the compatibility layer (modules with re-exports) of parsec 3 for parsec 2
Without the compatibility layer (b), and without the package having been made a new major version of parsec, we would probably not be discussing this issue. I think the maintainers of "parsec 3" should create a new package "parsec3" without the compatibility layer. A new package parsec2 was already created. There are simply no blessed parser packages! The problem is that so many packages simply have "parsec" as a dependency; otherwise I would vote for removing parsec from the HP (or vote for parsec2).
Christian
On 06.11.2010 16:18, Don Stewart wrote:
Hey all,
This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies.
One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis.
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
Does anyone have concerns with this?
-- Don

On 11/08/10 07:51, Christian Maeder wrote:
I still favor parsec 2 over parsec 3 because
a) parsec 3 is no longer haskell98 (as major parts of parsec 2 are)
I can't find my notes (might've disappeared in a system crash), but I went through all the HP packages, and I think each of the extensions that parsec 3 uses was used in some place in some other Platform package(s) too (even after excluding obviously nonportable things like template-haskell and the GHC-specific part of 'array'). And not all of parsec 2 is haskell98 (its cabal file mentions ExistentialQuantification and PolymorphicComponents). In addition to parsec 2's extensions, parsec 3 uses MultiParamTypeClasses, FlexibleInstances, FlexibleContexts, DeriveDataTypeable. I guess the first three of those are pretty tame (?) (or at least they seem to be used in miscellaneous places..). And DeriveDataTypeable is used by extensible-exceptions, which is a rather central library (whether you get it from 'base' or 'extensible-exceptions', either way).
I think the maintainers of "parsec 3" should create new package "parsec3" without the compatibility layer. A new package parsec2 was already created.
That's an idea/vision that I think hasn't been brought up before (I might have forgotten), in all our discussions of Parsec: to make new names for both parsec2 and parsec3, and drop the compatibility layer in parsec3. (AFAIK there's rarely a reason to share parser data-types widely between modules / in interfaces, which in other cases makes a major benefit for having compatibility-layers and not just two separate library versions.) -Isaac
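For readers unfamiliar with the extensions named in this message, the following sketch shows the kind of declaration each one permits. The class and functions are simplified stand-ins written for this example (the names Stream, uncons, firstChar and Pos are assumptions) and should not be read as Parsec's actual definitions.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts, DeriveDataTypeable #-}

import Data.Data (Data, Typeable)

-- MultiParamTypeClasses: a class relating a stream type s, a monad m
-- and a token type t.
class Monad m => Stream s m t where
  uncons :: s -> m (Maybe (t, s))

-- FlexibleInstances: the instance head mentions a bare type variable m
-- and repeats the variable tok, which Haskell 98 does not allow.
instance Monad m => Stream [tok] m tok where
  uncons []       = return Nothing
  uncons (t : ts) = return (Just (t, ts))

-- FlexibleContexts: the constraint mentions the concrete type Char.
firstChar :: Stream s m Char => s -> m (Maybe Char)
firstChar s = fmap (fmap fst) (uncons s)

-- DeriveDataTypeable: derive Data and Typeable for a plain data type.
data Pos = Pos String Int Int
  deriving (Eq, Show, Data, Typeable)
```

For example, firstChar "abc" :: Maybe (Maybe Char) resolves the list instance with m = Maybe and yields the first character wrapped in both layers. The type annotation is needed because nothing in this simplified class ties the monad or token type to the stream type.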

On Sat, 6 Nov 2010 08:18:59 -0700, Don Stewart
Hey all,
This is a loose end in the package policy situation: when the HP has a major upgrade, the policy is to do all major upgrades for any packages contained in the HP, as long as they don't add new dependencies.
One exception to this rule has been parsec, where parsec 2 was considered "blessed" on an ad hoc basis.
I propose we agree to remove this ad hoc rule, and thus the HP will ship with parsec 3.
I support the integration of parsec3 into the platform. -- Nicolas Pouillard http://nicolaspouillard.fr
participants (18): Antoine Latter, Bryan O'Sullivan, Christian Maeder, Derek Elkins, Don Stewart, Duncan Coutts, Edward Kmett, Iavor Diatchki, Isaac Dupree, Kazu Yamamoto, Mario Blažević, Martijn van Steenbergen, Nicolas Pouillard, Peter Simons, Philippa Cowderoy, Simon Peyton-Jones, Stephen Tetley, Sterling Clover