[Haskell-cafe] Fwd: Bug in Parsec.Token

2 Aug 2010

This is a forward of a message from March 4th.

---------- Forwarded message ----------
From: Derek Elkins 
Date: Thu, Mar 4, 2010 at 9:43 PM
Subject: Re: Bug in Parsec.Token
To: Don Stewart 
Cc: Greg Fitzgerald, Antoine Latter, "Sittampalam, Ganesh", Ian Lynagh,
    libraries@haskell.org

I'm not subscribed to libraries, so this won't go there.

One of the first benchmarks against Parsec 3.0.0 was John MacFarlane's
here: http://www.haskell.org/pipermail/haskell-cafe/2008-March/040258.html

In it, he found Parsec 3.0.0 about 2x slower for his benchmark.  I
can't recreate his benchmark, but I suspect it is a variant of one he
describes here: http://code.google.com/p/pandoc/wiki/Benchmarks

I decided to do a similar benchmark.  I used Parsec 2.1.0.1, Parsec
3.0.1, and Parsec 3.1.0.  Of particular note, building all three
required -only- changing which library pandoc depended on.  No change
to the source was necessary.  All tests in pandoc's test suite passed
for all versions.

I ran that benchmark with a different input file: this file
[http://wpcal.firetree.net/wp-content/plugins/PHP%20Markdown%201.0.1k/PHP%20M...]
concatenated to itself 32 times to produce a 730KB markdown file.  The
times below are for the last three of four runs.

Parsec 2.1.0.1
derek@derek-laptop:~/temp/pandoc-1.3/dist/build/pandoc$ time ./pandoc-2.1.0.1 --strict t.text > /dev/null
real    0m9.863s
user    0m7.792s
sys     0m0.160s

real    0m9.756s
user    0m7.792s
sys     0m0.132s

real    0m10.123s
user    0m7.976s
sys     0m0.168s

Parsec 3.0.1
derek@derek-laptop:~/temp/pandoc-1.3/dist/build/pandoc$ time ./pandoc-3.0.1 --strict t.text > /dev/null
real    0m22.008s
user    0m17.445s
sys     0m0.324s

real    0m21.789s
user    0m17.433s
sys     0m0.160s

real    0m21.754s
user    0m17.677s
sys     0m0.168s

Parsec 3.1.0
derek@derek-laptop:~/temp/pandoc-1.3/dist/build/pandoc$ time ./pandoc-3.1.0 --strict t.text > /dev/null
real    0m10.708s
user    0m8.201s
sys     0m0.168s

real    0m11.078s
user    0m8.401s
sys     0m0.232s

real    0m10.797s
user    0m8.513s
sys     0m0.224s

These results reproduce the roughly 2x slowdown that John originally
reported between Parsec 2.1.0.1 and Parsec 3.0.  They also show that
Parsec 3.1.0 is significantly faster than 3.0.1, but still about 10%
slower than Parsec 2.1.0.1 in wall-clock time.
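
For anyone who wants to check the ratios, here is a rough sketch that
averages the "real" times listed above and divides by the Parsec
2.1.0.1 baseline.  The helper names (mean, ratio) and the choice of
wall-clock rather than user time are my assumptions for illustration,
not part of the original measurements.

```shell
#!/bin/sh
# Mean of the wall-clock ("real") times, in seconds, passed as arguments.
mean() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.3f\n", sum / NR }'
}

# Slowdown of one build relative to a baseline (both in seconds).
ratio() {
    awk -v t="$1" -v base="$2" 'BEGIN { printf "%.2f\n", t / base }'
}

# "real" times copied from the three runs reported for each build.
p2=$(mean 9.863 9.756 10.123)      # Parsec 2.1.0.1
p301=$(mean 22.008 21.789 21.754)  # Parsec 3.0.1
p310=$(mean 10.708 11.078 10.797)  # Parsec 3.1.0

echo "Parsec 3.0.1 / 2.1.0.1: $(ratio "$p301" "$p2")x"
echo "Parsec 3.1.0 / 2.1.0.1: $(ratio "$p310" "$p2")x"
```

With the numbers above this prints 2.20x for Parsec 3.0.1 and 1.10x
for Parsec 3.1.0.  Three runs per build is a small sample, so the
second decimal place should not be taken too seriously.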

On Thu, Mar 4, 2010 at 4:39 PM, Don Stewart wrote:
> ...
> derek.a.elkins:
>> ...
>>> Who is going to maintain "Parsec 4"?
>>
>> I'm completely against this.  If people absolutely must have exactly
>> Parsec 2's implementation we can simply copy it into Parsec 3, and the
>> "compatibility" layer, in that case, will simply -be- Parsec 2.  I've
>> considered this as a temporary solution for the performance issues
>> just so people could move to Parsec 3 dependencies, but that should
>> not be necessary now, and even then I considered it a much less than
>> ideal solution.
>>
>> If the community wants to freeze on Parsec 2, then I have no problem
>> renaming the package, otherwise I think it is both unnecessary and a
>> waste of effort.
>
> The problem is the ongoing lack of confidence in Parsec 3's performance.
> The new release goes some way to addressing this, but I think this has
> gone unaddressed for too long.
>
> Can someone address the lingering concern with benchmarks against parsec 2?
>
> -- Don