New subject: Haskell-Cafe Digest, Vol 180, Issue 32

31 Aug 2018


I got 4.7s for a similar amount of data in 2013.
However, I was pretty sure that a fully inlined implementation could
potentially go 5x faster.
http://hackage.haskell.org/package/hPDB

Please check the xeno XML parser benchmarks for another example.
https://hackage.haskell.org/package/xeno
On Fri, 31 Aug 2018 at 14:41,  wrote:
...
Send Haskell-Cafe mailing list submissions to
        haskell-cafe@haskell.org
To subscribe or unsubscribe via the World Wide Web, visit
        http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
or, via email, send a message with subject or body 'help' to
        haskell-cafe-request@haskell.org
You can reach the person managing the list at
        haskell-cafe-owner@haskell.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Haskell-Cafe digest..."
Today's Topics:
   1. Re: HDBC packages looking for maintainer (Tobias Dammers)
   2. Re: Alternative instance for non-backtracking parsers
      (Olaf Klinke)
   3. Re: Alternative instance for non-backtracking parsers
      (Bardur Arantsson)
---------- Forwarded message ----------
From: Tobias Dammers 
To: haskell-cafe@haskell.org
Date: Thu, 30 Aug 2018 15:24:04 +0200
Subject: Re: [Haskell-cafe] HDBC packages looking for maintainer
Hi,
I'd be interested. I've used HDBC on a few projects, and my yeshql
library was originally built with HDBC as the only backend. It would be
a terrible shame to see this bitrot.
Cheers,
Tobias (tdammers on github etc.)
On Mon, Aug 13, 2018 at 12:07:38PM +0200, Erik Hesselink wrote:
...
Hi all,
I've been the maintainer for some of the HDBC packages for a while now.
Sadly, I've mostly neglected them due to lack of time and usage. While the
packages mostly work, there are occasional pull requests and updates for
new compiler versions.
Because of this I'm looking for someone who wants to take over HDBC and
related packages [1]. If you use HDBC and would like to take over
maintainership, please let me know and we can get things set up.
Regards,
Erik
[1] https://github.com/hdbc
...
--
Tobias Dammers - tdammers@gmail.com
---------- Forwarded message ----------
From: Olaf Klinke 
To: PY 
Cc: haskell-cafe 
Date: Thu, 30 Aug 2018 20:21:07 +0200
Subject: Re: [Haskell-cafe] Alternative instance for non-backtracking
parsers
...
Hello, Olaf. I have some distrust of elegant solutions (one of them are
C.P. libs).
I have a program that parses several CSV files, one of them 50MB in size,
and writes its result as HTML. When I started optimizing, the execution
time was 144 seconds. Profiling (thanks to Jasper Van der Jeugt for writing
profiteur!) revealed that most of the time was spent parsing and
postprocessing the 50MB CSV file. Changing the data structure of the
postprocessing stage cut down the execution time to 32 seconds, but still
the majority is spent on parsing.
Then I realized that (StateT String Maybe) is a parser which conveniently
has all the class instances one needs; most notably, its Alternative
instance makes it a backtracking parser. After defining a few combinators I
was able to swap out my megaparsec parser for the new parser, which
cut execution time in half. Now most of the parsing time is dedicated
to transforming text to numbers and dates. I doubt that parsing time can be
reduced much further [*]. The new parser is identical to the old parser;
only the combinators now come from another module. That is the elegant
thing about monadic parser libraries.
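
For readers who want to see the idea concretely, here is a minimal sketch of
a (StateT String Maybe) parser. The actual combinators were not posted, so
the names below (runParser, satisfy, char, eof, int, sepBy, intOrText) are
illustrative assumptions. StateT is taken from the transformers package; its
Functor, Applicative, Monad and Alternative instances do all the work.

```haskell
import Control.Applicative (Alternative (..))
import Control.Monad.Trans.State.Strict (StateT (..))
import Data.Char (isDigit)

-- A parser is a state transformer over the remaining input that may fail.
type Parser = StateT String Maybe

-- Run a parser, returning the result together with the unconsumed input.
runParser :: Parser a -> String -> Maybe (a, String)
runParser = runStateT

-- Consume one character if it satisfies the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = StateT $ \s -> case s of
  (c : cs) | p c -> Just (c, cs)
  _              -> Nothing

-- Consume exactly the given character.
char :: Char -> Parser Char
char c = satisfy (== c)

-- Succeed only when no input is left.
eof :: Parser ()
eof = StateT $ \s -> if null s then Just ((), s) else Nothing

-- One or more digits, read as a non-negative Int.
-- 'some' comes for free from the Alternative instance.
int :: Parser Int
int = read <$> some (satisfy isDigit)

-- Zero or more occurrences of p, separated by sep.
sepBy :: Parser a -> Parser b -> Parser [a]
sepBy p sep = ((:) <$> p <*> many (sep *> p)) <|> pure []

-- Backtracking in action: if the left branch fails after consuming input,
-- the right branch starts again from the original input, because a failed
-- branch is just Nothing and carries no state with it.
intOrText :: Parser (Either Int String)
intOrText = (Left <$> int <* eof) <|> (Right <$> many (satisfy (const True)))
```

Loaded into GHCi, runParser (int `sepBy` char ',') "1,2,3" evaluates to
Just ([1,2,3],""), and runParser intOrText "12ab" falls back to the right
branch and yields Just (Right "12ab",""). The price of this simplicity is
that a failure is a bare Nothing with no position or message.
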
I will now use the fast parser by default, and if it returns a Nothing,
the program will suggest a command line flag that switches to the original
megaparsec parser, telling the user exactly where the parse failed and why.
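
The fast-path/diagnostic-path arrangement described above might be sketched
as follows. Everything here is hypothetical: the Mode type, the --diagnose
flag name, and the two toy parsers, which merely stand in for the
StateT-based parser (a Maybe result) and the megaparsec parser (an error
message on the Left).

```haskell
import Data.Char (isDigit)

-- Hypothetical: which parser the user selected on the command line.
data Mode = Fast | Diagnostic
  deriving (Eq, Show)

-- In Fast mode, run the fast parser and, on failure, only hint at the
-- flag that enables the slower parser. In Diagnostic mode, run the slow
-- parser and pass its error message through unchanged.
parseWith
  :: Mode
  -> (String -> Maybe a)          -- fast parser, no error information
  -> (String -> Either String a)  -- slow parser with error messages
  -> String
  -> Either String a
parseWith Fast fast _ input = maybe (Left hint) Right (fast input)
  where
    hint = "parse failed; re-run with --diagnose for details"
parseWith Diagnostic _ diagnostic input = diagnostic input

-- Toy stand-in for the fast parser: accepts only a run of digits.
fastInt :: String -> Maybe Int
fastInt s
  | not (null s) && all isDigit s = Just (read s)
  | otherwise                     = Nothing

-- Toy stand-in for the diagnostic parser: same language, but it reports
-- the 1-based column of the first offending position.
slowInt :: String -> Either String Int
slowInt s = case span isDigit s of
  (_ : _, "") -> Right (read s)
  (ds, _)     -> Left ("unexpected input at column " ++ show (length ds + 1))
```

Because both parsers accept the same inputs, the slow one only ever runs
when the user asks for an explanation, so the common case pays nothing for
error reporting.
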
I am not sure whether there is another family of parsers that have
interfaces so similar that switching from one package to another is as
effortless as it is with monadic parsers.
Cheers
Olaf
[*] To the parser experts on this list: How much time should a parser take
that processes a 50MB, 130000-line text file, extracting 5 values (String,
UTCTime, Int, Double) from each line?
---------- Forwarded message ----------
From: Bardur Arantsson 
To: haskell-cafe@haskell.org
Date: Thu, 30 Aug 2018 21:43:55 +0200
Subject: Re: [Haskell-cafe] Alternative instance for non-backtracking
parsers
On 30/08/2018 20.21, Olaf Klinke wrote:
...
...
Hello, Olaf. I have some distrust of elegant solutions (one of them are
C.P. libs).
[*] To the parser experts on this list: How much time should a parser
take that processes a 50MB, 130000-line text file, extracting 5 values
(String, UTCTime, Int, Double) from each line?
Not an expert, but for something as (relatively!) standard as CSV, I'd
probably go for a specialized solution like 'cassava', which seems like
it does quite well according to https://github.com/haskell-perf/csv
Based purely on the lines/second numbers on that page and the number you've
given, I'd guesstimate that your parsing could potentially be as fast as
(3.185ms / 1000 lines) * 130000 lines = 414.05ms = 0.4 s.
(Of course that still doesn't account for extracting the Int, Double,
etc., but there are also specialized solutions for that which should be
pretty hard to beat, see e.g. bytestring-lexing.)
It's also probably a bit less elegant than a generic parsec-like thing,
but that's to be expected for a more special-case solution.
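
The back-of-the-envelope estimate above can be reproduced with a one-line
helper. The function name estimateMs is made up for illustration; the inputs
are the 3.185 ms per 1000 lines figure and the 130000-line file size quoted
in this thread.

```haskell
-- Scale a benchmark figure, given in milliseconds per 1000 lines,
-- linearly to a file with the given number of lines.
estimateMs :: Double -> Int -> Double
estimateMs msPerThousandLines numLines =
  msPerThousandLines / 1000 * fromIntegral numLines
```

With those inputs, estimateMs 3.185 130000 comes to roughly 414 ms. This
assumes parse time scales linearly with line count and that the benchmark's
rows resemble the real data, neither of which is guaranteed.
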
Regards,

Re: [Haskell-cafe] Haskell-Cafe Digest, Vol 180, Issue 32

Michal J Gajda

Ben Franksen
