
6 Feb 2009, 8:40 p.m.
allbery:
On 2009 Feb 5, at 10:26, Eugene Kirpichov wrote:
My benchmark (parsing a huge logfile with a regex like "GET /foo.xml.*fooid=([0-9]++).*barid=([0-9]++)") shows that plain PCRE is the fastest one. I tried PCRE, PCRE-light and TDFA; DFA can't do capturing groups at all, and TDFA was abysmally slow (about 20x slower than PCRE) and doesn't support ++. Have I perhaps missed some blazing-fast package?
I think dons (copied) will want to hear about this; pcre-light is supposed to be a fast lightweight wrapper for the PCRE library, and if it's slower than PCRE then something is likely to be wrong somewhere.
Shouldn't be slower (assuming you're using bytestrings). -- Don
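
For anyone wanting to reproduce the comparison, here is a minimal sketch of the kind of match discussed above, using pcre-light on strict bytestrings as Don suggests. This is not Eugene's benchmark code: the names logRegex and extractIds are made up for illustration, and the pattern is simply the one quoted in the thread.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as B
import Text.Regex.PCRE.Light (Regex, compile, match)

-- Pattern quoted in the thread; "++" is PCRE's possessive quantifier.
-- Compiled once at top level so it is shared across all lines.
logRegex :: Regex
logRegex = compile "GET /foo.xml.*fooid=([0-9]++).*barid=([0-9]++)" []

-- Hypothetical helper: return the two captured ids if the line matches.
-- pcre-light returns the whole match first, followed by the groups.
extractIds :: B.ByteString -> Maybe (B.ByteString, B.ByteString)
extractIds line =
  case match logRegex line [] of
    Just [_, fooId, barId] -> Just (fooId, barId)
    _                      -> Nothing

-- Read a log from stdin and print the ids from every matching line.
main :: IO ()
main = do
  contents <- B.getContents
  mapM_ print [ ids | Just ids <- map extractIds (B.lines contents) ]
```

Note that B.getContents reads the whole input strictly, so for a truly huge logfile a lazy bytestring or a line-by-line loop would probably be a better fit. Converting to or from String on each line is the sort of thing that could make a wrapper look slower than the underlying library.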