Re: [Haskell-cafe] regex-applicative library needs your help! (algorithmic challenge)

14 Sep 2011


      Hi,
I don't see how fallback to NFA simulation is really a failure wrt DoS
attacks. It's not exponential in time or memory, just linear memory
(in size of regex) instead of constant, and slower than DFA.

On Wed, Sep 14, 2011 at 1:29 AM, Roman Cheplyaka  wrote:
...
* Chris Kuklewicz  [2011-09-13 08:21:55+0100]
...
I wrote regex-tdfa, the efficient (though not yet lightning fast) Posix-like engine.
You are not the first to want an efficient Perl-like engine.  The answer you
seek flows from Russ Cox (though in C++):
...
http://google-opensource.blogspot.com/2010/03/re2-principled-approach-to-reg...
...
http://code.google.com/p/re2/
Quoting relevant bit:
...
It also finds the leftmost-first match, the same match that Perl would, and
can return submatch information. The one significant exception is that RE2
drops support for backreferences¹ and generalized zero-width assertions,
because they cannot be implemented efficiently.
Hi Chris, thanks for the response.
There's one thing about Cox's work that I don't understand. On the
page [1] he mentions that the algorithm falls back to direct NFA
simulation (search for "If all else fails" on that page).
This defeats his goal of defending against DoS attacks (the second
paragraph of that page).
And I wouldn't be comfortable with an algorithm that is worst-case
exponential either.
Then there's another issue: I specifically want a combinator library,
and not every automaton-based algorithm can serve this purpose in a
statically typed language (unless there's some clever trick I don't
know).
[1]: http://swtch.com/~rsc/regexp/regexp3.html
--
Roman I. Cheplyaka :: http://ro-che.info/
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
-- 
Eugene Kirpichov
Principal Engineer, Mirantis Inc. http://www.mirantis.com/
Editor, http://fprog.ru/