Re: Incremental Regex Parsing with Fixed Strings

1 Mar 2019

      Thank you so much for pointing me to this paper. What an absolute gem it
is! Efficiently evaluating regex with a recursive ADT instead of a graph
puts me one step closer to a solution. In my problem domain, regex
frequently are mostly sequences at the top level. That is, these are common:

    The number [0-9]* is a good number
    I have [0-9]* pupp(ies|y).

This are not:

    (Hello|from|the|other|side)*

Using the paper's terminology, I think I should be able to have a
`FingerTree X REG` (for some X I'm not sure of), where every top-level
sequence is represented as adjacent elements in the finger tree. I'm not
sure if this will actually work, but it seems plausible that it will.

On Fri, Mar 1, 2019 at 3:55 AM Sergey Vinokurov  wrote:
...
Modifying regular expressions reminds me of the great functional pearl
called 'A Play on Regular Expressions'. Perhaps this is the approach you
were thinking about?
Sergey
On 28/02/2019 18:43, Andrew Martin wrote:
...
Greetings,
In Dan Piponi's blog post Fast Incremental Regular Expression Matching
with Monoids [1], he outlines a way to take advantage of fingertrees to
perform increment regex matching. In his article, the regular expression
(more accurately, its equivalent DFA) is fixed, and the strings we are
matching on are modified incrementally. What I'm interested in is the
opposite: What if the regular expression is modified incrementally and
the string we are testing is static?
Thinking about this, it seems to run into problems immediately.
Typically, to evaluate a regular expression, I would convert it to a NFA
(and then maybe to a DFA). But there is no way to do incremental graph
modification on the resulting graphs. (Or is there?). And even if there
were a way, it seems unclear how one would go about exploiting its
incremental nature. To provide an example, consider the following regex
and sample string:
Regex: Lost (dog|cat) loo for home
Sample: Lost dog looking for home
In the regex, I'm currently in the middle of typing "looking", so it
doesn't match the sample string. But large parts of it do. So when I
update the regex by typing "k", I'd like to be able to not redo this
work. Maybe there's a certain subset of regex that's amenable to this
kind of incremental evaluation. Maybe not. If anyone has any additional
insights or can point me to any research, I would appreciate it greatly.
Thanks.
[1]
http://blog.sigfpe.com/2009/01/fast-incremental-regular-expression.html
--
-Andrew Thaddeus Martin
_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
-- 
-Andrew Thaddeus Martin