Re: [Haskell-cafe] regex-applicative library needs your help! (algorithmic challenge)

18 Sep 2011

      * Anton Tayanovskyy  [2011-09-17 21:46:57-0400]
...
So you want to encode priorities efficiently as far as I understand
from [1]? Could bit-packing combined with prefix elimination do the
trick? Choice boils down to binary choice. Attach a number N to every
execution thread that sits in a given NFA state. When the thread moves
through a binary choice state, it splits into two threads with
N_{left} = 2 * N + 1 and N_{right} = 2 * N. When two threads arrive at
the same NFA state, the machine picks one with greater N and discards
the other, thus giving priority to early over late and left over
right. This makes these numbers grow quickly, but at any point in the
simulation one can identify the common binary prefix of all the thread
numbers and remove it, because it will not be relevant for comparison.
If this is too costly, amortize and do it every K steps instead of on
every step.
Anton
[1] https://github.com/feuerbach/regex-applicative/wiki/Call-For-Help
Just common prefix elimination is not sufficient to put a hard bound on
the memory usage. For example, something like /(a*)|((aa)*)/ would still
require an amount of space linear in the string length (if the string
consists of many a's).

More clever techniques that I tried ("monotonically map lists to
smaller lists") are hard to make correct.

I like the approach of Russ Cox[2]. One of the great ideas there (which I
think he didn't emphasize enough) is to have a queue which allows O(1)
testing whether an element is already there [3]. This solves the problem
with priorities -- the states are guaranteed to be enqueued in the order
of their priorities, and there are no repetitions.

To be able to process the states in the proper order I'll have to give
up Glushkov automaton and probably use something similar to your
construction [4].

[2] http://swtch.com/~rsc/regexp/regexp2.html
[3] http://code.google.com/p/re2/source/browse/util/sparse_array.h
[4] http://t0yv0.blogspot.com/2011/07/combinatory-regular-expressions-in.html

-- 
Roman I. Cheplyaka :: http://ro-che.info/

Re: [Haskell-cafe] regex-applicative library needs your help! (algorithmic challenge)

Roman Cheplyaka