
Hello everyone, Thank you all for your comments! Those are some very useful ideas. I think I'll try roger's (private) and ChrisK's suggestion first: using the match groups. I'm not sure if the match groups inside the individual regexes will cause much trouble, but we'll see. I imagine I'll have to count parentheses, except when it's followed by a \, except when that \ follows another \, etc. There's probably other situations where a () doesn't count as a group, perhaps when it's followed by a * or +. I'll look into that. If that doesn't work out I'll go for Neil's (from an algorithmic POV beautiful) suggestion. While I understand that some of you suggest I use parsec (or some other mature parser library) I'm pretty sure that's not what I want here. The patterns will almost always be very simple and regular expressions offer an extremely concise way of expressing when a hook should fire. Forcing the user to use full parsers would cause the programs to become much more verbose. Still, Yogurt is flexible enough to allow the user to use parsec if he or she so chooses. Thanks again, Martijn. Mitchell, Neil wrote:
Hi Martijn,
It's not that tricky if you do a regular expression state machine yourself, but that's probably a bit too much work. One way to get a speed up might be to take the regular expressions a,b,c,d and generate a regex a+b+c+d, and one a+b. You can then check any string s against a+b+c+d, if that matches check a+b, if that matches check a. At each stage you eliminate half the regular expressions, which means a match will take log n, where n is the number of regular expressions.
This assumes the underlying regular expression engine constructs a finite state machine, making it O(m) where m is the length of the string to match.
Thanks
Neil