
On Fri, 2007-08-17 at 14:49 +0100, ChrisK wrote:
Attached is the Boyer-Moore algorithm implemented for strict and lazy bytestrings (and combinations thereof). It finds all the overlapping instances of the pattern inside the target.
Fantastic.
I have performance tuned it.
Even better :-)
But the performance for searching a strict bytestring is better then for a lazy bytestring (even if they only had a single strict chunk), which almost certainly means I was not clever enough to get GHC to produce the optimal code.
Lazy bytestrings have one more indirection to get at the actual data. I'm planning to eliminate this extra indirection at some point so it'll be interesting to see if that help searching significantly.
There is much more description in the module's haddock header.
Hopefully Don or other ByteString experts/maintainers can tweak this even further.
Yes, I'll look at getting this into the bytestring package (though not for about a week as I'm away). Duncan