
Am Mittwoch, 21. Dezember 2005 08:18 schrieb Branimir Maksimovic:
From: Bulat Ziganshin
Reply-To: Bulat Ziganshin
To: "Branimir Maksimovic" CC: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] Substring replacements Date: Tue, 20 Dec 2005 23:55:22 +0300 Hello Branimir,
Tuesday, December 20, 2005, 9:48:48 PM, you wrote:
BM> I've finally performed test on amd64 and result is a same as on intel. BM> KMP always wins. So KMP is best suited for non indexed strings BM> and I guess should be used in library as prefered search/replace method. BM> This test favors straightforward search.
i'm 90% sure that straightforward method must be faster for one-time searches.
I'm far less sure of that. If you have a really short search-pattern, I think that probably straightforward search is indeed faster (at least, if the search-pattern isn't supplied at compile-time). But if you have a pattern of length 100, say, and lots of relatively long partial matches, chances are that KMP will move on something like 50 chars at once, which would have to be re-checked by straightforward. I've no idea, when that would outweigh the extra time of building the KMP-arrays, though. In my test -- extremely unrealistic, but provides a more or less worst case example -- straightforward must make roughly half a million character comparisons to pass each 'c', while KMP does 2000 (+/-2) (one way through the array and back), so there are situations where KMP definitely is faster. But ordinarily, on my computer, your version of straightforward is 10-15% faster than KMP (search-patterns are now supplied on the command line -- which has no big impact; searched string is " able sea..."; all compiled with -O2; NOINLINE makes no difference -- at least with optimisations on --; without optimisations, KMP suffers far worse than straightforward).
KMP is O(m) while straightforward is O(m*n).
Where m is the length of the input and n is the length of the searched-for pattern, I think? But these are worst-case complexities, I believe, ordinarily, straightforward will be O(m), too.
your test may give better results with KMP algorithm just
because you repeat the same search many times and it was automatically "run-time compiled" as described on the wiki page about KMP
My results seem to witness against that.
My test favors straightforward, in any other case KMP wins by order of magnitude.
Can you give example tests?
I think that straightfoirward is better then KMP with optimisations off is due more complex program.
try to add
{-# NOINLINE replace #-}
to both programs and repeat comparision
Greetings, Bane.
Cheers, Daniel