
I've finally performed test on amd64 and result is a same as on intel. KMP always wins. So KMP is best suited for non indexed strings and I guess should be used in library as prefered search/replace method. This test favors straightforward search. [bmaxa@devel64 myhaskell]$ time ./KMP Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m10.783s user 0m10.769s sys 0m0.016s [bmaxa@devel64 myhaskell]$ time ./straightforward Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m11.769s user 0m11.741s sys 0m0.028s [bmaxa@devel64 myhaskell]$ uname -a Linux devel64.office.kom 2.6.14-skas3-v8.2 #2 Fri Nov 11 21:19:36 CET 2005 x86_64 x86_64 x86_64 GNU/Linux Greetings, Bane. _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/

Hello Branimir, Tuesday, December 20, 2005, 9:48:48 PM, you wrote: BM> I've finally performed test on amd64 and result is a same as on intel. BM> KMP always wins. So KMP is best suited for non indexed strings BM> and I guess should be used in library as prefered search/replace method. BM> This test favors straightforward search. i'm 90% sure that straightforward method must be faster for one-time searches. your test may give better results with KMP algorithm just because you repeat the same search many times and it was automatically "run-time compiled" as described on the wiki page about KMP try to add {-# NOINLINE replace #-} to both programs and repeat comparision -- Best regards, Bulat mailto:bulatz@HotPOP.com

From: Bulat Ziganshin
Reply-To: Bulat Ziganshin To: "Branimir Maksimovic" CC: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] Substring replacements Date: Tue, 20 Dec 2005 23:55:22 +0300 Hello Branimir,
Tuesday, December 20, 2005, 9:48:48 PM, you wrote:
BM> I've finally performed test on amd64 and result is a same as on intel. BM> KMP always wins. So KMP is best suited for non indexed strings BM> and I guess should be used in library as prefered search/replace method. BM> This test favors straightforward search.
i'm 90% sure that straightforward method must be faster for one-time searches.
KMP is O(m) while straightforward is O(m*n). your test may give better results with KMP algorithm just
because you repeat the same search many times and it was automatically "run-time compiled" as described on the wiki page about KMP
My test favors straightforward, in any other case KMP wins by order of magnitude. I think that straightfoirward is better then KMP with optimisations off is due more complex program.
try to add
{-# NOINLINE replace #-}
to both programs and repeat comparision
These are tests: No optimisations (no -O): Intel hyperthreaded,windows $ time ./KMP Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m34.766s user 0m0.015s sys 0m0.000s bmaxa@MAXA ~/tutorial $ time ./straightforward Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m14.719s user 0m0.031s sys 0m0.000s AMD 64 bit: [bmaxa@devel64 myhaskell]$ time ./KMP Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 1m58.066s user 1m57.939s sys 0m0.128s [bmaxa@devel64 myhaskell]$ time ./straightforward Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m41.565s user 0m41.527s sys 0m0.040s with optimisations (-O): Intel hyperthreaded,windows $ time ./KMP Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m8.625s user 0m0.015s sys 0m0.015s bmaxa@MAXA ~/tutorial $ time ./straightforward Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m11.735s user 0m0.015s sys 0m0.000s AMD 64 bit, linux: [bmaxa@devel64 myhaskell]$ time ./KMP Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m10.546s user 0m10.529s sys 0m0.016s [bmaxa@devel64 myhaskell]$ time ./straightforward Working:seasearch replace able seaseasearch baker seasearch charlie True Done real 0m11.796s user 0m11.785s sys 0m0.012s Greetings, Bane. _________________________________________________________________ FREE pop-up blocking with the new MSN Toolbar - get it now! http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/

Am Mittwoch, 21. Dezember 2005 08:18 schrieb Branimir Maksimovic:
From: Bulat Ziganshin
Reply-To: Bulat Ziganshin
To: "Branimir Maksimovic" CC: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] Substring replacements Date: Tue, 20 Dec 2005 23:55:22 +0300 Hello Branimir,
Tuesday, December 20, 2005, 9:48:48 PM, you wrote:
BM> I've finally performed test on amd64 and result is a same as on intel. BM> KMP always wins. So KMP is best suited for non indexed strings BM> and I guess should be used in library as prefered search/replace method. BM> This test favors straightforward search.
i'm 90% sure that straightforward method must be faster for one-time searches.
I'm far less sure of that. If you have a really short search-pattern, I think that probably straightforward search is indeed faster (at least, if the search-pattern isn't supplied at compile-time). But if you have a pattern of length 100, say, and lots of relatively long partial matches, chances are that KMP will move on something like 50 chars at once, which would have to be re-checked by straightforward. I've no idea, when that would outweigh the extra time of building the KMP-arrays, though. In my test -- extremely unrealistic, but provides a more or less worst case example -- straightforward must make roughly half a million character comparisons to pass each 'c', while KMP does 2000 (+/-2) (one way through the array and back), so there are situations where KMP definitely is faster. But ordinarily, on my computer, your version of straightforward is 10-15% faster than KMP (search-patterns are now supplied on the command line -- which has no big impact; searched string is " able sea..."; all compiled with -O2; NOINLINE makes no difference -- at least with optimisations on --; without optimisations, KMP suffers far worse than straightforward).
KMP is O(m) while straightforward is O(m*n).
Where m is the length of the input and n is the length of the searched-for pattern, I think? But these are worst-case complexities, I believe, ordinarily, straightforward will be O(m), too.
your test may give better results with KMP algorithm just
because you repeat the same search many times and it was automatically "run-time compiled" as described on the wiki page about KMP
My results seem to witness against that.
My test favors straightforward, in any other case KMP wins by order of magnitude.
Can you give example tests?
I think that straightfoirward is better then KMP with optimisations off is due more complex program.
try to add
{-# NOINLINE replace #-}
to both programs and repeat comparision
Greetings, Bane.
Cheers, Daniel

Hello Daniel, Wednesday, December 21, 2005, 5:20:18 PM, you wrote:
BM> I've finally performed test on amd64 and result is a same as on intel. BM> KMP always wins. So KMP is best suited for non indexed strings BM> and I guess should be used in library as prefered search/replace method. BM> This test favors straightforward search.
i'm 90% sure that straightforward method must be faster for one-time searches.
DF> I'm far less sure of that. If you have a really short search-pattern, I think DF> that probably straightforward search is indeed faster sorry, i mean exactly that, answering proposition to add this to library as a PREFFERED method. imho, typical usage of these routines is for short enough patterns and searched strings -- Best regards, Bulat mailto:bulatz@HotPOP.com

From: Daniel Fischer
To: "Branimir Maksimovic" CC: Bulat Ziganshin , Haskell-Cafe@haskell.org KMP is O(m) while straightforward is O(m*n).
Where m is the length of the input and n is the length of the searched-for pattern, I think? Yes. But these are worst-case complexities, I believe, ordinarily, straightforward will be O(m), too.
Yes, those are worst cases for both algorithms. O(m) for KMP, O(m*n) for straightforward.
My test favors straightforward, in any other case KMP wins by order of magnitude. Can you give example tests?
Any example that has long search pattern say (many a's followed by b ) and searched string has many partial matches (many a's). Particularly, any example which exhibits O(m*n) or close to, case for straightforward search. Greetings, Bane. _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/

Am Mittwoch, 21. Dezember 2005 15:20 schrieb Daniel Fischer:
i'm 90% sure that straightforward method must be faster for one-time searches.
I'm far less sure of that. If you have a really short search-pattern, I think that probably straightforward search is indeed faster (at least, if the search-pattern isn't supplied at compile-time). But if you have a pattern of length 100, say, and lots of relatively long partial matches, chances are that KMP will move on something like 50 chars at once, which would have to be re-checked by straightforward. I've no idea, when that would outweigh the extra time of building the KMP-arrays, though. In my test -- extremely unrealistic, but provides a more or less worst case example -- straightforward must make roughly half a million character comparisons to pass each 'c', while KMP does 2000 (+/-2) (one way through the array and back), so there are situations where KMP definitely is faster. But ordinarily, on my computer, your version of straightforward is 10-15% faster than KMP (search-patterns are now supplied on the command line -- which has no big impact; searched string is " able sea..."; all compiled with -O2; NOINLINE makes no difference -- at least with optimisations on --; without optimisations, KMP suffers far worse than straightforward).
I've now tested with main = do args <- getArgs n <- case args of (ar:_) -> readIO ar `catch` (\_ -> return 400) _ -> return 500 let src = replicate n 'r' dst = " # " str = replicate (n-1) 'r' ++ 'c': replicate n 'r' k = if n < 200 then ceiling (5e6 / fromIntegral n) else ceiling (1e7 / fromIntegral n) out = searchReplace src dst $ concat $ replicate k str putStr "Working with " print n print $ length out putStrLn "Done" and KMP wins for n == 1 and n >= 12, also for " able seasea...", KMP wins for search-patterns of length 1 and patterns with no partial matches (and some others), but generally -- on my thing -- Bulat's version wins. Cheers, Daniel

Hello Branimir, Wednesday, December 21, 2005, 10:18:43 AM, you wrote:
try to add
{-# NOINLINE replace #-}
to both programs and repeat comparision
BM> These are tests: BM> No optimisations (no -O): NOINLINE just prevents RunTimeCompilation (see wiki page for details), so this way you will test speed of "replace" on previously unknown string. disabling optimization says nothing about real speed of optimized program, which searches for the many different strings -- Best regards, Bulat mailto:bulatz@HotPOP.com

From: Bulat Ziganshin
Reply-To: Bulat Ziganshin To: "Branimir Maksimovic" CC: haskell-cafe@haskell.org Subject: Re[2]: [Haskell-cafe] Substring replacements Date: Fri, 23 Dec 2005 11:32:01 +0300 Hello Branimir,
Wednesday, December 21, 2005, 10:18:43 AM, you wrote:
try to add
{-# NOINLINE replace #-}
to both programs and repeat comparision
BM> These are tests: BM> No optimisations (no -O):
NOINLINE just prevents RunTimeCompilation (see wiki page for details), so this way you will test speed of "replace" on previously unknown string. disabling optimization says nothing about real speed of optimized program, which searches for the many different strings
I got it. These tests were with NOINLINE in both cases but I didn;t saw any speed difference in results as actually replace (straight) and searchReplace (KMP) is just called for two differnet strings. Perhaps if I call that for long list of short patterns patterns on short string, test would display different results (INLINE wouldn't help). I'll try that next. Greetings, Bane. _________________________________________________________________ Don't just search. Find. Check out the new MSN Search! http://search.msn.click-url.com/go/onm00200636ave/direct/01/
participants (3)
-
Branimir Maksimovic
-
Bulat Ziganshin
-
Daniel Fischer