
From: Daniel Fischer
To: "Branimir Maksimovic" CC: Haskell-Cafe@haskell.org Subject: Re: [Haskell-cafe] Substring replacements Date: Tue, 13 Dec 2005 11:23:29 +0100
After seeing that your program is fastest (I've also tried one from http://haskell.org/hawiki/RunTimeCompilation but perhaps I'm not that good in converting to search replace?) I've decided to try with Rabin-Karp algorithm. This algorithm performs same operation as straightforward search, but compares hashes instead of chars. With ability to rotate hash (remove first, add next) characters there is also optimisation, that hash is calculated only for single next character rather again for whole substring. Unfortunatelly on my machine it is very cheap to compare characters so with my test hashing overweights character compare, except in your test when hash searching is faster then straightforward search. This is best I can write in terms of performance and readability. I've tried with getFst that returns Maybe but it was slower so I decided to return '\0' in case that argument is empty list, which renders '\0' unusable, but then I really doubt that 0 will be used in strings. -- Rabin-Karp string search algorithm, it is very effective in searching of set -- of patterns of length n on same string -- this program is for single pattern search, but can be crafted -- for multiple patterns of length m hSearchReplace :: String -> String -> String -> String hSearchReplace sr rp xs | not (null remaining) = found ++ rp ++ hSearchReplace sr rp (drop (length sr) remaining) | otherwise = found where (found,remaining) = hSearch sr xs hSearch :: String -> String -> (String,String) hSearch sr xs = hSearch' sr xs hcmp "" where hsrch = hash sr hcmp = hash $ take ls xs cmp = take ls xs ls = length sr hSearch' [] xs _ _= (xs,[]) hSearch' sr [] _ fndFail = (reverse fndFail,[]) hSearch' srch xxs@(x:xs) hcmps fndFail = if hsrch == hcmps then if isPrefixOf srch xxs then (reverse fndFail,xxs) else searchAgain else searchAgain where searchAgain = hSearch' srch xs (hashRotate (getFst xxs) (getFst nextxxs) (ls-1) hcmps) (x:fndFail) nextxxs = drop ls xxs getFst :: String -> Char getFst [] = '\0' getFst (a:as) = a hash :: String -> Int hash str = hash' str (length str - 1) where hash' :: String -> Int -> Int hash' [] _ = 0 hash' (s:str) pow = (101 ^ pow) *(fromEnum s) + hash' str (pow-1) hashRotate :: Char -> Char -> Int -> Int -> Int hashRotate cout cin pow hsh = (hsh - ((101 ^ pow) * (fromEnum cout)))*101 + (fromEnum cin) Greetings, Bane. _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/