
Hi, Simple question: I need a function that matches string in another string. Something like: find (isSuffixOf "needle") (inits "haystack") This one is beautiful, but not very practical. Could anybody point me to some haskell library that does some searching, using KMP for example? -- Gracjan

Gracjan Polak
find (isSuffixOf "needle") (inits "haystack")
Hmm... While the result isn't exactly the same, I suspect using isPrefixOf and tails would be more efficient.
This one is beautiful, but not very practical.
Unless you have very repetitive data and/or tiny alphabet, it is actually quite efficient, as the expected length of prefixes that need to be checked before a mismatch can be determined is small. At least, I was unable to beat it with my (feeble attempts at) BM or KMP implementations. -kzm -- If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde wrote:
Gracjan Polak
writes: find (isSuffixOf "needle") (inits "haystack")
Hmm...
While the result isn't exactly the same, I suspect using isPrefixOf and tails would be more efficient.
I need the data before and including my needle. Like this: ( ... needle ) ignored Or at least count of the first part. Or, best, pair of: (beforeandincluding,after). String is rather long (potentially infinite), so using reverse and tail could be a problem :)
This one is beautiful, but not very practical.
Unless you have very repetitive data and/or tiny alphabet, it is actually quite efficient, as the expected length of prefixes that need to be checked before a mismatch can be determined is small.
At least, I was unable to beat it with my (feeble attempts at) BM or KMP implementations.
-kzm

On 2005 May 16 Monday 08:00, Gracjan Polak wrote:
Ketil Malde wrote:
While the result isn't exactly the same, I suspect using isPrefixOf and tails would be more efficient.
I need the data before and including my needle.
When the haystack gets large, the beautiful find (isSuffixOf "needle") (inits "haystack") is quite inefficient where it uses isSuffixOf searching longer and longer strings. You can get efficiency, the desired data, and deal with infinite strings by using a function that is like 'inits' but which returns the initial strings in reversed order. reversed_inits = scanl (flip (:)) "" find (isPrefixOf (reverse "needle")) (reversed_inits "haystack")

You can get efficiency, the desired data, and deal with infinite strings by using a function that is like 'inits' but which returns the initial strings in reversed order.
reversed_inits = scanl (flip (:)) "" find (isPrefixOf (reverse "needle")) (reversed_inits "haystack")
If I may ask for curiosity's sake, how much string data are we talking about here? Is it practical to process a serious volume of data as [Char]? Donn Cave, donn@drizzle.com

On 2005 May 17 Tuesday 11:44, Donn Cave wrote:
You can get efficiency, the desired data, and deal with infinite strings. reversed_inits = scanl (flip (:)) "" find (isPrefixOf (reverse "needle")) (reversed_inits "haystack")
With "get efficiency", I was comparing this program which is linear time and constant space in the amount of the haystack searched, to an earlier suggestion which was quadratic time and linear space.
Is it practical to process a serious volume of data as [Char]?
As for your question, GHC _can_ handle a serious volume of [Char]. I don't know how competitive the efficiency is.

On 2005 May 16 Monday 08:00, Gracjan Polak wrote:
Ketil Malde wrote:
While the result isn't exactly the same, I suspect using isPrefixOf and tails would be more efficient.
I need the data before and including my needle.
When the haystack gets large, the beautiful find (isSuffixOf "needle") (inits "haystack") is quite inefficient where it uses isSuffixOf searching longer and longer strings. You can get efficiency, the desired data, and deal with infinite strings by using a function that is like 'inits' but which returns the initial strings reversed. reversed_inits = scanl (flip (:)) "" find (isPrefixOf (reverse "needle")) (reversed_inits "haystack")

Hello Gracjan, Monday, May 16, 2005, 4:00:33 PM, you wrote: GP> > Unless you have very repetitive data and/or tiny alphabet, it is GP> > actually quite efficient, as the expected length of prefixes that need GP> > to be checked before a mismatch can be determined is small. GP> > GP> > At least, I was unable to beat it with my (feeble attempts at) BM or GP> > KMP implementations. if you really need KMP, you can find it at http://haskell.org/hawiki/RunTimeCompilation
find (isSuffixOf "needle") (inits "haystack")
find (isPrefixOf "needle") (tails "haystack") if you need an index - add it with zip: find (isPrefixOf "needle".snd) (zip [0..] (tails "haystack")) -- Best regards, Bulat mailto:bulatz@HotPOP.com

At 10:27 16/05/05 +0200, Gracjan Polak wrote:
Hi,
Simple question: I need a function that matches string in another string. Something like:
find (isSuffixOf "needle") (inits "haystack")
This one is beautiful, but not very practical. Could anybody point me to some haskell library that does some searching, using KMP for example?
[[ import List foo = isPrefixOf (reverse "needle") (reverse "haystack with needle") bar = isPrefixOf (reverse "needle") (reverse "haystack with pins") ]] Seems to work. And (by inspection) is linear in size of haystack. #g ------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

G'day all.
Quoting Gracjan Polak
Simple question: I need a function that matches string in another string. Something like:
find (isSuffixOf "needle") (inits "haystack")
This one is beautiful, but not very practical.
This one is fairly practical, but not very beautiful: http://haskell.org/hawiki/RunTimeCompilation Cheers, Andrew Bromage
participants (8)
-
ajb@spamcop.net
-
Bulat Ziganshin
-
Donn Cave
-
Gracjan Polak
-
Graham Klyne
-
Ketil Malde
-
Scott & Kathy
-
Scott Turner