
The haystack and the shared copy are only a needle's-length apart. The
first stage traverses H and N until one of them runs out, holding a copy of
the beginning of each. This requires at most O(min{|H|, |N|}) additional
memory (the original needle and the needle-length prefix of the haystack).
Assuming we didn't run out of haystack before we ran out of needle (which
assumption seems generally likely), we proceed to the next step, traversing
H and (drop |N| H) together. During this phase we hold O(|N|) additional
memory: the needle itself and the needle-length portion of the longer of
the two haystack remnants we hold. Note that in the second phase, we
*don't* hold on to the beginning of the haystack, because we will never
need it again! Then in the final phase, when the shorter haystack remnant
has run out, we still have the same O(|N|) memory, which is now the needle
itself and the needle-length suffix of the haystack, and (==) traverses
them both together. Thus the total additional memory residency for this
algorithm is O(min {|H|, |N|}). Using the length approach requires that you
hold the beginning of the haystack for traversal while running length. To
put it another way, you force the entire spine of the haystack to run
length, and then start from the beginning. If the haystack is produced
lazily, this potentially requires O(|H|) extra memory. Since you also check
the length of the needle, your total additional memory comes to O(max {|H|,
|N|}). I apologize if my horrid variable names obscured what goes on in the
algorithm I described.
On Sun, Oct 12, 2014 at 10:14 AM, Andreas Abel
Well, you also traverse the haystack twice, in your getEndChunk function you simultaneously traverse the haystack and a (shared) copy of it. Why is this so much better?
I guess without data (timings, heap-profiles) we do not get further here.
On 11.10.2014 14:47, David Feuer wrote:
It can be, but your way traverses the entire haystack *twice*. The memory needs are equivalent to the reversing version, which I consider unacceptable.
I do not understand this comment. reverse needs O(n) memory, length O(1).
Cheers, Andreas
On Sat, Oct 11, 2014 at 5:57 AM, Andreas Abel
mailto:abela@chalmers.se> wrote:
David, the essence of your idea seems mutually drop the correct number of elements from needle and hay and then compare for equality. Here is a concise implementation of your idea in terms of drop:
isSuffixOf :: forall a . (Eq a) => [a] -> [a] -> Bool [] `isSuffixOf` _ = True xs `isSuffixOf` ys = xs == drop (length (drop (length xs) ys)) ys
This can be further simplified to
isSuffixOf :: forall a . (Eq a) => [a] -> [a] -> Bool [] `isSuffixOf` _ = True xs `isSuffixOf` ys = xs == drop (length ys - length xs) ys
which is a very easy to understand program, I think, without need to reverse lists.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Andreas Abel <>< Du bist der geliebte Mensch.
Department of Computer Science and Engineering Chalmers and Gothenburg University, Sweden
andreas.abel@gu.se http://www2.tcs.ifi.lmu.de/~abel/