The haystack and the shared copy are only a needle's-length apart. The first stage traverses H and N until one of them runs out, holding a copy of the beginning  of each. This requires at most O(min{|H|, |N|}) additional memory (the original needle and the needle-length prefix of the haystack). Assuming we didn't run out of haystack before we ran out of needle (which assumption seems generally likely), we proceed to the next step, traversing H and (drop |N| H) together. During this phase we hold O(|N|) additional memory: the needle itself and the needle-length portion of the longer of the two haystack remnants we hold. Note that in the second phase, we *don't* hold on to the beginning of the haystack, because we will never need it again! Then in the final phase, when the shorter haystack remnant has run out, we still have the same O(|N|) memory, which is now the needle itself and the needle-length suffix of the haystack, and (==) traverses them both together. Thus the total additional memory residency for this algorithm is O(min {|H|, |N|}). Using the length approach requires that you hold the beginning of the haystack for traversal while running length. To put it another way, you force the entire spine of the haystack to run length, and then start from the beginning. If the haystack is produced lazily, this potentially requires O(|H|) extra memory. Since you also check the length of the needle, your total additional memory comes to O(max {|H|, |N|}). I apologize if my horrid variable names obscured what goes on in the algorithm I described.

On Sun, Oct 12, 2014 at 10:14 AM, Andreas Abel <abela@chalmers.se> wrote:
Well, you also traverse the haystack twice, in your getEndChunk function you simultaneously traverse the haystack and a (shared) copy of it.  Why is this so much better?

I guess without data (timings, heap-profiles) we do not get further here.

On 11.10.2014 14:47, David Feuer wrote:
It can be, but your way traverses the entire haystack *twice*. The
memory needs are equivalent to the reversing version, which I consider
unacceptable.

I do not understand this comment.  reverse needs O(n) memory, length O(1).

Cheers,
Andreas

On Sat, Oct 11, 2014 at 5:57 AM, Andreas Abel <abela@chalmers.se
<mailto:abela@chalmers.se>> wrote:

    David, the essence of your idea seems mutually drop the correct
    number of elements from needle and hay and then compare for
    equality.  Here is a concise implementation of your idea in terms of
    drop:

    isSuffixOf        :: forall a . (Eq a) => [a] -> [a] -> Bool
    [] `isSuffixOf` _  =  True
    xs `isSuffixOf` ys =  xs == drop (length (drop (length xs) ys)) ys

    This can be further simplified to

    isSuffixOf        :: forall a . (Eq a) => [a] -> [a] -> Bool
    [] `isSuffixOf` _  =  True
    xs `isSuffixOf` ys =  xs == drop (length ys - length xs) ys

    which is a very easy to understand program, I think, without need to
    reverse lists.



_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries



--
Andreas Abel  <><      Du bist der geliebte Mensch.

Department of Computer Science and Engineering
Chalmers and Gothenburg University, Sweden

andreas.abel@gu.se
http://www2.tcs.ifi.lmu.de/~abel/