
I am still working on improving your code. And I have a "Is this a bug?" question: The "lookup = computeLookup pat" defines lookup to take an Int which represents the index into pat, where this index is 0 based and the 0th item is the head of pat. Now look at "matchLength :: Int; matchLength = let ... in matchLength' (B.drop (fromIntegral patShift) pat) str 0" That starts the "pat" from a shorter version that starts at 'patShift' and sets the last argument to matchLength' to 0. matchLength' is in the ... as: let matchLength' pat str cnt | B.null pat = cnt matchLength' pat str cnt | B.null str = 0 matchLength' pat str cnt | (B.head pat == B.head str) = matchLength' (B.drop 1 pat) (B.drop 1 str) (cnt + 1) | otherwise = cnt I see that cnt is incremented until * all of pat is matched, return cnt * all of str is used, return 0 * a mismatch is found, return cnt So a mismatch means that the character at index (patShift+cnt) in the original 'pat' had the mismatch. Now I see this that uses the 'cnt' returned from matchLength: shiftAmt = {-# SCC "kmpMatch_shiftAmt" #-} lookup matchLength And this is the bug. lookup should be given (shiftAmt + matchLength). This was not clear or obvious since * Nearly everything is Int or Int64 * The name "pat" in matchLength' shadows "pat" in kmpMatch This was not caught by quickcheck since it is an internal value and only rarely will trigger an error where a legitimate match is missed. If there is no match then this bug does not cause an incorrect answer (I think). I will fix this and continue hacking on it. -- Chris