
I'm cc'ing haskell-cafe on this darcs bug, since I imagine there might be a student (or professor) hanging around there who might be interested in the implementation of an efficient LCS algorithm in haskell. [markjugg - Sun Feb 20 09:15:57 2005]:
Typically in software, 1000's of line aren't changed at the time, which is why this hasn't been an big issue before.
Indeed. The issue here is that darcs uses the exact Hunt-Szymanski LCS algorithm when it computes the diff, while GNU diff uses an approximate --but faster--algorithm, which works "almost" as well. The approximate algorithm is reasonably well-documented, but more complex to implement than the exact one, just because it's harder to understand. This would be a reasonable project for an interested haskell CS student. The LCS is implemented in pure haskell 98, and is of signature lcs :: Ord a => [a] -> [a] -> [a] It's an interesting problem that's easily testable, and has different scaling behavior in different limits (finite alphabet, infinite alphabet, permutations of unique objects). David (ever optimistic for new contributors to darcs!) http://darcs.net