diff implementation in haskell

Are there pure haskell implementations of diff and diff3 algorithms. Like:
data Diff a = Deleted [a] | Common [a] | Added [a]
diff :: Eq a => [a] -> [a] -> [Diff a]
diff3 :: Eq a => [a] -> [a] -> [a] -> [a]
or smth more general like:
class Diff a where type Difference a diff :: a -> a -> Difference a patch :: a -> Difference a -> a
prop_diffpatch a b = patch a (diff a b) == b
? If not, where can I start towards the implementation? What's the way to do it more functionally? -- Victor Nazarov

asviraspossible:
Are there pure haskell implementations of diff and diff3 algorithms. Like:
data Diff a = Deleted [a] | Common [a] | Added [a]
diff :: Eq a => [a] -> [a] -> [Diff a]
diff3 :: Eq a => [a] -> [a] -> [a] -> [a]
There's at least one on Hackage: http://hackage.haskell.org/package/Diff -- Donn

Don Stewart
Are there pure haskell implementations of diff and diff3 algorithms?
Wherein we can read: | This is an implementation of the O(ND) diff algorithm [...]. It is O(mn) | in space. At first I thought 'N' and 'M' would be the lengths of the lists, and 'D' is the number of differences, but then the space bound doesn't make sense. I tried to find the reference, but Citeseer is down (again. sigh). Anybody know what the bounds really are? -k -- If I haven't seen further, it is by standing in the footprints of giants

ketil:
Don Stewart
writes: Are there pure haskell implementations of diff and diff3 algorithms?
Wherein we can read:
| This is an implementation of the O(ND) diff algorithm [...]. It is O(mn) | in space.
At first I thought 'N' and 'M' would be the lengths of the lists, and 'D' is the number of differences, but then the space bound doesn't make sense. I tried to find the reference, but Citeseer is down (again. sigh).
Anybody know what the bounds really are?
This looks like the paper, http://www.xmailserver.org/diff2.pdf Page 2, "The algorithm can be refined to use linear space", N and M appear to be the length of the sequences, D is the size of the minimum edit script. -- Don

Don Stewart wrote:
This looks like the paper, http://www.xmailserver.org/diff2.pdf
Page 2, "The algorithm can be refined to use linear space", N and M appear to be the length of the sequences, D is the size of the minimum edit script.
T'would be lovely to have that in the docs for the package :-). Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Ketil Malde
Don Stewart
writes: Wherein we can read:
| This is an implementation of the O(ND) diff algorithm [...]. It is O(mn) | in space.
At first I thought 'N' and 'M' would be the lengths of the lists, and 'D' is the number of differences, but then the space bound doesn't make sense. I tried to find the reference, but Citeseer is down (again. sigh).
Anybody know what the bounds really are?
I think the space bounds are O(M+N). Section "4b. A Linear Space Refinement" concludes with "Unfortunately, the input sequences A and B must be kept in memory, implying that a total of O(M+N) space is needed." BTW, there is a minor improvement to this algorithm (claims to be faster, same space usage): http://research.janelia.org/myers/Papers/np_diff.pdf found via: http://scholar.google.com/scholar?q=%22An+O(NP)+Sequence+Comparison+Algo rithm.%22 I thought this paper was easier to understand. Alistair ***************************************************************** Confidentiality Note: The information contained in this message, and any attachments, may contain confidential and/or privileged material. It is intended solely for the person(s) or entity to which it is addressed. Any review, retransmission, dissemination, or taking of any action in reliance upon this information by persons or entities other than the intended recipient(s) is prohibited. If you received this in error, please contact the sender and delete the material from any computer. *****************************************************************

Also, there is a paper about doing a type-safe diff in Agda, http://portal.acm.org/citation.cfm?id=1596614.1596624 I heard rumors that the library will be ported to Haskell. -chris On 8 dec 2009, at 15:20, Bayley, Alistair wrote:
From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Ketil Malde
Don Stewart
writes: Wherein we can read:
| This is an implementation of the O(ND) diff algorithm [...]. It is O(mn) | in space.
At first I thought 'N' and 'M' would be the lengths of the lists, and 'D' is the number of differences, but then the space bound doesn't make sense. I tried to find the reference, but Citeseer is down (again. sigh).
Anybody know what the bounds really are?
I think the space bounds are O(M+N). Section "4b. A Linear Space Refinement" concludes with "Unfortunately, the input sequences A and B must be kept in memory, implying that a total of O(M+N) space is needed."
BTW, there is a minor improvement to this algorithm (claims to be faster, same space usage): http://research.janelia.org/myers/Papers/np_diff.pdf
found via:
http://scholar.google.com/scholar?q=%22An+O(NP)+Sequence+Comparison+Algo rithm.%22
I thought this paper was easier to understand.
Alistair ***************************************************************** Confidentiality Note: The information contained in this message, and any attachments, may contain confidential and/or privileged material. It is intended solely for the person(s) or entity to which it is addressed. Any review, retransmission, dissemination, or taking of any action in reliance upon this information by persons or entities other than the intended recipient(s) is prohibited. If you received this in error, please contact the sender and delete the material from any computer. *****************************************************************
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Wed, Dec 9, 2009 at 13:41, Chris Eidhof wrote:
Also, there is a paper about doing a type-safe diff in Agda, http://portal.acm.org/citation.cfm?id=1596614.1596624
Surprisingly, the paper also discusses a comparable implementation in Haskell. I heard rumors that the library will be ported to Haskell.
Yes, I heard that this library will be released at some point. Wonder how that's going... Regards, Sean

Chris Eidhof wrote:
Also, there is a paper about doing a type-safe diff in Agda, http://portal.acm.org/citation.cfm?id=1596614.1596624
That is locke dbehind some ridiculous paywall. It seems the same paper is available here: http://people.cs.uu.nl/andres/GDiff.html Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/
participants (7)
-
Bayley, Alistair
-
Chris Eidhof
-
Don Stewart
-
Erik de Castro Lopo
-
Ketil Malde
-
Sean Leather
-
Victor Nazarov