[Haskell-cafe] Re: cryptographic hash functions in darcs (re: announcing darcs 2.0.0pre3)

24 Jan 2008

Following up on my own post:

On Jan 24, 2008, at 1:15 PM, zooko wrote:
> ...
> So, let me see if I understand the issues here.
> ...
> 2.  It would be nice, but isn't currently used, if one could rely
> on the property that for a given patch id, nobody can come up with
> another patch that has the same id.  This would be nice because in
> the future we might use this to securely identify patches and
> entire repositories.  This property is called "second pre-image
> resistance".
> 3.  Likewise, it would be nice if it were impossible for someone to
> come up with two different patches that had the same patch id.
> This is called "collision resistance".

Actually, these properties *are* currently relied upon, whenever a
darcs process uses a patch id without seeing the patch from which it
was generated.

For example, if you have a patch which you received in e-mail, and  
the patch contains the ids of patches on which it depends but not the  
complete patches, and your darcs process matches up those ids with  
patches that it has locally, then you are relying on these two  
properties.  (The same thing happens when pushing and pulling patches  
over HTTP or SSH.)
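
To make the dependency-matching step concrete, here is a minimal sketch in Python. It is not how darcs actually computes or stores patch ids; the names patch_id, resolve_dependencies and local_store are hypothetical, and the id is assumed to be a plain SHA-1 of the patch bytes. The point is only that the receiver looks patches up by id and never re-examines the contents behind that id.

```python
import hashlib

def patch_id(patch_bytes: bytes) -> str:
    """Hypothetical id function: hex SHA-1 of the patch contents."""
    return hashlib.sha1(patch_bytes).hexdigest()

def resolve_dependencies(dep_ids, local_store):
    """Return whatever local patch is filed under each dependency id.

    Nothing here checks that the local patch is the one the sender had
    in mind; the id is trusted to identify exactly one patch.
    """
    return [local_store[dep] for dep in dep_ids]

# A local repository, modelled as a dict from id to patch bytes.
base = b"add file README\n"
local_store = {patch_id(base): base}

# An incoming patch names its dependency by id only.
incoming = {"depends_on": [patch_id(base)], "body": b"edit README\n"}
resolved = resolve_dependencies(incoming["depends_on"], local_store)
```

If two different patches could share an id, resolve_dependencies would have no way to tell which one the sender depended on.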

If your patch-id-generation function doesn't have collision  
resistance, then you are vulnerable to the possibility that the  
original creator of the patch id (not necessarily the person who sent  
you this patch in e-mail) also created another patch which matched  
the same id.  Having done so, this person could subvert your  
repository, i.e. insert backdoors into the source code managed by  
your repository.

If your patch-id-generation function doesn't have second-pre-image  
resistance, then you are vulnerable to the possibility that someone  
else (not necessarily the original creator of the patch id) also  
created another patch which matched the same id.  Having done so,  
this person could subvert your repository.
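
The difference between the two attacks can be sketched with a deliberately weak toy id. The code below assumes a hypothetical toy_id that keeps only the top 16 bits of SHA-256, so that both searches finish quickly; nothing here says anything about the strength of a real hash function. In find_collision the attacker chooses both patches. In find_second_preimage the first patch is fixed by someone else and the attacker must match its id.

```python
import hashlib
from itertools import count

def toy_id(data: bytes, bits: int = 16) -> int:
    """Deliberately weak id: the top `bits` bits of SHA-256 (toy only)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int = 16):
    """Attacker picks BOTH patches: search until any two ids coincide."""
    seen = {}
    for n in count():
        msg = b"patch-%d" % n
        h = toy_id(msg, bits)
        if h in seen:
            return seen[h], msg, n + 1
        seen[h] = msg

def find_second_preimage(target: bytes, bits: int = 16):
    """The target patch is fixed: search for a different patch with its id."""
    want = toy_id(target, bits)
    for n in count():
        msg = b"forged-%d" % n
        if msg != target and toy_id(msg, bits) == want:
            return msg, n + 1

a, b, collision_tries = find_collision()
forged, preimage_tries = find_second_preimage(b"original patch")
```

With a 16-bit id the collision search typically succeeds after a few hundred attempts, while the second-pre-image search typically needs tens of thousands. That gap is why collision resistance tends to fail first.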

This makes the choice of SHA-1 for the patch-id-generation function  
wholly inappropriate.  We already know that SHA-1 doesn't have  
collision resistance, and there is reason to suspect that in the near  
future it will turn out that it doesn't have second-pre-image  
resistance either.

The basic choice is: (a) use an insecure function and simply state
that anyone with whom you (transitively) exchange patches has the
opportunity to subvert your repository, or (b) use a secure hash
function, e.g. SHA-256 or Tiger.
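
As a rough illustration of option (b), the sketch below shows an id function whose digest is a parameter, using Python's standard hashlib. The function name patch_id and the idea of hashing the raw patch bytes are assumptions for illustration, not darcs's actual scheme. Tiger is left out because hashlib is not guaranteed to provide it.

```python
import hashlib

def patch_id(patch_bytes: bytes, algorithm: str = "sha256") -> str:
    """Hypothetical id function: hex digest of the patch under `algorithm`."""
    return hashlib.new(algorithm, patch_bytes).hexdigest()

patch = b"edit README\n"
old_id = patch_id(patch, "sha1")    # 160-bit digest, 40 hex characters
new_id = patch_id(patch, "sha256")  # 256-bit digest, 64 hex characters
```

Ids produced by the two functions are not interchangeable, so a repository that switches would have to say which function produced each id.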

Regards,

Zooko