
On Sun, 2010-04-11 at 18:43 +0200, Maciej Piechotka wrote:
* Build reporting in the hackage server

The idea here is that cabal sends back anonymous reports to the server to say if a package compiled or not, and against what versions of dependencies. This would make it clearer to maintainers if their dependency versions are correct. Additionally I think it would be useful for hackage to provide tweaked .cabal files for packages with updated constraints, even without new version uploads. We are proposing a GSoC project which would cover some of this.
Hmm. I guess there are 2 problems:
- Privacy problem. I don't want the software to call home with data without asking.
Obviously it is important that the data be anonymous and that we do not send stuff without the user's knowledge.

While there is not any directly identifying information in the existing anonymous build reports, one has to be very careful with how much access the server provides to the reports or it may become possible to infer identifying information. For example you might happen to know that I'm one of the few users on Sparc/Linux.

Another possible attack is to try to discover timing of uploaded reports by the attacker inserting their own reports at regular intervals. If the server provided direct access to reports, in order of upload time, then this would give away timing of uploads. Similarly, if all reports were kept together then one could identify bunches of packages uploaded at the same time and infer they are from the same user.

So yes, it needs careful thought. Perhaps we cannot give clients access to the raw reports at all. Perhaps it's enough to separate reports by package and to delay publishing new ones for a day or two, and to randomise the order when they are made available.
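
To make that last suggestion concrete, here is a rough sketch of the publishing step in Python. It is only an illustration under assumptions of mine, not how the hackage server works: the report shape (package, received_at, payload), the function name publishable_reports and the two-day default are all hypothetical.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta


def publishable_reports(reports, now, delay=timedelta(days=2), rng=None):
    """Return {package: [payload, ...]} for reports old enough to publish.

    `reports` is an iterable of (package, received_at, payload) tuples.
    This tuple layout is an assumption made for the sketch.
    """
    rng = rng or random.SystemRandom()
    by_package = defaultdict(list)
    for package, received_at, payload in reports:
        # Hold back anything received within the delay window.
        if now - received_at >= delay:
            # Keep only the payload, so the upload time is never exposed.
            by_package[package].append(payload)
    # Shuffle within each package so the published order says nothing
    # about the order in which the reports arrived.
    for payloads in by_package.values():
        rng.shuffle(payloads)
    return dict(by_package)
```

Grouping by package means a batch of reports sent by one user in one go is never visible as a batch, and the delay plus shuffle is meant to blunt the trick of inserting marker reports at regular intervals. Whether this is actually sufficient is exactly the part that needs the careful thought.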
- Spoofing problem. Someone can easily feed in wrong data with no chance of being discovered (for example, to prove how great a 'haxor' he is).
I don't think that's a major problem. One has to treat the data with that possibility in mind. There's also the option for authenticated non-anonymous reports with build logs.

Duncan