
With Hackage down, now seemed like a good time to push this issue again. It's such an important site to us that it's really rather a shame there are no mirrors of it. I have a personal-and-business server in a data center in Newark, with a fair chunk of bandwidth, which I'd like to offer for a permanent mirror. Is there interest in this? Who do I need to talk to for it to happen?
Strategy-wise, I think the best approach is round-robin DNS, since that's transparent to the end user - everything would still appear at the URL it's at now, but behind-the-scenes magic would let things keep working when one or the other site is down. I haven't personally set up such a system before but I'm willing to take on the burden of figuring it out.
So I have a better idea of what I'm signing up for, can anyone tell me how much disk space and how much bandwidth per month Hackage uses? I have a fair chunk of both, as I say, but I'd like to know in advance to ensure that things go smoothly.
As for what I'd want in return for this, really nothing. I wouldn't say no to an unobtrusive mention somewhere on the site, but I'd be happy just knowing I'd given something back to the Haskell community, which has given a lot to me.
-- Dan Knapp "An infallible method of conciliating a tiger is to allow oneself to be devoured." (Konrad Adenauer)
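A note on the round-robin DNS idea above: round-robin DNS only hands out several addresses for one name and does not check whether each host is up, so failover depends on the client trying the next address when one fails. The following is a minimal Python sketch of that client-side behaviour, not a description of how cabal-install or Hackage actually work. The function name `fetch_with_failover`, the `fetch` callable, and the example addresses are all hypothetical.

```python
from typing import Callable, Iterable, Optional


def fetch_with_failover(
    addresses: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try each address in turn and return the first successful response.

    `addresses` stands in for the A records a round-robin DNS name would
    return; `fetch` is a hypothetical callable that downloads from one
    address and raises OSError when that host cannot be reached.
    """
    for address in addresses:
        try:
            return fetch(address)
        except OSError:
            # This host is down or unreachable; move on to the next one.
            continue
    return None  # every listed host failed


if __name__ == "__main__":
    def fake_fetch(address: str) -> bytes:
        # Simulate the primary being down while the mirror answers.
        if address == "192.0.2.1":
            raise OSError("connection refused")
        return b"package index from " + address.encode()

    print(fetch_with_failover(["192.0.2.1", "192.0.2.2"], fake_fetch))
```

The point of the sketch is that the "behind-the-scenes magic" lives partly in the client: if the download tool gives up after the first address, a second A record alone will not keep things working.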

I am no decision maker regarding Hackage, but I would like to echo my support for this offer. Hackage is a vital part of my workflow, and I'm sure I'm not the only one. Its importance to the Haskell community has grown quickly and is continuing to do so. Each time it goes down, the impact is larger than before. We should have a mirror in place for situations like these. - Jake

This is a very generous offer. However, I must say I like the following idea more:
http://www.reddit.com/r/haskell/comments/efw38/reminder_hackagehaskellorg_ou...
-- Ozgur Akgun

Ozgur Akgun
This is a very generous offer. However, I must say I like the following idea more:
http://www.reddit.com/r/haskell/comments/efw38/reminder_hackagehaskellorg_outage_tomorrow_due_to/c17u7nk
I'd support this, but I'm strongly in favor of the use of WePay.com over PayPal for collecting funds. The former's shady history is a matter of public record. Alternatively, I'd also be willing to throw some of my own bandwidth and disk space towards a mirror. -=rsw

On 12/4/10 2:21 PM, Riad S. Wahby wrote:
Ozgur Akgun
wrote: This is a very generous offer. However, I must say I like the following idea more:
http://www.reddit.com/r/haskell/comments/efw38/reminder_hackagehaskellorg_outage_tomorrow_due_to/c17u7nk
That sounds like a great idea. The 501(c)(3) I mean. The distributed hosting is nice too, though I'd like to see the 501(c)(3) formed before donating, just so we can get official reports and all that. But once that's up, I'm definitely willing to contribute. FWIW, I've been on the board of directors for a 501(c)(3), helped write their bylaws, and know a few people in the business (lawyers, etc). I'm willing to offer advice, effort, and references whenever the committee decides to do this.
I'd support this, but I'm strongly in favor of the use of WePay.com over PayPal for collecting funds. The former's shady history is a matter of public record.
Semantic Parse Fail: did you mean "the latter" or "strongly opposed to"?
Alternatively, I'd also be willing to throw some of my own bandwidth and disk space towards a mirror.
For a scalable solution I think we should be aiming for both (1) a distributed "central" server, and (2) mirrors around the globe. The former gives reliability to the main site, but the latter help for locality and loadbalancing as well as failover. -- Live well, ~wren

Why is there even any consideration of some committee if someone wants to mirror the Hackage site? Why not mirror the site?

On 5 December 2010 18:41, Florian Lengyel
Why is there even any consideration of some committee if someone wants to mirror the Hackage site? Why not mirror the site?
Presumably to make it an official mirror, and possibly due to the licenses of some content on there. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Florian Lengyel
Why is there even any consideration of some committee if someone wants to mirror the Hackage site? Why not mirror the site?
+1
"Alright, Mr. Wiseguy," she said, "if you're so clever, you tell us what colour it should be."
We can either let Dan set up a mirror, and add it to the haskell.org DNS (or just let it live at a different address), and have a mirror up in a couple of hours -- or we can set up 501(c)(3)s (whatever they are), decide on a payment service, write bylaws, hire lawyers etc. All in the name of "not-dealing-with-this-shit"¹?
In my experience, everything is a lot simpler if you can avoid dealing directly with money. The current problem is that hackage has sometimes been unstable; having a mirror would fix or at least alleviate that.
-k
¹ As argued in the cited Reddit thread. Okay, in all fairness, the "shit" being referred to is the current non-redundant DNS configuration, not people volunteering to solve technical issues. :-)
-- If I haven't seen further, it is by standing in the footprints of giants

On 12/5/10 11:23 AM, Ketil Malde wrote:
Florian Lengyel
writes: Why is there even any consideration of some committee if someone wants to mirror the Hackage site? Why not mirror the site?
+1
"Alright, Mr. Wiseguy," she said, "if you're so clever, you tell us what colour it should be."
We can either let Dan set up a mirror, and add it to the haskell.org DNS (or just let it live at a different address), and have a mirror up in a couple of hours -- or we can set up 501(c)(3)s (whatever they are), decide on a payment service, write bylaws, hire lawyers etc. All in the name of "not-dealing-with-this-shit"¹?
"Or"? There's no need to be exclusive. Fact is that haskell.org has some money and needs to deal with that, so incorporating is just a matter of time. But that doesn't preclude folks setting up mirrors or doing anything non-money related. All one needs to do is convince the DNS owners to add your IP# to their entry. 501(c)(3) is the legal term for a class of US organizations more typically known as "non-profit organizations". By incorporating as a 501(c)(3) you get the benefits and responsibilities of being a government-recognized organization (e.g., rights to use a company name and prevent others from using it, certain kinds of indemnification against legal action, ability to act as a legal entity in other ways, rights to collect money, responsibility to pay taxes,...) -- Live well, ~wren

On 12/5/10 02:41, Florian Lengyel wrote:
Why is there even any consideration of some committee if someone wants to mirror the Hackage site? Why not mirror the site?
Because it would be nice to have a mirror run by someone (a) accountable (b) who is unlikely to suddenly disappear due to loss of job, life becoming hectic, etc. (Consider that this is pretty much why *.haskell.org has been unreliable and fixes have been slow in coming; the individual in question is at Yale, and a good person but kinda snowed under of late.) -- brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On 12/4/10 10:34 PM, wren ng thornton wrote:
FWIW, I've been on the board of directors for a 501(c)(3), helped write their bylaws, and know a few people in the business (lawyers, etc). I'm willing to offer advice, effort, and references whenever the committee decides to do this.
I tried cc-ing my previous to committee@haskell.org but got

On 12/4/10 11:31 AM, Dan Knapp wrote:
With Hackage down, now seemed like a good time to push this issue again. It's such an important site to us that it's really rather a shame there are no mirrors of it. I have a personal-and-business server in a data center in Newark, with a fair chunk of bandwidth, which I'd like to offer for a permanent mirror.
Is there interest in this?
Absolutely.
Who do I need to talk to for it to happen?
I'd guess that'd be the haskell.org steering committee: http://haskellorg.wordpress.com/2010/11/15/the-haskell-org-committee-has-for... -- Live well, ~wren

I would really like mirrors too. But before that happens it would be nice to have signed packages on Hackage, to prevent a mirror from distributing compromised stuff (intentionally or unintentionally). -- Vincent

On 12/6/10 2:35 AM, Vincent Hanquez wrote:
I would really like mirrors too.
But before that happens it would be nice to have signed packages on Hackage, to prevent a mirror from distributing compromised stuff (intentionally or unintentionally).
+1. This should be done during sdist, before uploading, so that maintainers can be sure that the central mirror gets the right thing too. -- Live well, ~wren

Wow, this thread got long. Good! I'm hopeful that we can take some action now. :) My views on the issues that have been raised -
The Haskell steering committee is a good thing and I fully support them. I also support the current maintainer of the site; I don't want to take over or anything, only to assist. In fact, I'll go further, please don't anybody attempt to foist any high-level responsibility on me. I'm a bad receptacle for it. But I do have these technological resources at my disposal and there's no reason the community shouldn't benefit from them.
Re incorporation, the person who said that it has to happen was dead-on. So the rest of the discussion on that point is moot. But it's quite independent of when and how we set up mirroring.
I agree that signed packages are a good idea. We should move with all haste to implement them. But I'm not sure we want to hold up everything else while we wait for that. That's also my take on a peer-peer repository, as I said already. Can somebody who understands the technologies typically used for this suggest one, and possibly also talk to dcoutts directly to make him aware of the discussion and get his thoughts on how to implement it? I've found he often makes points that save me a lot of work. :)
I can certainly conceive of life events that could take my attention, despite all good intentions, in much the fashion that the current maintainer's often is. (That's awkward to say - what's his name, again? I know I should know it... It's not dcoutts, is it?) So I want to build something that works well with minimal manual intervention.
I was of the impression that most of the members of the steering committee were on this list, which is one reason I posted here. Is there some other way I should contact them?
I will talk to dcoutts, and see what the current status of the distributed-operation code is and figure out how much time I can devote to helping with that.
-- Dan Knapp "An infallible method of conciliating a tiger is to allow oneself to be devoured." (Konrad Adenauer)

Dan Knapp
I agree that signed packages are a good idea. We should move with all haste to implement them. But I'm not sure we want to hold up everything else while we wait for that.
IMO, mirroring is orthogonal to that, too.
That's also my take on a peer-peer repository, as I said already.
Do you mean a two-way sync here? I think it'd be way easier to just set up a slave repo using rsync, and let people edit their .cabal/configs. But I don't really know the internals, perhaps there are implementation details of cabal or hackage that complicates this? -k -- If I haven't seen further, it is by standing in the footprints of giants
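To make the "let people edit their .cabal/configs" suggestion above concrete, here is a small Python sketch that rewrites the repository entry in the text of a cabal-install config file. It assumes the `remote-repo: NAME:URL` line format used by cabal-install's `~/.cabal/config` at the time. The mirror name and URL are hypothetical placeholders, and the helper `point_config_at_mirror` is not part of any existing tool.

```python
def point_config_at_mirror(config_text: str, repo_name: str, mirror_url: str) -> str:
    """Return config_text with its remote-repo entry pointing at a mirror.

    Assumes entries look like "remote-repo: NAME:URL". The first such line
    is replaced; if there is none, a new entry is appended at the end.
    """
    new_line = f"remote-repo: {repo_name}:{mirror_url}"
    lines = config_text.splitlines()
    for i, line in enumerate(lines):
        # Matches "remote-repo:" but not "remote-repo-cache:" or "--" comments.
        if line.strip().startswith("remote-repo:"):
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = (
        "remote-repo: hackage.haskell.org:http://hackage.haskell.org/packages/archive\n"
        "remote-repo-cache: /home/user/.cabal/packages\n"
    )
    # "mirror.example.org" is a placeholder, not a real mirror.
    print(point_config_at_mirror(
        original, "mirror.example.org", "http://mirror.example.org/packages/archive"))
```

This covers only the client side. Keeping the mirror's copy current (for example with rsync, as suggested) is a separate job on the mirror host.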

On 12/7/10 08:07, Ketil Malde wrote:
Dan Knapp
writes: I agree that signed packages are a good idea. We should move with all haste to implement them. But I'm not sure we want to hold up everything else while we wait for that.
IMO, mirroring is orthogonal to that, too.
Only if you consider security a minor or non-issue. I'm tempted to say anyone who believes that on the modern Internet is at best naïve. (Although admittedly security is one of my work foci.) -- brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Brandon S Allbery KF8NH
IMO, mirroring is orthogonal to that, too.
Only if you consider security a minor or non-issue.
What I mean is that you can mirror a repository regardless of whether packages are signed or not.
I'm tempted to say anyone who believes that on the modern Internet is at best naïve.
It's not obvious to me that adding a mirror makes the infrastructure more insecure. Any particular concerns? (I hope I qualify as naïve here :-) -k -- If I haven't seen further, it is by standing in the footprints of giants

On Tue, Dec 07, 2010 at 11:04:04PM +0100, Ketil Malde wrote:
It's not obvious to me that adding a mirror makes the infrastructure more insecure. Any particular concerns? (I hope I qualify as naïve here :-)
If you run a mirror people will come to you for software to run on their machines. I see a way to take advantage of that immediately.

On 12/7/10 18:53, Darrin Chandler wrote:
On Tue, Dec 07, 2010 at 11:04:04PM +0100, Ketil Malde wrote:
It's not obvious to me that adding a mirror makes the infrastructure more insecure. Any particular concerns? (I hope I qualify as naïve here :-)
If you run a mirror people will come to you for software to run on their machines. I see a way to take advantage of that immediately.
Exactly. And this isn't theoretical; fake packages and packages with extra payloads injected into them are fairly common. -- brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Darrin Chandler
It's not obvious to me that adding a mirror makes the infrastructure more insecure. Any particular concerns? (I hope I qualify as naïve here :-)
If you run a mirror people will come to you for software to run on their machines. I see a way to take advantage of that immediately.
My apologies for not expressing myself more clearly. What I mean is that currently, Hackage has a ton of users, each of whom may at whim upload a new version of any library. It's not clear to me that security is significantly worsened by adding a mirror. Assume I am out with ill intent: I can now either a) set up a mirror, replace some central library with my evil trojan, launch a DOS attack against hackage.haskell.org to get users to switch, and gloat in my secret castle as I await the fruits of my cunning schemes -- or I can b) just upload my trojan library to hackage directly. http://flaam.org/~jont/humor/uke48/Friends_of_Irony/image007.jpg -k -- If I haven't seen further, it is by standing in the footprints of giants

On 08/12/10 08:13, Ketil Malde wrote:
My apologies for not expressing myself more clearly. What I mean is that currently, Hackage has a ton of users, each of whom may at whim upload a new version of any library. It's not clear to me that security is significantly worsened by adding a mirror.
Assume I am out with ill intent: I can now either a) set up a mirror, replace some central library with my evil trojan, launch a DOS attack against hackage.haskell.org to get users to switch, and gloat in my secret castle as I await the fruits of my cunning schemes -- or I can b) just upload my trojan library to hackage directly.
You have to start somewhere with security. I think that an uploaded trojan library would be at least detectable as such, since the uploading user would have changed (I'm not sure that's what you had in mind?). Whereas on a mirror, it would be completely transparent to the users. As a first step, having the hackage server and its users trusted, is hopefully reasonable. And then you can build up from there. It would be nice to be proactive before we actually detect such a thing, and we have to build a security infrastructure anyway ;) -- Vincent

Vincent Hanquez
You have to start somewhere with security.
Yes. And you should start with assessing how much cost and inconvenience you are willing to suffer for the improvement in security you gain. In this case, my assertion is that the marginal worsening of security by having a mirror of hackage even without signing of packages etc., is less than the marginal improvement in usability. I'm a bit surprised to find that there seems to be a lot of opposition to this view, but perhaps the existing structure is more secure than I thought? Or the benefit of a mirror is exaggerated - I can see how it would be annoying to have hackage down, but it hasn't happened to me, so perhaps those complaining about it just were very unlucky.
Whereas on a mirror, it would be completely transparent to the users.
Well - you could easily compare packages from the main repo and its mirror to verify the integrity. This isn't a lot harder than checking the details of the stuff cabal-install pulls in (which I admittedly never do either).
As a first step, having the hackage server and its users trusted, is hopefully reasonable.
Hard to evaluate before there is a concrete proposal - security is always a trade off, and you need to know what you get and what you pay. If you can outline the structure of how this could work, I'm happy to bikeshed it. -k -- If I haven't seen further, it is by standing in the footprints of giants
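The comparison mentioned above (checking packages from the main repo against the copies on a mirror) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes the main server publishes a SHA-256 digest per package tarball, which the thread does not say Hackage does. The function name `find_mismatches` and the package names are made up for the example.

```python
import hashlib
from typing import Dict, List


def find_mismatches(
    origin_digests: Dict[str, str],
    mirror_tarballs: Dict[str, bytes],
) -> List[str]:
    """Return the names of mirror packages that do not match the origin.

    origin_digests maps a package id to the hex SHA-256 digest published by
    the main server (an assumption of this sketch); mirror_tarballs maps a
    package id to the bytes actually downloaded from the mirror.
    """
    suspicious = []
    for name, data in mirror_tarballs.items():
        expected = origin_digests.get(name)
        actual = hashlib.sha256(data).hexdigest()
        # A package the origin does not list is flagged as well.
        if expected is None or expected != actual:
            suspicious.append(name)
    return sorted(suspicious)


if __name__ == "__main__":
    good = b"pretend this is the bytestring tarball"
    digests = {"bytestring-0.9": hashlib.sha256(good).hexdigest()}
    print(find_mismatches(digests, {"bytestring-0.9": good}))
    print(find_mismatches(digests, {"bytestring-0.9": good + b" plus a payload"}))
```

Note that this check still needs the main server to be reachable (or the digests to have been fetched earlier), so it detects a tampered mirror but does not by itself help during an outage.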

On Wed, Dec 08, 2010 at 11:41:31AM +0100, Ketil Malde wrote:
Vincent Hanquez
writes: You have to start somewhere with security.
Yes. And you should start with assessing how much cost and inconvenience you are willing to suffer for the improvement in security you gain. In this case, my assertion is that the marginal worsening of security by having a mirror of hackage even without signing of packages etc., is less than the marginal improvement in usability.
I'm a bit surprised to find that there seems to be a lot of opposition to this view, but perhaps the existing structure is more secure than I thought? Or the benefit of a mirror is exaggerated - I can see how it would be annoying to have hackage down, but it hasn't happened to me, so perhaps those complaining about it just were very unlucky.
Having one glaring security problem is not a good reason to introduce another one. It just makes more to fix. As for mirroring, I'm all in favor of any random user doing a mirror. The only place I see a problem is making those "official" mirrors. If you were to mirror and announce that you had one then I can trust you or not. There are some people I would trust to have valid mirrors. Darrin

On Wed, Dec 8, 2010 at 5:41 AM, Ketil Malde
I'm a bit surprised to find that there seems to be a lot of opposition to this view, but perhaps the existing structure is more secure than I thought?
The difference is in the ability to influence other packages and metadata, I think. You could upload a trojan to Hackage right now, but who would ever install it? You could go to the effort of becoming responsible for a package that people do use and then slip the trojan in later, but the update to the package will still be visible and--since this is now a package that people actually use--some do-gooder will probably stumble on your nefarious plot in the process of simple compatibility checking or such. On the other hand, by running a malicious mirror, nothing stops you from inserting (unsafePerformIO installRootKit) into the bytestring package with no indication of a change. All of this applies equally to Hackage as it stands, of course, the difference being the implicit trust the community puts in the people with administrative power over it. If someone else who already has that degree of informal trust put up a mirror I don't think anyone would have a problem using it. As always security is a matter of degree, but Hackage is just high-profile enough that a bit of care is probably warranted. And I suspect that most worthwhile interim solutions to add a bit of trust for mirrors would be almost as much effort as a complete solution. - C.

On Wed, Dec 8, 2010 at 8:29 AM, C. McCann
On Wed, Dec 8, 2010 at 5:41 AM, Ketil Malde
wrote: I'm a bit surprised to find that there seems to be a lot of opposition to this view, but perhaps the existing structure is more secure than I thought?
The difference is in the ability to influence other packages and metadata, I think. You could upload a trojan to Hackage right now, but who would ever install it?
I could upload a new version of mtl if I wanted. Plenty of people would install it. Luke

On 08/12/10 20:25, Luke Palmer wrote:
I could upload a new version of mtl if I wanted. Plenty of people would install it.
Correct me if I'm wrong: you would appear in the UploadedBy, and then you might be challenged by the traditional uploaders or attentive users (most users wouldn't know, of course) to give a reason for doing the upload. -- Vincent

On 08/12/10 10:41, Ketil Malde wrote:
Yes. And you should start with assessing how much cost and inconvenience you are willing to suffer for the improvement in security you gain. In this case, my assertion is that the marginal worsening of security by having a mirror of hackage even without signing of packages etc., is less than the marginal improvement in usability.
I'm a bit surprised to find that there seems to be a lot of opposition to this view, but perhaps the existing structure is more secure than I thought? Or the benefit of a mirror is exaggerated - I can see how it would be annoying to have hackage down, but it hasn't happened to me, so perhaps those complaining about it just were very unlucky.
You might have misunderstood what I was talking about. I'm proposing signing on the hackage server on reception of the package, so that cabal can verify that the package has been signed properly. This is not about end-to-end signing by every uploader, with a chain of trust and such (which has been proposed by wren). The impact on users should be minimal. I mean they shouldn't even know about it. It would only complain if the signature isn't valid. -- Vincent

My take on the issue is that we should make it possible to easily mirror hackage (what the OP asked for), so that people could use it when they wanted to, and have a list of the mirrors on the wiki. This way those who are interested can use them. Like when the mirror is faster/closer to them or to help out when hackage is temporarily down. Those who need the security can choose not to use mirrors, or make their own (private), or develop a secure scheme, when it doesn't exist yet. It's perfectly understandable that people doing work/serious stuff need the guarantees, but I bet many of us are just playing around and developing things for ourselves. -- Markus Läll

On 10/12/2010, at 12:18 AM, Markus Läll wrote:
My take on the issue is that we should make it possible to easily mirror hackage (what the OP asked for), so that people could use it when they wanted to, and have a list of the mirrors on the wiki. This way those who are interested can use them. Like when the mirror is faster/closer to them or to help out when hackage is temporarily down. Those who need the security can choose not to use mirrors, or make their own (private), or develop a secure scheme, when it doesn't exist yet.
Have I misunderstood something? I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.

Richard O'Keefe
I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
At the very least, this assumes that you trust all the mirror operators. Sure, I'm trustworthy, but how about those other guys? >:) -=rsw

On 10/12/2010, at 10:50 AM, Riad S. Wahby wrote:
Richard O'Keefe
wrote: I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
At the very least, this assumes that you trust all the mirror operators.
Sure, I'm trustworthy, but how about those other guys? >:)
See the words "some sort of protocol between X and Y"? This means that Y has to be authenticated to X and X to Y and they use some sort of encryption scheme that prevents man-in-the-middle attacks. Right now, of course, nothing whatever stops someone building a 'robot' at X to visit Y periodically and update X; the missing piece is any kind of accreditation at Y.
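The 'robot' described above, which visits Y periodically and updates X, comes down to comparing two listings and acting on the difference. Below is a minimal Python sketch of that one-way planning step. It assumes each side can produce a mapping from package id to content digest, which is not something the thread says Hackage exposes. `plan_sync` and the package names are hypothetical, and the authentication between X and Y is deliberately left out.

```python
from typing import Dict, List, Tuple


def plan_sync(
    upstream: Dict[str, str],
    local: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Plan a one-way (read-only mirror) sync from upstream Y to local X.

    Both arguments map a package id to a content digest. Returns the ids to
    fetch (new or changed upstream) and the ids to delete (gone upstream).
    Nothing is ever pushed back to upstream.
    """
    to_fetch = sorted(
        name for name, digest in upstream.items() if local.get(name) != digest
    )
    to_delete = sorted(name for name in local if name not in upstream)
    return to_fetch, to_delete


if __name__ == "__main__":
    upstream = {"mtl-1.1": "aaa", "bytestring-0.9": "bbb"}
    local = {"mtl-1.1": "aaa", "bytestring-0.9": "old", "stale-0.1": "ccc"}
    print(plan_sync(upstream, local))
```

Because the plan only flows from Y to X, the mirror never accepts updates itself, which matches the read-only replica described above. Whether clients can trust what X then serves is the separate accreditation question.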

On Thu, Dec 9, 2010 at 11:04 PM, Richard O'Keefe
On 10/12/2010, at 12:18 AM, Markus Läll wrote:
My take on the issue is that we should make it possible to easily mirror hackage (what the OP asked for), so that people could use it when they wanted to, and have a list of the mirrors on the wiki. This way those who are interested can use them. Like when the mirror is faster/closer to them or to help out when hackage is temporarily down. Those who need the security can choose not to use mirrors, or make their own (private), or develop a secure scheme, when it doesn't exist yet.
Have I misunderstood something? I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
Yes, that's what I think of mirrors too. I don't know if that was what you meant, but yes those mirrors would be just passive copies of the real hackage server (no updates from a user), and serve as a place to download packages from until the original hackage comes back. But for the security issue, of course any host of a mirror could abuse that. But I think for non-critical stuff I wouldn't mind using the mirror if it has been shown to be trustworthy. And for people using Haskell a lot, if the making of your own mirror is as simple as installing some package on your webserver and running it, then this would be a great remedy against those hours when something has happened to hackage. -- Markus Läll

On 9 December 2010 21:04, Richard O'Keefe
On 10/12/2010, at 12:18 AM, Markus Läll wrote:
My take on the issue is that we should make it possible to easily mirror hackage (what the OP asked for), so that people could use it when they wanted to, and have a list of the mirrors on the wiki. This way those who are interested can use them. Like when the mirror is faster/closer to them or to help out when hackage is temporarily down. Those who need the security can choose not to use mirrors, or make their own (private), or develop a secure scheme, when it doesn't exist yet.
Have I misunderstood something? I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
That's certainly what we've been planning on, that anyone can run a mirror, no permissions needed. The issue people have raised is what about having public mirrors that are used automatically or semi-automatically by clients. The suggestion about DNS round robin is transparent to clients but requires all the mirrors to be a master, or to have some forwarding system. Any transparent system also needs trust.
My opinion is that at this stage it is not really worth doing anything complicated. We do not yet have a bandwidth problem. Once there are more (unprivileged) public and private mirrors then temporary downtime on the main server is less problematic. Eventually we'll get a bandwidth problem but I think we've got a fair bit of time to prepare and in the meantime we can get simple unprivileged mirroring working.
That is mostly an issue of specifications and tools. The spec for package archives is not as clear or as good as we'd like. We've been discussing it recently on the cabal-devel mailing list.
Duncan

Hello all,
Right now I'm trying to answer a simple question:
- Would the current Haskell.org / hackage infrastructure benefit from the donation of a dedicated VM with good bandwidth/uptime?
Whoever already knows how to do this could configure it.
In trying to answer the above question I found this long email thread from 1.5 years ago. Duncan said the following:
On Thu, Dec 9, 2010 at 6:47 PM, Duncan Coutts
That's certainly what we've been planning on, that anyone can run a mirror, no permissions needed. The issue people have raised is what about having public mirrors that are used automatically or semi-automatically by clients.
Are there any updates to this in the last year? Is anybody running a mirror? The other reason I've been thinking about this is the scoutess project. More public testing or continuous integration facilities would require more hardware resources. -Ryan

Hi,
On Thu, Apr 19, 2012 at 5:12 PM, Ryan Newton
- Would the current Haskell.org / hackage infrastructure benefit from the donation of a dedicated VM with good bandwidth/uptime?
I can think of at least one project (the one you mention below) that would benefit from it. But I think there are a *lot* more that I don't know about too.
Are there any updates to this in the last year? Is anybody running a mirror?
I know about http://hackage.factisresearch.com/ and http://hackage2.uptoisomorphism.net/ but they both run Hackage2.0 I think.
The other reason I've been thinking about this is the scoutess project. More public testing or continuous integration facilities would require more hardware resources.
Yes. We have talked about this with Duncan. He was wondering whether there was a way to get scoutess to handle the "build bot" part of Hackage2.0 and we will develop it so that it can. However, with Jeremy we intend to let people distribute their builds on several machines so you will not be forced to have one machine do all the work. Of course we are not there yet, but I thought you would appreciate hearing about what is planned for scoutess. -- Alp

On 19 April 2012 08:12, Ryan Newton
Hello all,
Right now I'm trying to answer a simple question:
Would the current Haskell.org / hackage infrastructure benefit from the donation of a dedicated VM with good bandwidth/uptime?
Whoever already knows how to do this could configure it.
In trying to answer the above question I found this long email thread from 1.5 years ago. Duncan said the following:
On Thu, Dec 9, 2010 at 6:47 PM, Duncan Coutts wrote:
That's certainly what we've been planning on, that anyone can run a mirror, no permissions needed. The issue people have raised is what about having public mirrors that are used automatically or semi-automatically by clients.
Are there any updates to this in the last year? Is anybody running a mirror?
I am. http://hackage.scs.stanford.edu/
The other reason I've been thinking about this is the scoutess project. More public testing or continuous integration facilities would require more hardware resources.
The computer it's running on has 16 cores and 48GB of ram. I have access to a few other computers like this. Cheers, David
-Ryan
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Oh yes, it's hackage2... not hackage1.

On Thu, 2012-04-19 at 11:12 -0400, Ryan Newton wrote:
Hello all,
Right now I'm trying to answer a simple question: * Would the current Haskell.org / hackage infrastructure benefit from the donation of a dedicated VM with good bandwidth/uptime? Whoever already knows how to do this could configure it.
In trying to answer the above question I found this long email thread from 1.5 years ago. Duncan said the following:
On Thu, Dec 9, 2010 at 6:47 PM, Duncan Coutts wrote:
That's certainly what we've been planning on, that anyone can run a mirror, no permissions needed. The issue people have raised is what about having public mirrors that are used automatically or semi-automatically by clients.
Are there any updates to this in the last year? Is anybody running a mirror?
Yes, we're running a public testing instance of the new hackage server at: http://hackage.factisresearch.com/ It has live mirroring running. This is in a VM donated by factis research, at least on a temporary basis to help with the testing of the new hackage server code.
I think the answer for the longer term is still yes. We have not yet discussed with Galois if the new hackage server should be hosted on their infrastructure. The new code does take more resources and is not based on apache, so it may not be appropriate to host it on the same machine as is currently used.
There's two options I think:
1. a machine for the central hackage server,
2. a machine for doing package builds
The former will require more organisation, partly because we need the haskell.org people to have some degree of control over the system. The latter is easier because the design allows for multiple clients to do builds rather than just one central machine. So all that requires is a user account to upload the data. (plus the small matter of a working build bot client software, which is where scoutess may help)
Duncan

There's two options I think: 1. a machine for the central hackage server, 2. a machine for doing package builds
The former will require more organisation, partly because we need the haskell.org people to have some degree of control over the system. The latter is easier because the design allows for multiple clients to do builds rather than just one central machine. So all that requires is a user account to upload the data. (plus the small matter of a working build bot client software, which is where scoutess may help)
I wonder if this could get to the point where it could be done seti-at-home style, farmed out via a VM image. That is, people would run the image to provide resources (and geographic distribution) to the build server cloud. Maybe they get a fast local mirror as a reward. If it were ever that easy I would certainly love to run a VM! -Ryan

I wonder if this could get to the point where it could be done seti-at-home style, farmed out via a VM image. That is people would run the image to provide resources (and geographic distribution) to the build server cloud. Maybe they get a fast local mirror as a reward.
If it were ever that easy I would certainly love to run a VM!
Surprisingly BOINC seems to *not* be virtualized and instead just runs native applications.

On 12/9/10 4:04 PM, Richard O'Keefe wrote:
On 10/12/2010, at 12:18 AM, Markus Läll wrote:
My take on the issue is that we should make it possible to easily mirror hackage (what the OP asked for), so that people could use it when they wanted to, and have a list of the mirrors on the wiki. This way those who are interested can use them. Like when the mirror is faster/closer to them or to help out when hackage is temporarily down. Those who need the security can choose not to use mirrors, or make their own (private), or develop a secure scheme, when it doesn't exist yet.
Have I misunderstood something? I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
The security issue is how does a client, C, know to trust X (maybe X is evil) or know to trust the transmission of data from Y to X (maybe a man in the middle corrupted things and X has become a confused deputy), etc. The concern isn't for the consistency of Y's data, it's for the consistency of X's data as a replica of Y's. -- Live well, ~wren

On 12/11/10 5:59 AM, wren ng thornton wrote:
On 12/9/10 4:04 PM, Richard O'Keefe wrote:
As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
The security issue is how does a client, C, know to trust X (maybe X is evil) or know to trust the transmission of data from Y to X (maybe a man in the middle corrupted things and X has become a confused deputy), etc.
P.S., X can't really be a "confused deputy" here since X has no special privileges[1], rather X would become more of a confused librarian: y'know, the kindly old but somewhat senile librarian who occasionally mistakes your requests (like that time they gave you Cujo when you asked for a book on the care and feeding of pets, or the time they gave you some writings by the Marquis de Sade when you were doing research for your anatomy class). [1] The implicit trust C has for X usually isn't counted as a "privilege" in the security world. -- Live well, ~wren

On 12/9/10 16:04, Richard O'Keefe wrote:
I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
The above assumes that the operator of the mirror is trustworthy. It wouldn't be difficult for a hostile party to set up a mirror, but then modify the packages to include malware payloads --- if the packages aren't signed. (Or even if they are signed if it's a sufficiently weak algorithm. MD5 is already unusable for the purpose.) Other possibilities include MITM attacks where the hostile party detects that someone is attempting to download a package and spoofs a reply that directs it to a different package. (Or more complex tricks; see http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.167.4096&rep=rep1&type=pdf for examples.) -- brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Sat, Dec 11, 2010 at 19:51, Brandon S Allbery KF8NH wrote:
On 12/9/10 16:04 , Richard O'Keefe wrote:
I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
The above assumes that the operator of the mirror is trustworthy. It wouldn't be difficult for a hostile party to set up a mirror, but then modify the packages to include malware payloads --- if the packages aren't signed. (Or even if they are signed if it's a sufficiently weak algorithm. MD5 is already unusable for the purpose.)
How about this, as a cheap and cheerful method to get up and running? If the premise is that the original server is trustworthy and the mirrors aren't, then:
1) Hash all packages on the original server.
2) The hash goes into a sidecar file (e.g. <packagename>.sha) that lives next to the package.
3) Modify cabal so that it can install from a mirror, but always gets the hash from the original server.
4) Before install, check that the hash is correct.
This gives you a few things:
1) Every package downloaded from a mirror is guaranteed to be the same as one downloaded from the original server. This seems to address most people's security concerns.
2) Although there's a transfer from the central server for every download, it's low bandwidth, so the majority of the load is transferred to the mirror.
3) If the central server goes down, a user could elect to ignore the hash and still get the package.
If this isn't enough then you're down the road of a GPG-based solution: setting up some signing keys for packages, distributing the public halves to all clients, etc. If that's the road you want, I'd suggest looking at how Debian solved the problem: http://wiki.debian.org/SecureApt
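A minimal sketch of the check in step 4, under some assumptions the proposal does not fix: the hash is SHA-256, and the sidecar file holds a hex digest as its first whitespace-separated token (the layout `sha256sum` prints). The function names are hypothetical, and fetching the tarball from the mirror and the sidecar from the original server is left to the caller.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_package(tarball: bytes, sidecar_text: str) -> bool:
    """Check a tarball (e.g. from a mirror) against a sidecar hash file.

    Assumption: the sidecar's first whitespace-separated token is the hex
    digest; anything after it (such as a file name) is ignored.
    """
    tokens = sidecar_text.split()
    if not tokens:
        # Empty or missing sidecar content: treat as a failed check.
        return False
    expected = tokens[0].lower()
    return sha256_hex(tarball) == expected
```

The check is only as strong as the channel the sidecar arrives over: if the sidecar itself is fetched from a spoofed server, a matching but malicious tarball would still pass.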

On 12/13/10 8:25 AM, Paul Sargent wrote:
How about, as a cheap and cheerful method to get up and running. If the premise is that the original server is trustworthy and the mirrors aren't, then:
1) Hash all packages on the original server. 2) Hash goes into a side car file (e.g. <packagename>.sha) that lives next to the package
I still contend that we shouldn't have to trust the central server either. The hash can be created alongside the sdist on the maintainer's computer, and then both are uploaded to central. Thus, the maintainer can verify that the hash on central matches their own, which ensures that: (a) the hash that central has is trustworthy (b) no man-in-the-middle corrupted the sending of the hash to central These concerns are separate from using the hash to confirm the consistency of the sdist itself. Remember: metadata can be compromised just as easily as data. And the fewer machines we have to trust, the better. Moreover, this approach requires the same amount of implementation work as getting central to make the hashes. -- Live well, ~wren

On 14/12/2010, at 2:25 AM, Paul Sargent wrote:
On Sat, Dec 11, 2010 at 19:51, Brandon S Allbery KF8NH wrote:
On 12/9/10 16:04, Richard O'Keefe wrote:
I thought "X is a mirror of Y" meant X would be a read-only replica of Y, with some sort of protocol between X and Y to keep X up to date. As long as the material from Y replicated at X is *supposed* to be publicly available, I don't see a security problem here. Only Y accepts updates from outside, and it continues to do whatever authentication it would do without a mirror. The mirror X would *not* accept updates.
The above assumes that the operator of the mirror is trustworthy. It wouldn't be difficult for a hostile party to set up a mirror, but then modify the packages to include malware payloads --- if the packages aren't signed. (Or even if they are signed if it's a sufficiently weak algorithm. MD5 is already unusable for the purpose.)
True, but right now we're vulnerable to man-in-the-middle attacks, DNS spoofing, and a whole lot of other things. If there is any way to be sure that what I see when I visit hackage.haskell.org is the *real* hackage, my browser doesn't know about it.
How about, as a cheap and cheerful method to get up and running. If the premise is that the original server is trustworthy and the mirrors aren't, then:
1) Hash all packages on the original server. 2) Hash goes into a side car file (e.g. <packagename>.sha) that lives next to the package 3) Modify cabal so that it can install from a mirror, but always gets the hash from the original server. 4) Before install you check the hash is correct.
This suffers from two problems.
A. I am willing to grant that the original server is trustworthy, but "DNS lookup gives me the address of the original server and not a spoofer" seems every bit as dodgy an assumption as the trustworthiness of the mirrors.
B. Wasn't the original motivation for wanting mirrors *availability*? If you have to get the hash from the original server and the original server is down, then having a mirror has done you no good at all.
Perhaps someone on this list who understands what CRAN does could explain it here. I know that the R install.packages(...) command goes through mirrors.
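Problem B can be made concrete with a small decision function. The sketch below is purely illustrative (the names and the `allow_unverified` switch are hypothetical): when the original server cannot supply a hash, the client must either refuse the install, which loses the availability benefit, or proceed unverified, which loses the security benefit.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    INSTALL = "install"
    REJECT = "reject"
    INSTALL_UNVERIFIED = "install-unverified"


def decide(
    actual_digest: str,
    origin_digest: Optional[str],
    allow_unverified: bool = False,
) -> Decision:
    """Decide what to do with a package downloaded from a mirror.

    `origin_digest` is None when the original server could not be reached.
    `allow_unverified` models the user explicitly opting out of the check.
    """
    if origin_digest is None:
        # Original server is down: there is nothing to verify against.
        return Decision.INSTALL_UNVERIFIED if allow_unverified else Decision.REJECT
    if actual_digest == origin_digest:
        return Decision.INSTALL
    # Digest mismatch: the mirror's copy differs from the original.
    return Decision.REJECT
```

Problem A is not addressed at all by this logic, since a spoofed original server would simply supply a digest that matches the attacker's tarball.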

On 9 December 2010 20:55, Vincent Hanquez
You might have misunderstood what I was talking about. I'm proposing signing on the hackage server on reception of the package, where it can be verified by cabal that the package hasn't been signed properly.
By "cabal", are you referring to Cabal or cabal-install? If the former, then I'm not sure how exactly it would do such verification since it doesn't have any notion of the internet as far as I'm aware; if the latter then it means absolutely nothing for those of us that do not use cabal-install for most packages. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On Thu, Dec 09, 2010 at 10:45:39PM +1100, Ivan Lazar Miljenovic wrote:
On 9 December 2010 20:55, Vincent Hanquez wrote:
You might have misunderstood what I was talking about. I'm proposing signing on the hackage server on reception of the package, where it can be verified by cabal that the package hasn't been signed properly.
By "cabal", are you referring to Cabal or cabal-install? If the former, then I'm not sure how exactly it would do such verification since it doesn't have any notion of the internet as far as I'm aware; if the latter then it means absolutely nothing for those of us that do not use cabal-install for most packages.
I don't really know the difference between Cabal and cabal-install, but something is downloading the .tar.gz, and that thing can always download an extra .tar.gz.sign file which contains a way to verify that the .tar.gz is genuinely the one that was received by hackage. Those not using the thing-that-downloads-the-archive to get their packages from hackage can build the same mechanism that downloads an extra file and checks the signature. Or they can even choose not to bother, and just download the package as they did before. Note that I'm not actually inventing anything new here; this is a common way to distribute software (Linux distributions, many open-source projects, etc). -- Vincent

Vincent Hanquez
You might have misunderstood what I was talking about. I'm proposing signing on the hackage server on reception of the package,
Okay, fair enough. You can't *enforce* this, of course, since I might work without general internet access but with a local mirror, but you could require me to run 'cabal --dont-check-signatures' or similar, so this would still make a hostile-operated mirror less useful.
OTOH, if I should suggest improving the security of Hackage, I would prioritize:
a) email the maintainer whenever a new upload is accepted - preferably with a notice about whether the build works or fails. Maybe also highlight the case when maintainer differs from uploader - if that doesn't give a ton of false positives.
b) email the *previous* maintainer when a new upload is accepted and the maintainer field has changed.
This way, somebody is likely to actually *notice* when some evil person uploads a trojan mtl or bytestring or whatever. The downside is more mail, and the people who run Hackage have been wary about this. So perhaps even this is on the wrong side of the cost/benefit fence.
(People with admin privileges (staff or hackers) to hackage can of course still work around everything - crypto signatures or email-schemes.)
-k -- If I haven't seen further, it is by standing in the footprints of giants
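Rules (a) and (b) amount to a small piece of decision logic, sketched below. This is not Hackage's behaviour; the type and function names are hypothetical, and maintainers are represented as plain strings for simplicity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class UploadNotice:
    recipients: FrozenSet[str]   # who should receive an email
    uploader_differs: bool       # highlight: uploader is not the maintainer
    maintainer_changed: bool     # highlight: maintainer field was changed


def plan_upload_notice(
    maintainer: str,
    uploader: str,
    previous_maintainer: Optional[str] = None,
) -> UploadNotice:
    """Work out whom to email when a new upload is accepted.

    `previous_maintainer` is None for the first upload of a package.
    """
    changed = previous_maintainer is not None and previous_maintainer != maintainer
    recipients = {maintainer}                # rule (a): always tell the maintainer
    if changed:
        recipients.add(previous_maintainer)  # rule (b): tell the previous one too
    return UploadNotice(frozenset(recipients), uploader != maintainer, changed)
```

For example, an upload of a package whose maintainer field changes from `alice` to `mallory` yields a notice addressed to both, which is the case the scheme is meant to surface.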

On Thu, Dec 9, 2010 at 7:04 AM, Ketil Malde
Vincent Hanquez writes:
You might have misunderstood what I was talking about. I'm proposing signing on the hackage server on reception of the package,
Okay, fair enough. You can't *enforce* this, of course, since I might work without general internet access but a local mirror, but you could require me to run 'cabal --dont-check-signatures' or similar, so this would still make a hostile-operated mirror less useful.
OTOH, if I should suggest improving the security of Hackage, I would prioritize:
a) email the maintainer whenever a new upload is accepted - preferably with a notice about whether the build works or fails. Maybe also highlight the case when maintainer differs from uploader - if that doesn't give a ton of false positives.
b) email the *previous* maintainer when a new upload is accepted and the maintainer field has changed.
This way, somebody is likely to actually *notice* when some evil person uploads a trojan mtl or bytestring or whatever. The downside is more mail, and the people who run Hackage have been wary about this. So perhaps even this is on the wrong side of the cost/benefit fence.
(People with admin privileges (staff or hackers) to hackage can of course still work around everything - crypto signatures or email-schemes.)
-k
Also, perhaps put the signatures on a separate machine from the one containing .tar.gz. For a 3rd party to corrupt a package, they'd need to hack 2 machines.

On 9 December 2010 23:04, Ketil Malde
[snip] This way, somebody is likely to actually *notice* when some evil person uploads a trojan mtl or bytestring or whatever. The downside is more mail, and the people who run Hackage have been wary about this. So perhaps even this is on the wrong side of the cost/benefit fence.
Whilst we're wishfully thinking, why not register a package to an individual or a group, and only allow them to upload it? Possibly linking this to a group on community.haskell.org or something. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On 4 December 2010 16:31, Dan Knapp
With Hackage down, now seemed like a good time to push this issue again. It's such an important site to us that it's really rather a shame there are no mirrors of it. I have a personal-and-business server in a data center in Newark, with a fair chunk of bandwidth, which I'd like to offer for a permanent mirror. Is there interest in this? Who do I need to talk to for it to happen?
At the recent hackathon we were working on hackage mirroring. By this we do not mean just using rsync to sync the current combination of filestore + cgi programs that make up the current hackage implementation. We want to make it easy to set up dumb or smart package archives and to do nearly-live mirroring. We have a prototype hackage-mirror client that can poll two servers and copy packages from one instance to the other. This assumes the target is a smart mirror (e.g. an instance of the new hackage-server impl). We also need to be able to target local "dumb" mirrors that are just passive collections of files.
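The poll-and-copy idea can be sketched as a set difference between two package indexes. This is only an illustration of the general technique, not the hackage-mirror client's actual code; the function names, the callbacks and the `(name, version)` representation are assumptions.

```python
from typing import Callable, Iterable, List, Tuple

PackageId = Tuple[str, str]  # (package name, version), both as strings


def packages_to_copy(
    source_index: Iterable[PackageId],
    target_index: Iterable[PackageId],
) -> List[PackageId]:
    """Packages present on the source but missing from the target."""
    missing = set(source_index) - set(target_index)
    return sorted(missing)  # sorted only to make the order deterministic


def mirror_once(
    list_source: Callable[[], Iterable[PackageId]],
    list_target: Callable[[], Iterable[PackageId]],
    copy_package: Callable[[PackageId], None],
) -> int:
    """One polling round: list both servers, copy what the target lacks.

    The three callbacks are hypothetical stand-ins for network operations.
    Returns the number of packages copied.
    """
    todo = packages_to_copy(list_source(), list_target())
    for pkg in todo:
        copy_package(pkg)
    return len(todo)
```

Running `mirror_once` on a timer would give nearly-live mirroring in one direction. For a "dumb" target, `copy_package` could simply write the tarball into a directory.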
Strategy-wise, I think the best approach is round-robin DNS, since that's transparent to the end user - everything would still appear at the URL it's at now, but behind-the-scenes magic would let things keep working when one or the other site is down. I haven't personally set up such a system before but I'm willing to take on the burden of figuring it out.
This is a somewhat orthogonal issue since I think you're talking about multiple master smart servers that can accept uploads. Duncan
participants (21)
- Alp Mestanogullari
- Brandon S Allbery KF8NH
- C. McCann
- Dan Knapp
- Darrin Chandler
- David Terei
- Duncan Coutts
- Florian Lengyel
- Ivan Lazar Miljenovic
- Jake McArthur
- Ketil Malde
- Lally Singh
- Luke Palmer
- Markus Läll
- Ozgur Akgun
- Paul Sargent
- Riad S. Wahby
- Richard O'Keefe
- Ryan Newton
- Vincent Hanquez
- wren ng thornton