
Fellow members of the shadowy Cabal,

I've been looking at making the mirror client supply the original uploading user/upload time when it mirrors a package.

I've developed an approach (attached) that, rather than PUTting a simple tarball to mirror, PUTs a combination of the tarball, user name and upload date in multipart/form-data format. This works (though the code is a bit grungy still). The only thing I'm worrying about before I tidy it up and commit is whether it is sufficiently RESTful: it feels a bit weird that the thing that you PUT is not what you get back from making a GET request to that URL.

An alternative approach would be to expose resources for the uploader/upload time of a package, which the mirror client could then simply PUT to.

Anyone have any thoughts on the right approach here?

Max
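For readers without the attachment, here is a rough sketch of what a combined multipart/form-data PUT body could look like. It is written in Python purely for illustration and is not the attached patch: the field names (`package`, `uploader`, `upload-time`), the timestamp format and the helper names are all assumptions.

```python
import uuid
from typing import List, Optional, Tuple

# (field name, optional filename, content type, raw bytes).
# The field names used below are hypothetical, not taken from the patch.
Part = Tuple[str, Optional[str], str, bytes]


def encode_multipart(parts: List[Part],
                     boundary: Optional[str] = None) -> Tuple[str, bytes]:
    """Encode parts as multipart/form-data; return (Content-Type header, body)."""
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, filename, content_type, payload in parts:
        disposition = f'form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        head = (
            f"--{boundary}\r\n"
            f"Content-Disposition: {disposition}\r\n"
            f"Content-Type: {content_type}\r\n"
            "\r\n"
        )
        chunks.append(head.encode("utf-8") + payload + b"\r\n")
    # Closing delimiter marks the end of the multipart body.
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


def mirror_put_body(tarball_name: str, tarball: bytes, uploader: str,
                    upload_time: str,
                    boundary: Optional[str] = None) -> Tuple[str, bytes]:
    """Bundle tarball, uploader name and upload time into one PUT body.

    The timestamp is passed through as an opaque string; the format the
    server expects is not stated in the thread.
    """
    text = "text/plain; charset=utf-8"
    return encode_multipart(
        [
            ("package", tarball_name, "application/x-gzip", tarball),
            ("uploader", None, text, uploader.encode("utf-8")),
            ("upload-time", None, text, upload_time.encode("utf-8")),
        ],
        boundary,
    )
```

The sketch also makes the RESTfulness worry concrete: the body that gets PUT is a three-part bundle, whereas a GET on the same URL would return only the tarball.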

On Mon, 2011-10-17 at 10:09 +0100, Max Bolingbroke wrote:
> Fellow members of the shadowy Cabal,
>
> I've been looking at making the mirror client supply the original uploading user/upload time when it mirrors a package.
>
> I've developed an approach (attached) that, rather than PUTting a simple tarball to mirror, PUTs a combination of the tarball, user name and upload date in multipart/form-data format. This works (though the code is a bit grungy still). The only thing I'm worrying about before I tidy it up and commit is whether it is sufficiently RESTful: it feels a bit weird that the thing that you PUT is not what you get back from making a GET request to that URL.

Yeah, I had the same reaction when considering this previously.

> An alternative approach would be to expose resources for the uploader/upload time of a package, which the mirror client could then simply PUT to.

Yes. I think that is the right thing to do. In principle it's hardly any more expensive (given pipelined HTTP requests) and it's a nicer design.

There are some other issues to consider. How does the mirror client pick the user account and make sure it exists? This turns into the more general problem of mapping user accounts between domains. Sadly I don't think there is one policy that fits all circumstances. It depends on what the user knows about the relationship between the servers they're mirroring between.

If no policy is given, probably a reasonable default is to not set any uploader account and to just set the upload time (probably that means we should have the uploader account be Nothing rather than set to the mirroring client).

The new server identifies user accounts by id. User names are permitted to change, but the user id remains the same (like Unix accounts). It exposes both the user id and name in the package index. I'm not sure if we currently expose the set of users and names in some other useful way (i.e. a single resource providing a name <-> uid mapping).

The old server identifies users only by name. For mirroring from the old server I think the sensible policy is:

* take the username from the old server
* look it up in the user db on the new server
* if it exists, assume these are corresponding accounts
* if it does not exist, create a new disabled user account with that name
* set the chosen account as the package uploader

Another policy that would work for mirroring between new server implementations is to assume the user ids match, or to make use of a supplied uid mapping table.

If we don't find a corresponding account and don't want to use a policy of creating disabled accounts then it's probably best to use no uploader name, and just set the upload time.

Initially I think we only need the "null" policy of setting upload time only, and the policy useful for a live public mirror of the central hackage server.

Duncan
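The uploader-mapping policies described in the message above can be sketched roughly as follows. This is an illustration in Python only, not code from the server: `Account`, `UserDb` and the three policy functions are hypothetical names, and the in-memory dictionary merely stands in for the new server's user database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Account:
    uid: int        # stable identifier, as on the new server
    name: str       # may change over time
    enabled: bool


@dataclass
class UserDb:
    """Toy stand-in for the new server's user db (accounts keyed by id)."""
    accounts: Dict[int, Account] = field(default_factory=dict)

    def lookup_by_name(self, name: str) -> Optional[Account]:
        for account in self.accounts.values():
            if account.name == name:
                return account
        return None

    def create_disabled(self, name: str) -> Account:
        uid = max(self.accounts, default=-1) + 1
        account = Account(uid=uid, name=name, enabled=False)
        self.accounts[uid] = account
        return account


def null_policy(db: UserDb, old_username: str) -> Optional[int]:
    """'Null' policy: record no uploader, so only the upload time is set."""
    return None


def by_name_policy(db: UserDb, old_username: str) -> Optional[int]:
    """Policy for mirroring from the old, name-keyed server.

    Reuse the account with the same name if one exists; otherwise create
    a disabled account with that name. Returns the chosen user id.
    """
    account = db.lookup_by_name(old_username)
    if account is None:
        account = db.create_disabled(old_username)
    return account.uid


def by_uid_map_policy(uid_map: Dict[int, int], old_uid: int) -> Optional[int]:
    """Policy for mirroring between new servers with a supplied uid table.

    An unmapped id falls back to 'no uploader' rather than a new account.
    """
    return uid_map.get(old_uid)
```

Returning `None` for "no uploader" mirrors the suggestion that the uploader field be Nothing rather than the mirroring client's own account.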

On 18 October 2011 20:17, Duncan Coutts wrote:
> Initially I think we only need the "null" policy of setting upload time only, and the policy useful for a live public mirror of the central hackage server.

I've implemented this:

* Packages default to being uploaded by the mirrorer at the standard time, just as now
* Two URLs exist (e.g. /package/edit-distance-0.2.1/edit-distance-0.2.1.tar.gz/upload-time and /package/edit-distance-0.2.1/edit-distance-0.2.1.tar.gz/uploader) that any mirrorer can PUT to in order to set these two bits of information
* The mirror client does make use of these URLs
* PUTting a user name that does not currently exist on the server creates an historical account

I didn't take the extra step of using a Maybe type to explicitly distinguish packages with no uploader other than the mirrorer.

BTW, what is the motivation for being able to rename users? This extra layer of indirection makes quite a few things more ugly (e.g. currently we can only build URLs by *username*; there are no URLs to select users by id), and is fairly unusual: in my experience very few sites allow account renaming.

Max
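To make the two metadata resources concrete, here is a small client-side sketch in Python. It only builds the PUT requests and does not send them. The URL layout follows the example in the message above; the host name, the plain-text request bodies, the timestamp format and the function names are assumptions, since the thread does not say what the server expects.

```python
from typing import List, Optional
from urllib.parse import quote
from urllib.request import Request


def tarball_path(name: str, version: str) -> str:
    """Path of a package tarball, following the URL layout quoted above."""
    pkgid = quote(f"{name}-{version}")
    return f"/package/{pkgid}/{pkgid}.tar.gz"


def metadata_requests(base_url: str, name: str, version: str,
                      uploader: Optional[str],
                      upload_time: str) -> List[Request]:
    """Build the PUT requests a mirror client might issue after the tarball.

    Assumption: both resources accept a plain-text body. If `uploader` is
    None, no uploader PUT is built and the server default applies.
    """
    base = base_url.rstrip("/") + tarball_path(name, version)
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    requests = [
        Request(base + "/upload-time", data=upload_time.encode("utf-8"),
                headers=headers, method="PUT")
    ]
    if uploader is not None:
        requests.append(
            Request(base + "/uploader", data=uploader.encode("utf-8"),
                    headers=headers, method="PUT")
        )
    return requests
```

Each request could then be sent with `urllib.request.urlopen`, with whatever authentication the target server requires for mirrorers.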
participants (2): Duncan Coutts, Max Bolingbroke