
Along the lines of http://blog.patch-tag.com/2010/03/13/mirroring-patch-tag/ for downloading all patch-tag.com repositories, I've begun to wonder how to download all GitHub repositories, since more and more people seem to be using it. Nothing in http://develop.github.com/ seems especially useful for grabbing the git:// URLs of all repos by language; it only lists repos by user. The only real lists of repos by language seem to be http://github.com/languages/Haskell/updated and http://github.com/languages/Haskell/created . (You might think http://github.com/languages/Haskell would be good, but no, it's just a few random repos picked by interest, not a full listing.) I looked at the HTML, and it looks possible to use tagsoup to get all 98 pages, parse the entries to get the HTTP URLs of the repos, and then turn *those* into git:// URLs suitable for shelling out to 'git clone'. Still, I can't help wondering whether there's a better approach that someone more familiar with GitHub would know. -- gwern
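
For illustration, here is a minimal sketch of the parse-and-convert steps described above. It uses Python's standard-library html.parser in place of tagsoup, and it makes several assumptions that are not confirmed anywhere in this post: that repository links on a listing page appear as site-relative hrefs of the form /user/repo, that "languages" is the only reserved first path segment that needs skipping, and that the clone URL is simply the page URL with the scheme swapped and ".git" appended. The names RepoLinkParser, to_git_url, git_urls_from_page and RESERVED are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: first path segments that are site pages rather than user names.
RESERVED = {"languages"}


class RepoLinkParser(HTMLParser):
    """Collect hrefs that look like /user/repo paths (assumed page layout)."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Only site-relative links; skip protocol-relative ones like //host/x.
        if not href.startswith("/") or href.startswith("//"):
            return
        parts = href.strip("/").split("/")
        # Assumption: a repository link has exactly two path segments.
        if len(parts) != 2 or not all(parts) or parts[0] in RESERVED:
            return
        if href not in self.paths:  # drop duplicates, keep page order
            self.paths.append(href)


def to_git_url(http_url):
    """Turn http://host/user/repo into git://host/user/repo.git."""
    parsed = urlparse(http_url)
    path = parsed.path.rstrip("/")
    if not path.endswith(".git"):
        path += ".git"
    return "git://" + parsed.netloc + path


def git_urls_from_page(html, base="http://github.com"):
    """Parse one listing page's HTML and return git:// URLs for its repos."""
    parser = RepoLinkParser()
    parser.feed(html)
    return [to_git_url(base + path) for path in parser.paths]
```

Each returned URL could then be handed to something like subprocess.run(["git", "clone", url]). Fetching the listing pages themselves is left out here, since the real markup would need to be checked first.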