Suppose you are downloading a new fun game for your computer and you want to know whether it is going to do what it claims (clicking on cows, let's say) or whether it is going to send all your data (credit card numbers you type, let's say) to a shadowy cabal in Martha's Vineyard or Napa Valley or whereever shadowy cabals are found these days. For the sake of argument, let's say that you have heard good things about the a (hypothetical) open source project called FreeCowClicker2 written by Ilia Bogomips and you want to try it out.
Well, in some cases the authors of FreeCowClicker2 might run a download site and you might get it there, but most of the time you'll probably be getting it from a package repository, such as a linux distribution, a programming-language-specific repository such as CPAN (perl), PyPI (python), rubygems (ruby), or something like addons.mozilla.org. How do I know I'm getting the package I want, if (a) I am connecting to a potentially dodgy WiFi access point or there is some other way in which the shadowy cabal has gotten into my network, or (b) one of the servers involved in serving up the files, or mirroring them, is under the control of our shadowy cabal?
If you are a little bit familiar with this stuff, you are probably saying "signed packages", as found in for example Fedora or Debian. And that indeed is what I'm getting at, specifically TUF (The Update Framework). TUF aims to be usable by any package repository, but the most effort to date has been to using it for PyPI.
As part of Square's hack week which just concluded, a number of us looked into using TUF with rubygems, and wrote some code to that end. Hopefully that code helps clarify what this is all about, and there is a fair bit of documentation on the TUF site, so I'll just mention a few of the high points and interesting details:
- TUF can upgrade its keys. Your package installer might find there are new keys (signed by the old keys) and switch to them.
- There are multiple keys for different things. Anyone who contributes a package can have a key which is just good for signing that one package, there are separate keys which are used to say what packages were released as of a given time, and there are keys which are just used to sign the frequently used keys. Some of the keys can be kept offline, and only used maybe a few times a year.
- TUF is fairly easy to work with. The public keys and signatures and such are kept in JSON, which means you can parse them with, say, ruby or jq.
- One little example of something they thought of: when you get a reference to another TUF file you also get a signed length. That way an attacker can't substitute a multi-gigabyte file and cause a denial of service by tying up your network or computer. You just need to download as many bytes as the signature told you to, and can quit after that.
There's plenty left to do to finish the job of getting rubygems to use TUF, so go look at the pull request and start pitching in if you feel so inclined. Based on a week of looking at it, TUF does look like a solid basis for a more secure rubygems which preserves all the cool things about rubygems like letting people release gems often, letting anyone author a gem, etc. Likewise, it also seems promising for other package repositories.