A few days ago, in Squaring Zooko’s Triangle, I sketched out a proposal for a new naming scheme which to a limited extent achieved all of the contradictory-seeming aims set out in Zooko’s Triangle. Having discussed the essay with a few folk since, it seems like it might be worthwhile trying to clarify the ideas behind it.
(Updated 2012-04-27 to use Tahoe-LAFS’s superior terminology, and correct some related errors.)
A brief aside before we start: Zooko’s Triangle is actually a tetrahedron. There are four desirable properties of a name, and we know how to achieve any three of these four. You hear it discussed as a triangle because authors take one of the four points as so essential and non-negotiable that it isn’t even discussed, but different authors take different points as essential so it can be confusing to read multiple accounts of it. The four desiderata are
Sacrifice any one of these and there’s an existing system that provides the other three.
In this framework, the original article on Zooko’s Triangle assumes transferability is non-negotiable, Clay Shirky’s essay assumes that specificity is non-negotiable. Petnames assumes that authoritylessness is non-negotiable, and describes a system that tries to achieve all three goals by switching seamlessly between the three different kinds of names that remain.
Now back to my proposal, which attempts to achieve all four. This proposal combines three ideas towards this goal.
I definitely don’t claim that what I propose here does away with the need for systems like Petnames, by the way; it has its own disadvantages and is meant only to complement those systems.
We start with PGP key fingerprints, which look like this:
Specific, decentralized, transferrable, and totally unmemorable.
So, the first idea is to use word lists instead of hexadecimal to improve readability. I went to Diceware for my wordlist, since that wordlist is specifically meant to provide a long list of short memorable words for cryptographic purposes. Perhaps there are other suitable lists – do let me know – but that one is certainly good enough for a demonstration of principle. I stripped out all the ones containing special characters, since they are inconvenient for a variety of reasons and it only makes the list slightly shorter. The resulting list contains 7322 words, so you can encode about 12.8 bits with each word, and 160 bits comes to 13 words:
Well, that’s an improvement, but it’s still a long way from what I’d call “memorable”. It might be memorable if we could cut it down to something much shorter – I aimed for five words:
Now that’s getting close to memorable. It’s also completely insecure. At five words, it’s encoding about 64 bits. If that’s your name, and I want to imitate you, I have to generate and hash 264 public keys before I find one that matches this one; because it’s in bulk the public key generation can be done very cheaply, so the only real cost is the hash. This is comparable to the effort involved in the distributed RC5-64 attack which was successfully completed five years ago and would be perhaps ten times easier today.
Actually, the situation is much worse than this, but I’ll come to that in a moment.
We can foil this attack by increasing the number of words, but at the cost of memorability. Instead we make this attack more expensive in a different way: we increase the cost of generating a name given a public key. One possible technique would be something like key stretching; we could subject the key to an iterated hash with many steps before generating the name from it. If we used for example 224 steps, then generating a name would take 10-20 seconds – acceptable, since generating names is rare – while the difficulty of our attack goes up to 288 steps, well outside the range of even the best funded attacker today.
Unfortunately, this approach also slows down verifying a public key given a name to 10-20 seconds, and that’s much more of a problem because that’s a frequent operation. On a small device like a phone, it would be much worse; again, you can probably sacrifice generating names on your phone but you won’t want to sacrifice verifying them. Instead, the second idea: we use a proof-of-work based scheme. A more complete proposal should probably use one of the more sophisticated proof-of-work schemes out there; I saw a great one recently involving function inversion to force a minimum memory usage on the attacker, increasing the cost of a parallel hardware attack at no cost to real users, but I can’t now find the reference. Again, for proof of concept, I’ve used a pretty simple scheme.
In this scheme, you hash the public key together with a nonce to generate the name, and the combination is only valid if the last 24 bits of the hash are all zero; then you use the first 64 bits to generate the name. You’ll have to try about 224 different nonces before you find a valid one, so name generation takes 10-20 seconds as before, but now you supply the nonce along with the public key to someone verifying your name, and they can verify the name instantly with a single hash function invocation, making sure to check that you’ve done your work and the last 24 bits are zero.
This plan has another advantage too: there’s now more than one name for each public key. That means that if the name the system gives you is a particularly unfortunate choice of words, you can just run it again, wait another 10-20 seconds, and get another one. I suspect this is pretty much an essential property of any such scheme.
We’re not quite finished, because our attacker has another trick up their sleeve: multiple-target attacks. All of the foregoing assumes they’re trying to imitate one person: you. Perhaps they want to imitate either you or your sister? Now every name they generate has two chances of matching – one against your name, and one against your sisters – and so on average it will take them half as long to find a match. What if they’re happy to match any of the 26,000 people who work at the Pentagon? What if they’re happy to match anyone in the world? In a world where public keys are in widespread use by nearly everyone, this could make their attack easier by a factor of 232 – this probably isn’t the most realistic attack scenario, but in crypto it pays to be pessimistic, and this brings the cost of the attack down to a very achievable 256 steps, even given the countermeasures described above.
Again, we could prevent that by making the name longer – another two components pushes the cost back up to over 281. However, we can make a lesser sacrifice of memorability with the third idea: we add a component to the name that you choose. Your name might be a good choice, but you can choose a pseudonym or a handle so long as you think that other people making the same choice will be reasonably rare. We hash this “handle” in along with the nonce and the public key. Now our attacker can’t attack all the names at once – only all the names with the same handle, which hopefully is a much smaller subset. Even if there are a dozen John Smiths working at the Pentagon, the gain for our attacker is vastly reduced – they’re still looking at a work factor of over 284 to hit any of them. Of course our attacker can make up as many names with this prefix as they like, but it doesn’t help them much, because they’ll just end up attacking their own names rather than the names they care about. And this big improvement in security comes at a relatively small price in memorability, since the “handle” is chosen to be meaningful to both sender and recipient.
You can improve memorability further by using a mnemonic. I’ve been able to memorize the randomly generated name I used as an example by imagining a square meter pool of oil dripping slowly onto the opening credits of a film: area-fluid-above-movie-start.
I’d like to get some feedback and look into using a more sophisticated proof-of-work scheme before I propose an implementation of this. However, one further improvement comes to mind: it might make life a lot easier if I used a URI-like syntax. Something like this: