technology from back to front

Squaring Zooko’s Triangle, part two

A few days ago, in Squaring Zooko’s Triangle, I sketched out a proposal for a new naming scheme which to a limited extent achieved all of the contradictory-seeming aims set out in Zooko’s Triangle. Having discussed the essay with a few folk since, it seems like it might be worthwhile trying to clarify the ideas behind it.

(Updated 2012-04-27 to use Tahoe-LAFS’s superior terminology, and correct some related errors.)

A brief aside: Zooko’s Tetrahedron

A brief aside before we start: Zooko’s Triangle is actually a tetrahedron. There are four desirable properties of a name, and we know how to achieve any three of these four. You hear it discussed as a triangle because authors take one of the four points as so essential and non-negotiable that it isn’t even discussed, but different authors take different points as essential so it can be confusing to read multiple accounts of it. The four desiderata are

  • Specificity: one party has a better claim to the name than any other, and can establish that claim.
  • Memorability: you can remember the name if you see it on the side of a bus.
  • Transferability: you can give the name to someone else, and it means the same to them as it does do you
  • Decentralization: there is no central body to whom ultimate authority over the entire namespace is delegated

Sacrifice any one of these and there’s an existing system that provides the other three.

  • Sacrifice specificity and you get ordinary names and nicknames. I can claim to be “ciphergoth” with no authority, and you can just take in on trust from me if you like.
  • Sacrifice memorability and you get PGP key fingerprints – I can generate a key with a fingerprint any time I like, and if the system is secure, no-one else can for example sign a message with a key matching that fingerprint.
  • Sacrifice transferability, and I can locally bind a memorable name to a public key, as for example my ssh cache does.
  • Sacrifice decentralization and you get domain names, as maladministered by ICANN/Verisign.

In this framework, the [old link: https://www.zooko.com/distnames.html ] original article on Zooko’s Triangle assumes transferability is non-negotiable, Clay Shirky’s essay assumes that specificity is non-negotiable. Petnames assumes that authoritylessness is non-negotiable, and describes a system that tries to achieve all three goals by switching seamlessly between the three different kinds of names that remain.

Zooko Shirky Steigler Tahoe-LAFS
human-readable memorable memorable memorable
decentralized non-political - decentralized
secure - securely unique specific
- global global transferable

Word lists

Now back to my proposal, which attempts to achieve all four. This proposal combines three ideas towards this goal.

I definitely don’t claim that what I propose here does away with the need for systems like Petnames, by the way; it has its own disadvantages and is meant only to complement those systems.

We start with PGP key fingerprints, which look like this:

ECA1 0308 8103 0F40 3983 FC7E F971 8B0F 70C1 A697

Specific, decentralized, transferrable, and totally unmemorable.

So, the first idea is to use word lists instead of hexadecimal to improve readability. I went to Diceware for my wordlist, since that wordlist is specifically meant to provide a long list of short memorable words for cryptographic purposes. Perhaps there are other suitable lists – do let me know – but that one is certainly good enough for a demonstration of principle. I stripped out all the ones containing special characters, since they are inconvenient for a variety of reasons and it only makes the list slightly shorter. The resulting list contains 7322 words, so you can encode about 12.8 bits with each word, and 160 bits comes to 13 words:

haiti-blotch-emma-wage-dusty-galaxy-dw-skate-mid-quiz-ivan-cramp-lamb

Proof of work

Well, that’s an improvement, but it’s still a long way from what I’d call “memorable”. It might be memorable if we could cut it down to something much shorter – I aimed for five words:

haiti-blotch-emma-wage-dusty

Now that’s getting close to memorable. It’s also completely insecure. At five words, it’s encoding about 64 bits. If that’s your name, and I want to imitate you, I have to generate and hash 264 public keys before I find one that matches this one; because it’s in bulk the public key generation can be done very cheaply, so the only real cost is the hash. This is comparable to the effort involved in the distributed RC5-64 attack which was successfully completed five years ago and would be perhaps ten times easier today.

Actually, the situation is much worse than this, but I’ll come to that in a moment.

We can foil this attack by increasing the number of words, but at the cost of memorability. Instead we make this attack more expensive in a different way: we increase the cost of generating a name given a public key. One possible technique would be something like key stretching; we could subject the key to an iterated hash with many steps before generating the name from it. If we used for example 224 steps, then generating a name would take 10-20 seconds – acceptable, since generating names is rare – while the difficulty of our attack goes up to 288 steps, well outside the range of even the best funded attacker today.

Unfortunately, this approach also slows down verifying a public key given a name to 10-20 seconds, and that’s much more of a problem because that’s a frequent operation. On a small device like a phone, it would be much worse; again, you can probably sacrifice generating names on your phone but you won’t want to sacrifice verifying them. Instead, the second idea: we use a proof-of-work based scheme. A more complete proposal should probably use one of the more sophisticated proof-of-work schemes out there; I saw a great one recently involving function inversion to force a minimum memory usage on the attacker, increasing the cost of a parallel hardware attack at no cost to real users, but I can’t now find the reference. Again, for proof of concept, I’ve used a pretty simple scheme.

In this scheme, you hash the public key together with a nonce to generate the name, and the combination is only valid if the last 24 bits of the hash are all zero; then you use the first 64 bits to generate the name. You’ll have to try about 224 different nonces before you find a valid one, so name generation takes 10-20 seconds as before, but now you supply the nonce along with the public key to someone verifying your name, and they can verify the name instantly with a single hash function invocation, making sure to check that you’ve done your work and the last 24 bits are zero.

This plan has another advantage too: there’s now more than one name for each public key. That means that if the name the system gives you is a particularly unfortunate choice of words, you can just run it again, wait another 10-20 seconds, and get another one. I suspect this is pretty much an essential property of any such scheme.

Handle

We’re not quite finished, because our attacker has another trick up their sleeve: multiple-target attacks. All of the foregoing assumes they’re trying to imitate one person: you. Perhaps they want to imitate either you or your sister? Now every name they generate has two chances of matching – one against your name, and one against your sisters – and so on average it will take them half as long to find a match. What if they’re happy to match any of the 26,000 people who work at the Pentagon? What if they’re happy to match anyone in the world? In a world where public keys are in widespread use by nearly everyone, this could make their attack easier by a factor of 232 – this probably isn’t the most realistic attack scenario, but in crypto it pays to be pessimistic, and this brings the cost of the attack down to a very achievable 256 steps, even given the countermeasures described above.

Again, we could prevent that by making the name longer – another two components pushes the cost back up to over 281. However, we can make a lesser sacrifice of memorability with the third idea: we add a component to the name that you choose. Your name might be a good choice, but you can choose a pseudonym or a handle so long as you think that other people making the same choice will be reasonably rare. We hash this “handle” in along with the nonce and the public key. Now our attacker can’t attack all the names at once – only all the names with the same handle, which hopefully is a much smaller subset. Even if there are a dozen John Smiths working at the Pentagon, the gain for our attacker is vastly reduced – they’re still looking at a work factor of over 284 to hit any of them. Of course our attacker can make up as many names with this prefix as they like, but it doesn’t help them much, because they’ll just end up attacking their own names rather than the names they care about. And this big improvement in security comes at a relatively small price in memorability, since the “handle” is chosen to be meaningful to both sender and recipient.

You can improve memorability further by using a mnemonic. I’ve been able to memorize the randomly generated name I used as an example by imagining a square meter pool of oil dripping slowly onto the opening credits of a film: area-fluid-above-movie-start.

I’d like to get some feedback and look into using a more sophisticated proof-of-work scheme before I propose an implementation of this. However, one further improvement comes to mind: it might make life a lot easier if I used a URI-like syntax. Something like this:

zooko://area-fluid-above-movie-start/Paul Crowley
by
Paul Crowley
on
21/11/07
  1. The function inversion scheme you were trying to remember was Penny Black out of microsoft research.

  2. The function inversion scheme you were trying to remember was Penny Black out of microsoft research.

    One problem that I see in using a series of mnemonics for the handle is that they are not really “memorable.” They seem to be a small step above a fingerprint, but while you can make a case that it is easy for you to remember you own mnemonic I think it is harder to claim that others will be able to remember your mnemonic. How hard would the attack be if I only had to find mnemonics that were close enough to yours that they would fool a casual observer? Think about the early versions of phishing that used sound-alikes and international charsets (hmmm… that is another issue: your wordlist is english, how can I remember a handle wordlist if this list is internationalized or if the list is language-specific how can we deal with recognizing wordlists from a language which is not one you speak/read?)

  3. I was reminded of this post today, and I still have no difficulty remembering the “area-fluid-above-movie-start” example I used at the start – and that was genuinely randomly chosen, though I rejected about a dozen candidates to get it.

  4. Raoul Duke
    on 07/02/09 at 12:54 am

    @”How hard would the attack be if I only had to find mnemonics that were close enough to yours that they would fool a casual observer?”

    that always bugs me, and i was glad to come across other people thinking about it.

    http://freeworld.thc.org/papers/ffp.html#tth_sEc2.3

  5. I’d suggest a different word from “authorityless”. The other words are properties of the individual names in a naming system, whereas “authorityless” is a property of the entire system itself.

    Indeed, unguessable URLs may carry authority and, hence, calling them “authorityless” is confusing to some.

    I’d suggest “CentrallyAdministered” or something else along those lines.

  6. Toby – agreed, another name might be desirable. BTW here’s a handy guide to what everyone means by the different terms they use:

    Zooko Shirky Steigler Me
    human-readable memorable memorable memorable
    decentralized non-political - authorityless
    secure - securely unique secure/unique
    - global global transferable
  7. This is our preferred terminology:

    a) memorable
    b) decentralized
    c) specific
    d) transferable

    Rationale:

    a) it’s not enough for a name to be readable; memorability by a typical human is what is important. We’d suggest “human-memorable” but that’s probably clear enough from just “memorable”.

    b) “non-political” may be an indirect consequence of the property, but is not the property itself. “authorityless” is confusing about precisely what lacks authority. (It’s not the name; is it the ability to determine the binding? But the ability to determine even a local binding is an authority.) “[Not] centrally administered” as suggested by Toby means essentially the same thing as “decentralized”, but the latter is more concise and is the term used by Zooko.

    c) “secure” is too vague. The security property that is meant is that a name always designates a specific object — the one specified by its creator. Tyler Close calls this the “Y-property”, but I think that’s too jargony. A name with the unique reference property can be called a “lambda name”, but that often also implies that the name is local to a given scope.

    (BTW, your description “one person has a better claim to the name than any other, and can establish that claim”, assumes that the name is designating a person, which is just a special case.)

    d) “global” is an overspecification: it implies that everyone knows the name, when in fact what is meant is that anyone who knows the name can use it (consider a sparse capability, for instance). “transferable” captures that pretty well.

  8. Rereading Zooko’s original article, it uses “human-meaningful” as opposed to “memorable”. These mean significantly different things: your diceware suggestion produces names that may be memorable (that’s somewhat debatable, but let’s assume they can be), but are definitely not meaningful. That is, a they don’t have any meaning in reference to a particular name-object binding.

    We feel a blog post coming on :-)

  9. Oh! It’s possible to have any 4 of 5 properties. That’s way cool.

  10. For those wondering, it took me a moment to get this:

    Human meaningful, but not human memorable (assuming all others):

    “google.com – a popular search engine with sometimes funky pictures, founded by … (Sufficiently different and unique text)”

    Human memorable, but not meaningful (assuming all others):

    “apple-turnip-swamp-movie-teacher”

    One person suggested arbitrary for the combination.

    I would argue that your name is not fully meaningful, and hence that is the point it is removing.

 
 


× 9 = twenty seven

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us