technology from back to front

Archive for November, 2007

Emacs in MacOS X 10.5 Leopard

If you’ve upgraded to Leopard, and you’re an Emacs user, you may have found that typing emacs in a term no longer works — you get “Fatal malloc_jumpstart() error”. This will be the case if you’re using something like Fink or Darwin Ports, but it may also be the case if the upgrade didn’t quite work out.

The builds in Fink and Darwin Ports are currently broken; but otherwise, you have a few options to get Emacs back:

Use the Emacs that comes with Leopard. All this solution involves is making sure that any custom binary isn’t on the path — remove or move /sw/bin/emacs or /opt/local/bin/emacs (or possibly even /usr/local/bin/emacs) and make sure /usr/bin/emacs is the one installed with Leopard — /usr/bin/emacs --version should tell you the version is “22.1.1″.

The problem with the Emacs shipped with Leopard is that it’s built with no X support (or Carbon, for that matter)(although it does have Carbon support — see below).

Use Aquamacs. Aquamacs has precompiled binaries which work with Leopard. But, beware, it is quite different from Emacs in some ways — if you used Emacs before you used MacOS X, you probably won’t like it.

Compile your own Emacs. This is what I did. The stock 22.1 source distribution won’t build in Leopard; it gives up the ghost with a message “Assertion failed: (filesize <= ranges->size)”. However there is an unofficial patch, which worked for me. I’m going to keep my custom emacs until fink or ports catches up.

In summary (adapted slightly from here):

curl -O
tar xvzf emacs-22.1.tar.gz
cd emacs-22.1
curl -O
patch -p0 < emacs-leopard-unexec.patch
./configure --without-carbon --with-x --prefix=/usr/local
sudo make install

I put the result in /usr/local and made sure /usr/local/bin was on my path before any of the other possibilities, while I wait for Fink or Ports to catch up.

UPDATE: Thanks to masklinn's comment below, I can present another option which I'd overlooked:

Use carbonised Emacs. If you want to run Emacs in its own window, but don't want X11, there is a ready-made .app wrapper in the source distribution. The disk image, which masklinn suggests, has patches and some very useful bundled Elisp packages, like nxml-mode.

You can also use the emacs distributed with Leopard with the .app wrapper, as jfb points out (see the second comment below); or, compile your own carbonised Emacs, similar to what I did above: change the configure invocation

./configure --with-carbon --without-x
cp -R mac/ ./
cp src/emacs
sudo mv /Applications/

Either way gives you an Emacs that behaves like a MacOSX app and doesn't need X11 to run.


Squaring Zooko’s Triangle, part two

A few days ago, in Squaring Zooko’s Triangle, I sketched out a proposal for a new naming scheme which to a limited extent achieved all of the contradictory-seeming aims set out in Zooko’s Triangle. Having discussed the essay with a few folk since, it seems like it might be worthwhile trying to clarify the ideas behind it.

(Updated 2012-04-27 to use Tahoe-LAFS’s superior terminology, and correct some related errors.)

A brief aside: Zooko’s Tetrahedron

A brief aside before we start: Zooko’s Triangle is actually a tetrahedron. There are four desirable properties of a name, and we know how to achieve any three of these four. You hear it discussed as a triangle because authors take one of the four points as so essential and non-negotiable that it isn’t even discussed, but different authors take different points as essential so it can be confusing to read multiple accounts of it. The four desiderata are

  • Specificity: one party has a better claim to the name than any other, and can establish that claim.
  • Memorability: you can remember the name if you see it on the side of a bus.
  • Transferability: you can give the name to someone else, and it means the same to them as it does do you
  • Decentralization: there is no central body to whom ultimate authority over the entire namespace is delegated

Sacrifice any one of these and there’s an existing system that provides the other three.

  • Sacrifice specificity and you get ordinary names and nicknames. I can claim to be “ciphergoth” with no authority, and you can just take in on trust from me if you like.
  • Sacrifice memorability and you get PGP key fingerprints – I can generate a key with a fingerprint any time I like, and if the system is secure, no-one else can for example sign a message with a key matching that fingerprint.
  • Sacrifice transferability, and I can locally bind a memorable name to a public key, as for example my ssh cache does.
  • Sacrifice decentralization and you get domain names, as maladministered by ICANN/Verisign.

In this framework, the [old link: ] original article on Zooko’s Triangle assumes transferability is non-negotiable, Clay Shirky’s essay assumes that specificity is non-negotiable. Petnames assumes that authoritylessness is non-negotiable, and describes a system that tries to achieve all three goals by switching seamlessly between the three different kinds of names that remain.

Zooko Shirky Steigler Tahoe-LAFS
human-readable memorable memorable memorable
decentralized non-political - decentralized
secure - securely unique specific
- global global transferable

Word lists

Now back to my proposal, which attempts to achieve all four. This proposal combines three ideas towards this goal.

I definitely don’t claim that what I propose here does away with the need for systems like Petnames, by the way; it has its own disadvantages and is meant only to complement those systems.

We start with PGP key fingerprints, which look like this:

ECA1 0308 8103 0F40 3983 FC7E F971 8B0F 70C1 A697

Specific, decentralized, transferrable, and totally unmemorable.

So, the first idea is to use word lists instead of hexadecimal to improve readability. I went to Diceware for my wordlist, since that wordlist is specifically meant to provide a long list of short memorable words for cryptographic purposes. Perhaps there are other suitable lists – do let me know – but that one is certainly good enough for a demonstration of principle. I stripped out all the ones containing special characters, since they are inconvenient for a variety of reasons and it only makes the list slightly shorter. The resulting list contains 7322 words, so you can encode about 12.8 bits with each word, and 160 bits comes to 13 words:


Proof of work

Well, that’s an improvement, but it’s still a long way from what I’d call “memorable”. It might be memorable if we could cut it down to something much shorter – I aimed for five words:


Now that’s getting close to memorable. It’s also completely insecure. At five words, it’s encoding about 64 bits. If that’s your name, and I want to imitate you, I have to generate and hash 264 public keys before I find one that matches this one; because it’s in bulk the public key generation can be done very cheaply, so the only real cost is the hash. This is comparable to the effort involved in the distributed RC5-64 attack which was successfully completed five years ago and would be perhaps ten times easier today.

Actually, the situation is much worse than this, but I’ll come to that in a moment.

We can foil this attack by increasing the number of words, but at the cost of memorability. Instead we make this attack more expensive in a different way: we increase the cost of generating a name given a public key. One possible technique would be something like key stretching; we could subject the key to an iterated hash with many steps before generating the name from it. If we used for example 224 steps, then generating a name would take 10-20 seconds – acceptable, since generating names is rare – while the difficulty of our attack goes up to 288 steps, well outside the range of even the best funded attacker today.

Unfortunately, this approach also slows down verifying a public key given a name to 10-20 seconds, and that’s much more of a problem because that’s a frequent operation. On a small device like a phone, it would be much worse; again, you can probably sacrifice generating names on your phone but you won’t want to sacrifice verifying them. Instead, the second idea: we use a proof-of-work based scheme. A more complete proposal should probably use one of the more sophisticated proof-of-work schemes out there; I saw a great one recently involving function inversion to force a minimum memory usage on the attacker, increasing the cost of a parallel hardware attack at no cost to real users, but I can’t now find the reference. Again, for proof of concept, I’ve used a pretty simple scheme.

In this scheme, you hash the public key together with a nonce to generate the name, and the combination is only valid if the last 24 bits of the hash are all zero; then you use the first 64 bits to generate the name. You’ll have to try about 224 different nonces before you find a valid one, so name generation takes 10-20 seconds as before, but now you supply the nonce along with the public key to someone verifying your name, and they can verify the name instantly with a single hash function invocation, making sure to check that you’ve done your work and the last 24 bits are zero.

This plan has another advantage too: there’s now more than one name for each public key. That means that if the name the system gives you is a particularly unfortunate choice of words, you can just run it again, wait another 10-20 seconds, and get another one. I suspect this is pretty much an essential property of any such scheme.


We’re not quite finished, because our attacker has another trick up their sleeve: multiple-target attacks. All of the foregoing assumes they’re trying to imitate one person: you. Perhaps they want to imitate either you or your sister? Now every name they generate has two chances of matching – one against your name, and one against your sisters – and so on average it will take them half as long to find a match. What if they’re happy to match any of the 26,000 people who work at the Pentagon? What if they’re happy to match anyone in the world? In a world where public keys are in widespread use by nearly everyone, this could make their attack easier by a factor of 232 – this probably isn’t the most realistic attack scenario, but in crypto it pays to be pessimistic, and this brings the cost of the attack down to a very achievable 256 steps, even given the countermeasures described above.

Again, we could prevent that by making the name longer – another two components pushes the cost back up to over 281. However, we can make a lesser sacrifice of memorability with the third idea: we add a component to the name that you choose. Your name might be a good choice, but you can choose a pseudonym or a handle so long as you think that other people making the same choice will be reasonably rare. We hash this “handle” in along with the nonce and the public key. Now our attacker can’t attack all the names at once – only all the names with the same handle, which hopefully is a much smaller subset. Even if there are a dozen John Smiths working at the Pentagon, the gain for our attacker is vastly reduced – they’re still looking at a work factor of over 284 to hit any of them. Of course our attacker can make up as many names with this prefix as they like, but it doesn’t help them much, because they’ll just end up attacking their own names rather than the names they care about. And this big improvement in security comes at a relatively small price in memorability, since the “handle” is chosen to be meaningful to both sender and recipient.

You can improve memorability further by using a mnemonic. I’ve been able to memorize the randomly generated name I used as an example by imagining a square meter pool of oil dripping slowly onto the opening credits of a film: area-fluid-above-movie-start.

I’d like to get some feedback and look into using a more sophisticated proof-of-work scheme before I propose an implementation of this. However, one further improvement comes to mind: it might make life a lot easier if I used a URI-like syntax. Something like this:

zooko://area-fluid-above-movie-start/Paul Crowley
Paul Crowley

Joe Armstrong on multicore

Joe Armstrong, the inventor of Erlang, paid LShift a visit on Friday. He had kindly agreed to give a short talk to a few of my colleagues. We ended up cramming about twenty people into our meeting room, listening to Joe explain the implications of multicore CPU architectures for programming language design. There were lots of questions from the audience and some interesting discussions, keeping us a all occupied for nearly two hours. Matthew Sackman has posted his thoughts on some of the key points.

We covered a range of other topics too, including Joe’s recent idea of an Erlang/OTP Service Pack. This will be on a shorter release cycle than the main Erlang/OTP distribution, allowing changes and new features to be brought to a wide audience more quickly, with the best bits, hopefully, eventually making it into OTP.


Astral Plane characters in Erlang JSON/RFC4627 implementation

Sam Ruby examines support for astral-plane characters in various JSON implementations. His post prompted me to check my Erlang implementation of rfc4627. I found that for astral plane characters in utf-8, utf-16, or utf-32, everything worked properly, but the RFC4627-mandated surrogate-pair “\uXXXX” encodings broke. A few minutes hacking later, and:

Eshell V5.5.5  (abort with ^G)
1> {ok, Utf8Encoded, []} =
2> xmerl_ucs:from_utf8(Utf8Encoded).
3> rfc4627:encode(Utf8Encoded).

Much better.

You can get the updated code from


Intel Low Latency Labs Tests RabbitMQ

Intel Low Latency Labs this week announced [old link:] the results of tests they performed on a messaging system built on the RabbitMQ implementation of AMQP, built by LShift and partners CohesiveFT, on Intel hardware running CohesiveFT’s Elastic Servers. This configuration achieved a throughput of 1.3 million messages a second.



[Extract From]

London, United Kingdom, Nov 14, 2007 – The quest for greater speed and lower latency trading in the financial services sector is set for a major boost due to a new initiative from Intel® Solution Services, the Intel Low Latency Trading lab. Using non-proprietary, standards-based technologies is already known to reduce maintenance and integration costs. However, solutions architects at Intel’s Low Latency Lab in London, have shown that optimising financial messaging for Intel server technologies such as Intel® I/O Acceleration Technology 2 (Intel I/OAT2) is also capable of delivering greater trading performance on major financial messaging technologies including Options Price Reporting Authority (OPRA) feed, Financial Information eXchange (FIX) Protocol Limited’s FAST data compression and the Advanced Message Queuing Protocol (AMQP) protocol over TCP/IP for message transport.

Tests were conducted with Pantor Engineering for market data expertise, Cohesive FT’s Elastic Servers running a RabbitMQAMQP cluster, Redhat RHEL5, Endace time-stamping latency measurement tools on the latest Intel® quad-core server platforms including a 16-core Intel® Xeon® Processor 7300 series processor-based machine, as well as an 8-core server based on Intel’s newly released Intel Xeon Processor 5400 series.

Using industry standard hardware, an OPRA feed input rate of 1.3 million messages per second was distributed and replicated to four concurrent subscribers, via an AMQP 0-8 compliant broker cluster running on a single Intel Xeon Processor 7300 based server, with measured mean latency below 1.5ms. Intel I/OAT2 interrupt throttle settings gave a circa 5% uplift to performance, with minor effect on CPU load.

is a product implementing AMQP to deliver a platform-neutral, complete, unified and highly scalable business messaging. It is developed and commercially supported by Rabbit Technologies Ltd, a joint venture between CohesiveFT and LShift Ltd. Additional information on the product and its management and integration tools is available here and, for enquiries, by email sent here.

delivers Elastic Server On Demand, a product that delivers value by reducing the complexity of technology by enabling rapid provisioning and management of applications without platform lock-in. Leveraging virtualisation, open standards and its partner ecosystem working with Intel, CohesiveFT opens up technology choices for the front office and lower barriers to market participation.

, the world leader in silicon innovation, develops technologies, products and initiatives to continually advance how people work and live. Additional information about Intel is available at the Intel Pressroom and Intel Blogs


Squaring Zooko’s triangle

(Note: a more detailed explanation and justification can be found in Squaring Zooko’s Triangle, part two)

Zooko’s Triangle [old link: ] encompasses the three properties you’d like a name to have:

  • secure given the name, you can securely find the person it belongs to in order to, say, check their cryptographic signature on an email
  • memorable you should be able to see a name on a poster, and recognise it later when you come to use it
  • decentralized there should be no central naming authority who can screw things up the way that ICANN/Verisign inevitably do

The thesis behind the triangle is that of these three desirable properties, you can have any two. If you’re not familiar with Zooko’s Triangle, Clay Shirky‘s essay is the clearest explanation (he independently rediscovered the same idea), and Marc Stiegler‘s essay on petnames illustrates it and discusses one solution.

In this essay I propose a kind of name which is entirely decentralized, reasonably secure, and at least somewhat memorable. They look like this:


The name is in two parts: the handle h (“Paul_Crowley”) and the tag t (“area-fluid-above-movie-start”). Associated with a particular name will be a public key k and a 32-bit nonce n. I can print the name on my business cards, and publish the key and nonce on my website, and you can securely verify using the nonce that the key belongs to the name.

The scheme is based on a fixed, globally shared word list. I got a possible list using the Beale list from Diceware, which is designed for a different but related purpose, and removing all the words that contained special characters; the resulting list contains 7322 words. To fill out the details on this scheme, we’ll also need to fix a hash function and an input encoding for the hash function; SHA-256 and SPKI-like S-expressions seem plausible choices.

Given a name h:t as above, a key k, and the nonce n as above, you verify the match as follows: hash three of these together to generate v = H((h, k, n)). Now verify that the last 23 bits of v are all zero. Finally, treat the first 65 bits as five groups of 13 bits, and use each group of 13 bits as an index into the word list; ensure the resulting 5 words match the five words in the tag.

To generate yourself a name, you’ll need a public key k and a handle h; you should choose a handle that you don’t think many other people will have chosen. Repeatedly try lots of random values for the nonce n until you find one which results in a hash which is zero in the last 23 bits, and for which each of the five groups of 13 bits yields an index small enough that it doesn’t point off the end of the word table (since they can yield 8192 different values but there are only 7322 words in the table). On a standard PC it should take about 10-20 seconds to find a suitable nonce. Of course, you may not like the resulting tag (“paulcrowley:hitch-marty-grail-mild-coot”) in which case you can keep searching for a nonce you like better.

It’s clear that this is decentralized, and you can assess the memorability for yourself; the example I’ve used is randomly generated, though I had a few goes to find one I liked. How is it for security?

Let’s suppose our attacker wants to pretend to be me. They must find a (h, k’, n’) such that H((h, k, n)) matches H((h, k’, n’)) in 88 bits: the 65 bits at the start that make up the tag, and the 23 zero bits at the end. Assuming there’s no way of exploiting some useful property of the hash function, this will take about 288 trials, far outside the reach of even the best-funded attacker.

Our attacker might like to avail themselves of a multiple-target attack (as I’ve described earlier). However, they can only simultanously attack all those with the same handle, which if you’ve chosen your handle well will be few.

This makes sacrifices on both the memorability and security fronts, but it’s the closest thing I’ve seen to something that achieves all three goals.

(Continued in Squaring Zooko’s Triangle, part two)

Paul Crowley

NDocProc updated for C# 2.0 with Generics

NDocProc, our small and simple .NET javadoc-like documentation generator, has been updated for C# 2.0 and Generics.

Changes since the previous announcement include:

  • Support for generics, nested types, arrays, delegates, events, and the intersection of them all.
  • Now requires .NET 2.0 or Mono 2.0 to build.
  • Support for slightly more of the XML documentation language used in .NET documentation comments.
  • Better formatting of namespace pages, and summaries on the main index page.

The project homepage is here. You can download a zip of the latest version, or use mercurial to check out the repository:

hg clone



You are currently browsing the LShift Ltd. blog archives for November, 2007.



2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us