technology from back to front

Archive for June, 2006

Thermostat code release

I’ve released my simple fan control program described in this entry (see also part one).

THIS CODE MAY MELT YOUR CPU – download only if you plan to read it, test it, and/or hack on it. The license makes it clear that it comes with no warranty.

I’ve already received an interesting email from Mark M Hoffman in reply to my post to the mailing list announcing it, drawing my attention to PID control loops. Looks like it could be a worthy avenue of investigation. I’ve replied with a description of one issue I see with applying a PID controller in this domain.

Paul Crowley

‘Code as Data’ != Closures

Today I googled for “code as data”. The first hit that came up was [this](,+or+closures), a tutorial on [Groovy]( that cheerfully proclaims that what is meant by “code as data” is *closures*! No, no, NO!!!!

“Code as data”, aka “code is data” signifies the ability to *manipulate*
code, i.e. to construct and take apart programs. At the most primitive level that can be accomplished
by representing source code as strings, which is possible in most programming languages. Going beyond that, many
programming languages define an AST data structure, with a parser and
pretty printer, that allows manipulation of code in a more structured
manner, enforcing most, if not all, syntactic constraints of the

However, neither of these approaches fully capture what is meant by
“code is data” in the Lisp tradition. There code really *is*
data. There is no special AST data type and associated parser and
pretty printer. Instead all programs are represented in terms of the
ordinary data types of the language, such as symbols and lists. The
concrete syntax of programs is subsumed by the standard external
representation of data.

None of this has anything to do with closures.


Making the client do all the work

This paper proposes to reduce the workload of SSL servers by making the clients carry as much of the crypto-related load as possible. I think it’s possible to do even better.

Key agreement: In the above, the server only has to do an RSA public key operation, which is cheap if the exponent is low (eg three). However, we can do even better (and have a stronger security bound too) by using the Rabin operation – modular squaring – instead. This is more than twice as fast as the RSA operation with exponent three. Normally, Rabin encryption is slowed down by certain steps that are needed to handle the fact that modulo a large semi-prime, most numbers that have square roots have four of them, and the recipient has to know which one you mean. However, modern KEM schemes greatly reduce this cost, and Rabin-KEM encryption is just about the fastest public key operation I know of, with the exception of certain probabalistic signature checking schemes.

Signatures: a trick has been missed here. Modern “one-time” signature schemes (eg “Better than BiBa“) can actually sign many messages before the private key must be discarded for security, which in an online/offline signature scheme greatly reduces the number of documents to be signed. For even greater efficiency, a hash tree can be used to sign many one-time keys simultaneously. At the cost of imposing a small latency on all clients, we can even discard the one-time signatures, avoiding a patent, and directly use hash trees; as many clients try to connect, the server can place the documents to be signed in a hash tree and sign them all with one operation. This scheme scales very nicely: the server performs its public key operation at a constant rate of, say, ten per second, and no matter how many clients are trying to connect these signatures will serve to authenticate the server to them all. The clients may have to wait an extra tenth of a second for the connection to complete, but this cost will be small in the cost of connecting to a busy server.

Client puzzles I’m not sure I understand why mixing up the client puzzle step and the public key generation step is beneficial.

With this scheme, the server only has to do one modular squaring per client – and even that only when the client has proven its worth by solving a client puzzle. I wonder if it’s possible to do even better?

Paul Crowley

Thermostat defeat

When I started on this, I thought I’d be able to dash off a script to keep my CPU fan quiet in a few hours. I’ve just spent far too much of this weekend obsessively hacking on it and testing it, and after creating a tool of great sophistication, I have basically given up in defeat. I’m now using a thermostat-like approach; either the fans are on full or on minimum, nothing in between.

First installment of the saga

Why didn’t I just do that in the first place? Because it’s quite distracting when the fan suddenly gets louder or quieter, so if your system is regularly switching between the two to keep itself at the right temperature, it could be more irritating than having the fan on full blast all the time. It’s much more desirable to find just the right constant speed to maintain temperature, and stay there.

So I wrote a proper feedback loop which measures the temperature and makes constant small adjustments to the fan speed to keep it in range. I even wrote a sophisticated and effective scheme for mapping fan power to fan speed, which learned what fan power to use to reach a specific speed, which learned where the fan stopped and avoided it, and learned how to get it restarted if it stopped. But I’ve abandoned that now.

The first problem with this approach is that when you change the fan speed, it takes too long to find out what the result will be. As I explained in the previous article, for a given workload and fan speed the system will eventually find an equilibrium temperature. But it turns out that it can take several minutes to get close to that equilibrium temperature. Only when you’ve reached equilibrium can you really know what adjustment you should be making to get the fan speed right next. If you try to make the adjustments early, you’ll get exactly the loop I thought wouldn’t be a problem in the previous article: the fan speed will zoom from too high to too low and back.

If the fan temperature is too high, you can’t afford to wait minutes; the CPU could be damaged. So to do this safely, you have to start with the fan at maximum, then very slowly bring it down until the CPU temperature reaches the target. If the temperature goes above a safe threshold at any point, you must then turn the fans on to maximum and start all over again. If you manage to avoid this fate, you should find the correct fan speed within about quarter of an hour.

Unless, of course, the CPU workload changes during that time. This brings us to the second problem with this approach: CPU workloads generally change much faster than that. Long before you’ve adjusted for the current workload, the system will have moved to another.

Actually, that’s not quite true; on my system at least there are two circumstances where the same workload is sustained for long periods of time. That is when it is basically idle, and when it is working flat out. When it is basically idle, even the slowest fan speed is enough to keep the CPU at well within specified temperature; in fact, with CPU frequency scaling it never goes above 41 C. For these purposes decoding video (ie watching TV) seems to count as “basically idle”. When it is working flat out, the fan needs to work flat out to keep it at 61 C, which is as high as I’m really happy to go.

Thus even if the script could magically determine the exact right fan speed for the work the system is doing and skip to it, it would still usually be shifting straight from ticking over to flat out as the workload changed. This is the third problem; at least on my system workloads calling for a speed which is neither minimum nor maximum seem never to happen at all.

Once I’d painfully determined all this, it was clear that my amazingly sophisticated script could be replaced by a simple thermostat with no real loss in function. If the temperature goes above the high threshold, the fan kicks into full power. Once it drops below the low threshold, it spins down to its minimum speed. It turns out that this desperately simple scheme will do the right thing in all the circumstances I’ve observed so far.

I also got rid of the wrapper/script structure I used to have. The thermostat is barely more complex than the wrapper – simpler once you take away the job of managing a child process – so I did away with the Python script and wrote the whole thing in C, which also means you don’t have to install Python to use it. It’s currently under 250 lines of code. After a little bit of cleaning up, this is what I will submit to lm-sensors.

If I ever observe dithering – frequent spinning up and slowing down – on any system, I can start to think about how to combat it. I currently think that if the more sophisticated approach is to be made to work, it must directly observe the CPU load – and, for systems with CPU frequency scaling, the current CPU frequency – in order to know how to respond to it. It will have to keep a record of how the temperature changes under different loads, at different CPU frequencies, with different fan speeds, at different CPU temperatures and different case temperatures, and from each datapoint infer a constantly-updated model of what the right fan speed for the current conditions is. It will have to cope with the imprecisions of all the measurements – one big problem is that at least on my system temperature is measured in whole degrees, giving us only the coarsest-grained information on how it is changing – to ever refine its picture of how to give the best results. And it must do so reliably, without endangering the CPU with dangerously high temperatures and while minimizing the speed of CPU temperature changes to prolong its life.

I might come back to it one day if the need ever arises. For now my 250-line thermostat will do me fine.

Paul Crowley

Static analysis of Erlang communication

I had a brief email exchange with the developers of
[Dialyzer](, the
static analyzer (some might call it a type checker) for
[Erlang]( programs. Currently Dialyzer only
performs analysis on the functional fragment of Erlang and I was
enquiring whether to extend that to handle communication. That would
allow the detection of basic input/output mismatches, e.g. when a
message is sent to a process that does not match any of the patterns
it is willing to receive.

Going further, one might be able to employ the various techniques
developed for process algebras to reason about the concurrent
behaviour of Erlang programs and, for example, detect deadlocks and
enforce information flow security properties. A good example of such a
tool is [TyPiCal]( It
would be amazing to have something like that for Erlang. After all,
what makes Erlang interesting is not the functional programming aspect
currently checked by Dialyzer, but its support for concurrency,
distribution and fault-tolerance. It is incredibly difficult to
correctly implement systems that involve the latter. If there is any
area of programming in which we want the help of static analysis then
this is it!

Anyway, it turns out that there are no immediate plans to extend
Dialyzer in that direction. However, I was pointed at some related
research that I had hitherto been unaware of: [Karol Ostrovský's PhD
thesis](, which in
Part II describes the

* sound instantiation of Kobayashi’s generic type system for the
pi-calculus to session types,

* extension of session types to multi-session types (which, afaict,
handle sessions that involve *asynchronous* comms, and servers that
handle multiple sessions *without spawning*),

* application of multi-session types to type check communication of
Erlang processes.

Overall this looks like a promising attempt at constructing a
process-algebra-based type system that is decidable and yet expressive
enough to reason about non-trivial real-world protocols (IMAP4 is used
as an example). The theory behind it seems to be quite involved, but
that could just be due to the presentation format – a thesis rather
than a paper. It will be interesting to see whether this research is
carried any further and eventually materialises in tools for Erlang.


E4X: I want my S-expressions back

E4X is a new ECMA standard (ECMA-357) specifying an extension to ECMAScript for streamlining work with XML documents.

It adds objects representing XML to ECMAScript, and extends the syntax to allow literal XML fragments to appear in code. It also supports a very XPath-like notation for use in extracting data from XML objects. So far, so good – all these things are somewhat useful. However, there are serious problems with the design of the extension.

If E4X objects were real objects, if there were a means of splicing a sequence of child nodes into XML literal syntax, and if E4X supported XML namespace prefixes properly, most of my objections would be dealt with. As it stands, the overall verdict is “clunky at best”.

These are my main complaints:

* It doesn’t do anything like Scheme’s unquote-splicing, and so using E4X to produce XML objects is verbose, error-prone and dangerous in concurrent settings.

There seems to be no way of splicing in a sequence of items – I’d like to do something like the following:

  function buildItems() {
   return [<item>Hello</item>,
   var doc = <mydocument>{buildItems()}</mydocument>;

and have doc contain


What actually results is the more-or-less useless


The closest I can get to the result I’m after is

   function buildItems(n) {
      n.mydocument += <item>Hello</item>;
   n.mydocument += <item>World!</item>;
   var doc = <mydocument></mydocument>;

* It’s full of redundant redundancy – it’s as verbose as XML, when you can do so much better.

* There’s no toXML() method (or similar) for use in papering over the yawning chasm between the XML objects and the rest of the language: you can’t even make a Javascript object able to seamlessly render itself to XML.

* The new types E4X introduces aren’t even proper objects – they’re a whole new class of primitive datum!

* Because they’re not proper objects, you can’t extend the system. You ought to be able to implement to an interface and benefit from the language’s XPath searching and filtering operations. E4X is so close to offering a comprehension facility for Javascript, but it’s been short-sightedly restricted to a single class of non-extensible primitives.

* You can’t even construct XML tags programmatically! If the name of the tag doesn’t appear literally in your Javascript code, you’re out of luck (unless you resort to eval…) [[Update: I was wrong about this - you can write <{expr}> and have the result of evaluating expr substituted into the tag.]]

* E4X XML objects have no notion of namespace prefixes (which are required for quality implementations of XPath and anything to do with XML signatures). Prefixes only turn up in the API as a means of producing (namespaceURI,localname) pairs. This is actually how it should be, but because there’s already broken software out there
that depends on prefix support, by not supporting prefixes properly you preclude ECMAScript+E4X from being used for XML signatures or ECMAScript-native XPath implementations.

In my opinion, E4X violates several programming language design principles: most importantly, those of regularity, simplicity and orthogonality, but also preservation of information, automation and structure. SXML, perhaps in combination with eager comprehensions, provides a far superior model for producing and consuming XML. Sadly, there’s no real alternative for ECMAScript yet – we’re limited either to library extensions, or to using the DOM without any syntactic or library support at all.


Bruce J. MacLennan’s Programming Language Design Principles

As far as I can tell, no-one on the web has yet summarised in a single page all of the design principles MacLennan develops in his excellent Principles of Programming Languages (2nd edition, 1986, ISBN 0-03-005163-0): so here they are. I hope they’re as useful to others as they have been to me.

* Abstraction: Avoid requiring something to be stated more than once; factor out the recurring pattern.
* Automation: Automate mechanical, tedious, or error-prone activities.
* Defense in Depth: Have a series of defences so that if an error isn’t caught by one, it will probably be caught by another.
* Information Hiding: The language should permit modules designed so that (1) the user has all of the information needed to use the module correctly, and nothing more; and (2) the implementor has all of the information needed to implement the module correctly, and nothing more.
* Labeling: Avoid arbitrary sequences more than a few items long. Do not require the user to know the absolute position of an item in a list. Instead, associate a meaningful label with each item and allow the items to occur in any order.
* Localised Cost: Users should only pay for what they use; avoid distributed costs.
* Manifest Interface: All interfaces should be apparent (manifest) in the syntax.
* Orthogonality: Independent functions should be controlled by independent mechanisms.
* Portability: Avoid features or facilities that are dependent on a particular machine or a small class of machines.
* Preservation of Information: The language should allow the representation of information that the user might know and that the compiler might need.
* Regularity: Regular rules, without exceptions, are easier to learn, use, describe, and implement.
* Security: No program that violates the definition of the language, or its own intended structure, should escape detection.
* Simplicity: A language should be as simple as possible. There should be a minimum number of concepts, with simple rules for their combination.
* Structure: The static structure of the program should correspond in a simple way to the dynamic structure of the corresponding computations.
* Syntactic Consistency: Similar things should look similar; different things different.
* Zero-One-Infinity: The only reasonable numbers are zero, one, and infinity.


Overview of Javascript modes for Emacs has a nice roundup of the (apparently only) four javascript-mode implementations for Emacs. I went for number three, Karl Landström’s javascript.el, and it’s been working very well.


Adapting C3 Linearization for Java

One of the interesting issues in
implementing dynamic dispatch for
is that the basic C3 linearization algorithm isn’t a very good
fit for the complexities of Java’s subtyping. (Note: the following
paragraphs rely on the reader having a basic understanding of the
details of C3 linearization.)

Java lists a class’s implemented interfaces separately from its
superclass. C3 requires, for each class, a list of direct superclasses
as input – so to adapt it for use with Java, we have to choose how to
combine each class’s superclass with its implemented interface list:
for instance, the superclass could be placed at the beginning or at
the end of the interface list.

In Java any interface is assignable to Object. So for C3 to make
sense, all interfaces which have no super-types have Object as a
super-type. If Object is in the super-type list of a type, it must be
the last thing – otherwise linearization will always be impossible. So
the most obvious thing to do is to include the super-class last in the
list of super-types.

This mostly works, but for the way collections from
java.util are implemented: The various abstract
collections implement their corresponding interface, but the actual
implementations don’t directly implement the corresponding interface,
so for example AbstractSet implements Set,
and HashSet extends AbstractSet, but does
not implement Set directly. This is a common pattern.

If, then, you always choose to put the super-class at the end of the
list of super-types while performing linearization for C3, this
results in an inconsistent linearization for all of Java’s built in

So I ended up doing the following by default: For any super-class
other than Object, the super-class goes first in the list of
super-types. If the super-class is Object, it gets pushed to the
end. This works in an intuitive way in lots of cases, and the
implementation supports pluggable ordering, should you need to do
something different.


Multimethods for Java

Dynamic dispatch is a mechanism for selecting a method based on the
runtime types of the parameters supplied. Java dispatches instance
methods dynamically, using the runtime type of the receiver to choose
the code to invoke and ignoring the types of the other parameters
(just like Python and many other object-oriented languages). This is
called single dispatch. Unfortunately, Java’s dispatching is limited
in two important ways: it doesn’t allow class extensions, and
it doesn’t support multiple dispatch, as implemented in many
other object-oriented languages.

I have written some code which uses reflection and proxy generation to
conveniently implement dynamic multiple dispatch for Java. It uses href=””>C3
linearization to determine the method to invoke. This algorithm
was originally devised for Dylan. You can get the distribution here.

This implementation supports subtyping of both arrays and primitive
Java types such as int, byte,
char, but does not yet support Java 1.5′s generics. The
subtyping relation for arrays and primitive types is based on Java’s
notion of assignability – see the documentation for details.

Here’s a trivial example of using the dynamic dispatch library:


  // define an interface
  public interface NumberPredicate {
      public boolean evaluate(Number n);

  // implement it for some argument types
  public class Exact {
      public boolean evaluate(Float f)  { return false; }
      public boolean evaluate(Double f) { return false; }
      public boolean evaluate(Number n) { return true; }

  // create a dynamic dispatcher
  NumberPredicate exact = (NumberPredicate)
    DynamicDispatch.proxy(NumberPredicate.class, new Exact());

Now, code making use of exact can call the
evaluate method with a Number instance, and
the DynamicDispatch proxy that is backing the
NumberPredicate interface will find the most appropriate
method on Exact to invoke based on the runtime type of
the argument to evaluate. So, for instance:

  exact.evaluate(new Float(12.3)); // returns false.
  exact.evaluate(new Integer(34)); // returns true.

There are several reasons having dynamic dispatch (and multiple
dispatch) is useful when programming for Java:

* If you want to extend an existing set of classes (for instance, to
add an aspect – see below), normally you’d create an interface
encapsulating your new feature, and make all the classes in the set
implement it. If you are not able to modify the classes, one
approach is to create wrapper classes for each, and somehow choose a
wrapper class at runtime based on the type of the object you’re
working with. You might implement this by
myObject.getWrapperClass() – but wait! This makes
getWrapperClass a new method that needs to be added:
precisely the problem you set out originally to solve.

Dynamic (single) dispatch helps you out here by conveniently
automating the required wrapping and method selection.

* You might also want to use multiple dispatch, dispatching on the
types of multiple arguments. Neither the Java language nor the Java
virtual machine supports multiple dispatch. In some cases
overloading suffices, but in many it does not.

Dynamic multiple dispatch reinterprets Java’s notion overloading as
actual multimethod dispatch. Syntactically, you’re writing
overloaded methods – but using DynamicDispatch, the semantics are
those of full multiple-dispatched generic functions.

Dynamic dispatch is useful for adding aspects. For example, you might
use dynamic dispatch to write a Java object pretty printer, or custom
serializer. I’ve employed it for writing a general equality function
which is independent of the object’s own implementation of equals,
which I find useful in unit tests. I’ll write about that in a later

If this kind of thing interests you, MultiJava, a compiler for a multiply-dispatched variant of Java, might be worth a look.




You are currently browsing the LShift Ltd. blog archives for June, 2006.



2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us