Archive for June, 2006
I’ve released my simple fan control program described in this entry (see also part one).
THIS CODE MAY MELT YOUR CPU - download only if you plan to read it, test it, and/or hack on it. The license makes it clear that it comes with no warranty.
I’ve already received an interesting email from Mark M Hoffman in reply to my post to the mailing list announcing it, drawing my attention to PID control loops. Looks like it could be a worthy avenue of investigation. I’ve replied with a description of one issue I see with applying a PID controller in this domain.
June 28th, 2006
Paul Crowley
Today I googled for “code as data”. The first hit that came up was this, a tutorial on Groovy that cheerfully proclaims that what is meant by “code as data” is closures! No, no, NO!!!!
“Code as data”, aka “code is data” signifies the ability to manipulate
code, i.e. to construct and take apart programs. At the most primitive level that can be accomplished
by representing source code as strings, which is possible in most programming languages. Going beyond that, many
programming languages define an AST data structure, with a parser and
pretty printer, that allows manipulation of code in a more structured
manner, enforcing most, if not all, syntactic constraints of the
language.
However, neither of these approaches fully captures what is meant by
“code is data” in the Lisp tradition. There, code really is
data. There is no special AST data type and associated parser and
pretty printer. Instead all programs are represented in terms of the
ordinary data types of the language, such as symbols and lists. The
concrete syntax of programs is subsumed by the standard external
representation of data.
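As a rough illustration of the idea, here is a minimal Java sketch in which a program is nothing more than a nested list of ordinary values. Java is not Lisp, so this only approximates the point: the class name CodeAsData, the use of strings as stand-ins for symbols, and the tiny two-operator language are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: a program is a plain nested List,
// with strings standing in for Lisp symbols.
final class CodeAsData {
    // Evaluate a tiny arithmetic language: ["+", a, b, ...] or ["*", a, b, ...].
    static int eval(Object expr) {
        if (expr instanceof Integer) {
            return (Integer) expr;
        }
        List<?> form = (List<?>) expr;
        String op = (String) form.get(0);
        if (!op.equals("+") && !op.equals("*")) {
            throw new IllegalArgumentException("unknown operator: " + op);
        }
        int acc = op.equals("*") ? 1 : 0;
        for (Object arg : form.subList(1, form.size())) {
            int value = eval(arg);
            acc = op.equals("*") ? acc * value : acc + value;
        }
        return acc;
    }

    // A program transformation written with ordinary list operations:
    // replace every "+" symbol with "*" throughout the expression.
    static Object plusToTimes(Object expr) {
        if (!(expr instanceof List)) {
            return "+".equals(expr) ? "*" : expr;
        }
        List<Object> out = new ArrayList<>();
        for (Object part : (List<?>) expr) {
            out.add(plusToTimes(part));
        }
        return out;
    }

    public static void main(String[] args) {
        Object program = List.<Object>of("+", 1, List.<Object>of("*", 2, 3));
        System.out.println(eval(program));              // 1 + (2 * 3) = 7
        System.out.println(eval(plusToTimes(program))); // 1 * (2 * 3) = 6
    }
}
```

Note that no parser, pretty printer or AST type is involved: the rewrite is just a walk over lists.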
None of this has anything to do with closures.
June 28th, 2006
matthias
This paper proposes to reduce the workload of SSL servers by making the clients carry as much of the crypto-related load as possible. I think it’s possible to do even better.
Key agreement: In the above, the server only has to do an RSA public key operation, which is cheap if the exponent is low (eg three). However, we can do even better (and have a stronger security bound too) by using the Rabin operation - modular squaring - instead. This is more than twice as fast as the RSA operation with exponent three. Normally, Rabin encryption is slowed down by certain steps that are needed to handle the fact that modulo a large semi-prime, most numbers that have square roots have four of them, and the recipient has to know which one you mean. However, modern KEM schemes greatly reduce this cost, and Rabin-KEM encryption is just about the fastest public key operation I know of, with the exception of certain probabilistic signature checking schemes.
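To make the two public operations concrete, here is a small Java sketch. It is not a KEM and has no padding or key generation; the class and method names are invented for the example, and the modulus 77 = 7 × 11 is a toy value chosen only so the four square roots can be found by brute force.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the raw public operations only.
final class RabinVsRsa3 {
    // RSA with exponent three: one modular squaring plus one modular multiplication.
    static BigInteger rsaExponentThree(BigInteger m, BigInteger n) {
        BigInteger squared = m.multiply(m).mod(n);
        return squared.multiply(m).mod(n);
    }

    // Rabin: a single modular squaring.
    static BigInteger rabinSquare(BigInteger m, BigInteger n) {
        return m.multiply(m).mod(n);
    }

    // Brute-force every square root of c modulo a toy-sized n.
    static List<Integer> squareRoots(int c, int n) {
        List<Integer> roots = new ArrayList<>();
        for (int x = 0; x < n; x++) {
            if ((x * x) % n == c) {
                roots.add(x);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(77); // 7 * 11, far too small for real use
        System.out.println(rabinSquare(BigInteger.valueOf(9), n)); // 4
        System.out.println(squareRoots(4, 77));                    // [2, 9, 68, 75]
    }
}
```

The four roots of 4 modulo 77 show the ambiguity that plain Rabin decryption has to resolve.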
Signatures: a trick has been missed here. Modern “one-time” signature schemes (eg “Better than BiBa”) can actually sign many messages before the private key must be discarded for security, which in an online/offline signature scheme greatly reduces the number of documents to be signed. For even greater efficiency, a hash tree can be used to sign many one-time keys simultaneously. At the cost of imposing a small latency on all clients, we can even discard the one-time signatures, avoiding a patent, and directly use hash trees; as many clients try to connect, the server can place the documents to be signed in a hash tree and sign them all with one operation. This scheme scales very nicely: the server performs its public key operation at a constant rate of, say, ten per second, and no matter how many clients are trying to connect these signatures will serve to authenticate the server to them all. The clients may have to wait an extra tenth of a second for the connection to complete, but this cost will be small relative to the overall cost of connecting to a busy server.
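The batching idea can be sketched in Java as follows. This is a minimal illustration under several assumptions: SHA-256 as the hash, a one-byte prefix to separate leaves from inner nodes, duplication of the last node on odd-sized levels (a simplification that a production design would want to reconsider), and a non-empty batch. The class and method names are invented for the example. The server would sign only the root; each client receives its document's sibling path and checks it against that signed root.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: one signature over the root covers a whole batch.
final class BatchHashTree {
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                md.update(part);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // levels.get(0) holds the leaf hashes; the last level holds only the root.
    static List<List<byte[]>> build(List<byte[]> documents) {
        List<byte[]> level = new ArrayList<>();
        for (byte[] doc : documents) {
            level.add(sha256(new byte[] {0}, doc));
        }
        List<List<byte[]>> levels = new ArrayList<>();
        levels.add(level);
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left;
                next.add(sha256(new byte[] {1}, left, right));
            }
            levels.add(next);
            level = next;
        }
        return levels;
    }

    static byte[] root(List<List<byte[]>> levels) {
        return levels.get(levels.size() - 1).get(0);
    }

    // Sibling hashes, from leaf level upwards, for the document at the given index.
    static List<byte[]> path(List<List<byte[]>> levels, int index) {
        List<byte[]> siblings = new ArrayList<>();
        for (int depth = 0; depth < levels.size() - 1; depth++) {
            List<byte[]> level = levels.get(depth);
            int sibling = index ^ 1;
            siblings.add(sibling < level.size() ? level.get(sibling) : level.get(index));
            index /= 2;
        }
        return siblings;
    }

    // Client side: recompute the root from the document and its sibling path.
    static boolean verify(byte[] document, int index, List<byte[]> siblings, byte[] signedRoot) {
        byte[] hash = sha256(new byte[] {0}, document);
        for (byte[] sibling : siblings) {
            hash = (index % 2 == 0)
                    ? sha256(new byte[] {1}, hash, sibling)
                    : sha256(new byte[] {1}, sibling, hash);
            index /= 2;
        }
        return Arrays.equals(hash, signedRoot);
    }
}
```

Each client's extra work is one hash per tree level, so the path stays short even for large batches.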
Client puzzles: I’m not sure I understand why mixing up the client puzzle step and the public key generation step is beneficial.
With this scheme, the server only has to do one modular squaring per client - and even that only when the client has proven its worth by solving a client puzzle. I wonder if it’s possible to do even better?
June 26th, 2006
Paul Crowley
When I started on this, I thought I’d be able to dash off a script to keep my CPU fan quiet in a few hours. I’ve just spent far too much of this weekend obsessively hacking on it and testing it, and after creating a tool of great sophistication, I have basically given up in defeat. I’m now using a thermostat-like approach; either the fans are on full or on minimum, nothing in between.
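A thermostat-like controller of this kind can be sketched in a few lines of Java. This is not the released program: the class name, the two thresholds and the gap between them (hysteresis, so the fan does not flap around a single set point) are all assumptions made for illustration.

```java
// Hypothetical sketch of a two-state fan controller.
final class FanThermostat {
    enum Speed { MINIMUM, FULL }

    private final double offBelowCelsius;
    private final double onAboveCelsius;
    private Speed speed = Speed.MINIMUM;

    FanThermostat(double offBelowCelsius, double onAboveCelsius) {
        if (offBelowCelsius >= onAboveCelsius) {
            throw new IllegalArgumentException("off threshold must be below on threshold");
        }
        this.offBelowCelsius = offBelowCelsius;
        this.onAboveCelsius = onAboveCelsius;
    }

    // Feed in the latest temperature reading; returns the speed to apply.
    Speed update(double temperatureCelsius) {
        if (temperatureCelsius >= onAboveCelsius) {
            speed = Speed.FULL;
        } else if (temperatureCelsius <= offBelowCelsius) {
            speed = Speed.MINIMUM;
        }
        // Between the thresholds the previous state is kept.
        return speed;
    }
}
```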
First installment of the saga
June 26th, 2006
Paul Crowley
I had a brief email exchange with the developers of
Dialyzer, the
static analyzer (some might call it a type checker) for
Erlang programs. Currently Dialyzer only
performs analysis on the functional fragment of Erlang, and I was
enquiring whether there were plans to extend that to handle communication. That would
allow the detection of basic input/output mismatches, e.g. when a
message is sent to a process that does not match any of the patterns
it is willing to receive.
Going further, one might be able to employ the various techniques
developed for process algebras to reason about the concurrent
behaviour of Erlang programs and, for example, detect deadlocks and
enforce information flow security properties. A good example of such a
tool is TyPiCal. It
would be amazing to have something like that for Erlang. After all,
what makes Erlang interesting is not the functional programming aspect
currently checked by Dialyzer, but its support for concurrency,
distribution and fault-tolerance. It is incredibly difficult to
correctly implement systems that involve the latter. If there is any
area of programming in which we want the help of static analysis then
this is it!
Anyway, it turns out that there are no immediate plans to extend
Dialyzer in that direction. However, I was pointed at some related
research that I had hitherto been unaware of: Karol Ostrovský’s PhD
thesis, which in Part II describes
- the sound instantiation of Kobayashi’s generic type system for the pi-calculus to session types,
- the extension of session types to multi-session types (which, afaict, handle sessions that involve asynchronous comms, and servers that handle multiple sessions without spawning), and
- the application of multi-session types to type check communication of Erlang processes.
Overall this looks like a promising attempt at constructing a
process-algebra-based type system that is decidable and yet expressive
enough to reason about non-trivial real-world protocols (IMAP4 is used
as an example). The theory behind it seems to be quite involved, but
that could just be due to the presentation format - a thesis rather
than a paper. It will be interesting to see whether this research is
carried any further and eventually materialises in tools for Erlang.
June 26th, 2006
matthias
E4X is a new ECMA standard (ECMA-357)
specifying an extension to ECMAScript
for streamlining work with XML
documents.
It adds objects representing XML to ECMAScript, and extends the syntax
to allow literal XML fragments to appear in code. It also supports a
very XPath-like notation for
use in extracting data from XML objects. So far, so good - all these
things are somewhat useful. However, there are serious problems with
the design of the extension.
If E4X objects were real objects, if there were a means of splicing a
sequence of child nodes into XML literal syntax, and if E4X supported
XML namespace prefixes properly, most of my objections would be dealt
with. As it stands, the overall verdict is “clunky at best”.
These are my main complaints:
It doesn’t do anything like Scheme’s unquote-splicing,
and so using E4X to produce XML objects is verbose, error-prone and
dangerous in concurrent settings.
There seems to be no way of splicing in a sequence of items -
I’d like to do something like the following:
function buildItems() {
  return [<item>Hello</item>,
          <item>World!</item>];
}
var doc = <mydocument>{buildItems()}</mydocument>;
and have doc contain
<mydocument>
  <item>Hello</item>
  <item>World!</item>
</mydocument>
What actually results is the more-or-less useless
<mydocument>Hello,World!</mydocument>
The closest I can get to the result I’m after is
function buildItems(n) {
  n.mydocument += <item>Hello</item>;
  n.mydocument += <item>World!</item>;
}
var doc = <mydocument></mydocument>;
buildItems(doc);
It’s full of redundant redundancy - it’s as verbose as XML, when you
can do so much better.
There’s no toXML() method (or similar) for use in
papering over the yawning chasm between the XML objects and the rest
of the language: you can’t even make a Javascript object able to
seamlessly render itself to XML.
The new types E4X introduces aren’t even proper objects -
they’re a whole new class of primitive datum!
Because they’re not proper objects, you can’t extend the system. You
ought to be able to implement an interface and benefit from the
language’s XPath searching and filtering operations. E4X is so close
to offering a comprehension
facility for Javascript, but it’s been short-sightedly restricted to
a single class of non-extensible primitives.
You can’t even construct XML tags programmatically! If the name of
the tag doesn’t appear literally in your Javascript code, you’re out
of luck (unless you resort to eval…) [[Update: I was wrong about this - you can write <{expr}> and have the result of evaluating expr substituted into the tag.]]
E4X XML objects have no notion of namespace prefixes (which are
required for quality implementations of XPath and anything to do
with XML signatures). Prefixes only turn up in the API as a means of
producing (namespaceURI,localname) pairs. This is actually how it
should be, but because there’s already broken software out there
that depends on prefix support, by not supporting prefixes properly
you preclude ECMAScript+E4X from being used for XML signatures or
ECMAScript-native XPath implementations.
In my opinion, E4X violates several programming
language design principles: most importantly, those of
regularity, simplicity and orthogonality, but
also preservation of information, automation and
structure. SXML, perhaps in
combination with eager
comprehensions, provides a far superior model for producing and
consuming XML. Sadly, there’s no real alternative for ECMAScript yet -
we’re limited either to library extensions, or to using the DOM
without any syntactic or library support at all.
June 24th, 2006
tonyg
As far as I can tell, no-one on the web has yet summarised in a single page all of the design principles MacLennan develops in his excellent Principles of Programming Languages (2nd edition, 1986, ISBN 0-03-005163-0): so here they are. I hope they’re as useful to others as they have been to me.
- Abstraction: Avoid requiring something to be stated more than once; factor out the recurring pattern.
- Automation: Automate mechanical, tedious, or error-prone activities.
- Defense in Depth: Have a series of defences so that if an error isn’t caught by one, it will probably be caught by another.
- Information Hiding: The language should permit modules designed so that (1) the user has all of the information needed to use the module correctly, and nothing more; and (2) the implementor has all of the information needed to implement the module correctly, and nothing more.
- Labeling: Avoid arbitrary sequences more than a few items long. Do not require the user to know the absolute position of an item in a list. Instead, associate a meaningful label with each item and allow the items to occur in any order.
- Localised Cost: Users should only pay for what they use; avoid distributed costs.
- Manifest Interface: All interfaces should be apparent (manifest) in the syntax.
- Orthogonality: Independent functions should be controlled by independent mechanisms.
- Portability: Avoid features or facilities that are dependent on a particular machine or a small class of machines.
- Preservation of Information: The language should allow the representation of information that the user might know and that the compiler might need.
- Regularity: Regular rules, without exceptions, are easier to learn, use, describe, and implement.
- Security: No program that violates the definition of the language, or its own intended structure, should escape detection.
- Simplicity: A language should be as simple as possible. There should be a minimum number of concepts, with simple rules for their combination.
- Structure: The static structure of the program should correspond in a simple way to the dynamic structure of the corresponding computations.
- Syntactic Consistency: Similar things should look similar; different things different.
- Zero-One-Infinity: The only reasonable numbers are zero, one, and infinity.
June 24th, 2006
tonyg
Emacsen.org has a nice roundup of the (apparently only) four javascript-mode implementations for Emacs. I went for number three, Karl Landström’s javascript.el, and it’s been working very well.
June 24th, 2006
tonyg
One of the interesting issues in
implementing dynamic dispatch for
Java is that the basic C3 linearization algorithm isn’t a very good
fit for the complexities of Java’s subtyping. (Note: the following
paragraphs rely on the reader having a basic understanding of the
details of C3 linearization.)
Java lists a class’s implemented interfaces separately from its
superclass. C3 requires, for each class, a list of direct superclasses
as input - so to adapt it for use with Java, we have to choose how to
combine each class’s superclass with its implemented interface list:
for instance, the superclass could be placed at the beginning or at
the end of the interface list.
In Java any interface is assignable to Object. So for C3 to make
sense, all interfaces which have no super-types have Object as a
super-type. If Object is in the super-type list of a type, it must be
the last thing - otherwise linearization will always be impossible. So
the most obvious thing to do is to include the super-class last in the
list of super-types.
This mostly works, except for the way collections from
java.util are implemented. The various abstract
collections implement their corresponding interface, and the concrete
implementations extend the abstract class while also re-declaring the
interface: for example, AbstractSet implements Set,
and HashSet extends AbstractSet and also
implements Set directly. This is a common pattern.
If, then, you always choose to put the super-class at the end of the
list of super-types while performing linearization for C3, Set comes
before AbstractSet in HashSet’s local order, yet AbstractSet must come
before Set in AbstractSet’s own linearization. The result is an
inconsistent linearization for all of Java’s built-in collections.
So I ended up doing the following by default: For any super-class
other than Object, the super-class goes first in the list of
super-types. If the super-class is Object, it gets pushed to the
end. This works in an intuitive way in lots of cases, and the
implementation supports pluggable ordering, should you need to do
something different.
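The default ordering described above can be sketched in Java together with a textbook C3 merge. This is an illustration written for this post, not the library's code: the class name JavaC3 and its method names are assumptions, and the sketch only considers ordinary classes and interfaces (no arrays or primitives, and no pluggable ordering).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch of C3 linearization over Java's class hierarchy.
final class JavaC3 {
    // Superclass first, unless it is Object, in which case Object goes last.
    // Interfaces with no super-interfaces are given Object as their super-type.
    static List<Class<?>> directSupertypes(Class<?> type) {
        List<Class<?>> supers = new ArrayList<>(Arrays.asList(type.getInterfaces()));
        if (type == Object.class) {
            return supers; // Object is the root and has no super-types
        }
        Class<?> superclass = type.getSuperclass(); // null for interfaces
        if (superclass != null && superclass != Object.class) {
            supers.add(0, superclass);
        } else if (superclass == Object.class || supers.isEmpty()) {
            supers.add(Object.class);
        }
        return supers;
    }

    static List<Class<?>> linearize(Class<?> type) {
        List<Class<?>> parents = directSupertypes(type);
        List<List<Class<?>>> sequences = new ArrayList<>();
        for (Class<?> parent : parents) {
            sequences.add(new ArrayList<>(linearize(parent)));
        }
        sequences.add(new ArrayList<>(parents)); // the local precedence order
        List<Class<?>> result = new ArrayList<>();
        result.add(type);
        result.addAll(merge(sequences));
        return result;
    }

    // Standard C3 merge: repeatedly take the first head that is in no tail.
    static List<Class<?>> merge(List<List<Class<?>>> sequences) {
        List<Class<?>> merged = new ArrayList<>();
        while (true) {
            sequences.removeIf(List::isEmpty);
            if (sequences.isEmpty()) {
                return merged;
            }
            Class<?> next = null;
            for (List<Class<?>> seq : sequences) {
                if (!inAnyTail(seq.get(0), sequences)) {
                    next = seq.get(0);
                    break;
                }
            }
            if (next == null) {
                throw new IllegalStateException("inconsistent hierarchy");
            }
            merged.add(next);
            for (List<Class<?>> seq : sequences) {
                if (seq.get(0) == next) {
                    seq.remove(0);
                }
            }
        }
    }

    static boolean inAnyTail(Class<?> candidate, List<List<Class<?>>> sequences) {
        for (List<Class<?>> seq : sequences) {
            if (seq.subList(1, seq.size()).contains(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // HashSet first, Object last, AbstractSet ahead of Set.
        System.out.println(linearize(HashSet.class));
    }
}
```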
June 23rd, 2006
david
Dynamic dispatch is a mechanism for selecting a method based on the
runtime types of the parameters supplied. Java dispatches instance
methods dynamically, using the runtime type of the receiver to choose
the code to invoke and ignoring the types of the other parameters
(just like Python and many other object-oriented languages). This is
called single dispatch. Unfortunately, Java’s dispatching is limited
in two important ways: it doesn’t allow class extensions, and
it doesn’t support multiple dispatch, as implemented in many
other object-oriented languages.
I have written some code which uses reflection and proxy generation to
conveniently implement dynamic multiple dispatch for Java. It uses C3
linearization to determine the method to invoke. This algorithm
was originally devised for Dylan. You can get the distribution here.
This implementation supports subtyping of both arrays and primitive
Java types such as int, byte,
char, but does not yet support Java 1.5’s generics. The
subtyping relation for arrays and primitive types is based on Java’s
notion of assignability - see the documentation for details.
Here’s a trivial example of using the dynamic dispatch library:
import net.lshift.java.dispatch.DynamicDispatch;

// define an interface
public interface NumberPredicate {
    public boolean evaluate(Number n);
}

// implement it for some argument types
public class Exact {
    public boolean evaluate(Float f) { return false; }
    public boolean evaluate(Double f) { return false; }
    public boolean evaluate(Number n) { return true; }
}

// create a dynamic dispatcher
NumberPredicate exact = (NumberPredicate)
    DynamicDispatch.proxy(NumberPredicate.class, new Exact());
Now, code making use of exact can call the
evaluate method with a Number instance, and
the DynamicDispatch proxy that is backing the
NumberPredicate interface will find the most appropriate
method on Exact to invoke based on the runtime type of
the argument to evaluate. So, for instance:
exact.evaluate(new Float(12.3)); // returns false.
exact.evaluate(new Integer(34)); // returns true.
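For readers curious how a proxy can pick a method by runtime type, here is a deliberately naive, self-contained sketch built on java.lang.reflect.Proxy. It is not the library's implementation: it chooses the most specific overload by plain assignability rather than C3 linearization, resolves ambiguous cases arbitrarily, and ignores primitive parameter types. NaiveDispatch, accepts and moreSpecific are names invented for the example, and NumberPredicate and Exact are redeclared as nested types only to keep it self-contained.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sketch: overload selection by the runtime types of the arguments.
final class NaiveDispatch {
    interface NumberPredicate {
        boolean evaluate(Number n);
    }

    static final class Exact {
        public boolean evaluate(Float f) { return false; }
        public boolean evaluate(Double d) { return false; }
        public boolean evaluate(Number n) { return true; }
    }

    static <T> T proxy(Class<T> iface, Object target) {
        InvocationHandler handler = (self, method, rawArgs) -> {
            Object[] args = (rawArgs == null) ? new Object[0] : rawArgs;
            Method best = null;
            // Scan the target's public methods for the most specific applicable one.
            for (Method m : target.getClass().getMethods()) {
                if (!m.getName().equals(method.getName()) || !accepts(m, args)) {
                    continue;
                }
                if (best == null || moreSpecific(m, best)) {
                    best = m;
                }
            }
            if (best == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            try {
                return best.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }

    // True if every argument is an instance of the matching parameter type.
    static boolean accepts(Method m, Object[] args) {
        Class<?>[] params = m.getParameterTypes();
        if (params.length != args.length) {
            return false;
        }
        for (int i = 0; i < params.length; i++) {
            if (args[i] != null && !params[i].isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    // True if each parameter type of a is assignable to the matching one of b.
    static boolean moreSpecific(Method a, Method b) {
        Class<?>[] pa = a.getParameterTypes();
        Class<?>[] pb = b.getParameterTypes();
        for (int i = 0; i < pa.length; i++) {
            if (!pb[i].isAssignableFrom(pa[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        NumberPredicate exact = proxy(NumberPredicate.class, new Exact());
        System.out.println(exact.evaluate(Float.valueOf(12.3f))); // false
        System.out.println(exact.evaluate(Integer.valueOf(34)));  // true
    }
}
```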
There are several reasons having dynamic dispatch (and multiple
dispatch) is useful when programming for Java:
If you want to extend an existing set of classes (for instance, to
add an aspect - see below), normally you’d create an interface
encapsulating your new feature, and make all the classes in the set
implement it. If you are not able to modify the classes, one
approach is to create wrapper classes for each, and somehow choose a
wrapper class at runtime based on the type of the object you’re
working with. You might implement this by
myObject.getWrapperClass() - but wait! This makes
getWrapperClass a new method that needs to be added:
precisely the problem you set out originally to solve.
Dynamic (single) dispatch helps you out here by conveniently
automating the required wrapping and method selection.
You might also want to use multiple dispatch, dispatching on the
types of multiple arguments. Neither the Java language nor the Java
virtual machine supports multiple dispatch. In some cases
overloading suffices, but in many it does not.
Dynamic multiple dispatch reinterprets Java’s notion of overloading as
actual multimethod dispatch. Syntactically, you’re writing
overloaded methods - but using DynamicDispatch, the semantics are
those of full multiple-dispatched generic functions.
Dynamic dispatch is useful for adding aspects. For example, you might
use dynamic dispatch to write a Java object pretty printer, or custom
serializer. I’ve employed it for writing a general equality function
which is independent of the object’s own implementation of equals,
which I find useful in unit tests. I’ll write about that in a later
post.
If this kind of thing interests you, MultiJava, a compiler for a multiply-dispatched variant of Java, might be worth a look.
June 23rd, 2006
david