technology from back to front

Archive for August, 2006

Small and healthily skeptical is good

Tim Malbon says he’s pleased to be involved in this Web thing, but appears to conflate lack of full assimilation with brain-washing.

One can be a tiny bundle of techno-lust and not believe that successive versions of the Web will lead directly to a shiny new future.

Posting photos to Flickr (Web2.0) seems like something of a tenuous, circuitious route to finding housing for homeless children (shiny future). On the other hand, if you followed that link you’re contributing to something like a counter-argument.

It’s not that I begrudge boosterism, because some interesting things come about that way. I just find the juxtaposition of They see [the Web] as ‘work’ rather than the defining social revolution du jour with They have the air of people who belong to a mind control cult as, well, exquisite.

Having said all that: I quite agree with Tim’s sentiments that Small is Good, creating neat technology isn’t work, and I’d rather be passionate about what I do than uninterested (happily, I am the former).


Diff for Javascript

Toward the end of last week, I found myself wondering about implementing a diff algorithm in Javascript. It turns out there’s already at least one available attempt at this, by John Resig (direct link: jsdiff.js). Unfortunately, the implementation is a little buggy (try diffString("the quick", "the quick fox"), for example), so I went hunting for source material to implement my own.

I ended up finding J. W. Hunt and M. D. McIlroy, “An algorithm for differential file comparison”, Bell Telephone Laboratories CSTR #41 (1976) (postscript), which turned out to be quite implementable.

Here’s my implementation: hunt-mcilroy.js. With luck, I’ve even implemented something close to the described algorithm.

As an example, the output from

diff(“the quick brown fox jumped over”.split(/\s+/),
“the quick fox jumps over”.split(/\s+/))



The next step is to implement a diff3 equivalent, and then I’ve the tools to implement rudimentary version-control. It’d sure be a fine thing to see TiddlyWiki evolve darcs-like distributed change-propagation…


Subclassing in JavaScript, part 2

In my last entry, I described a problem, and promised a solution. There are many solutions online and in various JavaScript toolkits; as far as I can tell this one is the most elegant of the bunch, and I’d be pleased to see it replace the various other solutions out there.

The main goals of this approach are:

* The problem described in the previous post are avoided. What is per-instance stays per-instance, and what is shared stays shared; subclassing doesn’t change one into the other.

* The code to create a subclass looks reasonably elegant and straightforward and not too wordy.

* The result is reasonably efficient in space and time

The approach here also meets one nice but less important goal:

* The approach can be used to subclass from a class not specifically built to work with this approach – in other words, when not subclassing, you make classes in the normal way.

I shan’t include a comparison with any other specific solutions here, but if there are any you’re interested in, ask me.

Code first, explanations second.

// The code to support this approach to subclassing function extend(superclass, constructor_extend, prototype) { var res = function () { superclass.apply(this); constructor_extend.apply(this, arguments); }; var withoutcon = function () {}; withoutcon.prototype = superclass.prototype; res.prototype = new withoutcon(); for (var k in prototype) { res.prototype[k] = prototype[k]; } return res; } // An example of the approach in use // Make a class in the normal way function Parent () { this.array = []; } Parent.prototype = { someMethod: function () { this.array.push(this.array[0]); } } var Child = extend(Parent, function () { this.somethingelse = "hello Mum!"; }, { anotherMethod: function () { this.array.push(this.somethingelse); } }); // And demonstrate that it works properly aa = new Child(); bb = new Child(); aa.array.push("aa"); bb.array.push("bb"); aa.someMethod(); bb.anotherMethod(); print(aa.array); // prints "aa,aa" print(bb.array); // prints "bb,hello Mum!"

So what’s going on? This is almost the same as the traditional “Child.prototype = new Parent()” approach, but with a couple of wrinkles.

We create a constructor function that explicitly calls the no-argument constructor of the parent on the object, then calls the “constructor_extend” function passed in, passing it the constructor arguments. We then set a prototype object on this constructor function.

The class used to create the prototype object is a variant of the Parent class, which is the same except that the constructor is replaced with a no-op. It will have the same \_\_proto\_\_ object as any Parent object, but it won’t have any instance fields set. This means that nothing that was per-object in the Parent becomes shared in the Child.

Once we’ve created this blank prototype object, we assign it as the prototype to our constructor function. We then copy in the methods passed in the “prototype” argument, in order to add methods and other shared things to Child objects that aren’t present in Parent objects. The constructor function is then returned, ready to serve as the new class.

One slight inflexibility with this approach is that the superclass constructor is always called with no arguments. This is often fine, but sometimes you need to be able to pass an argument in. To address this, we break our “extend” function into two parts:

function general_extend(superclass, constructor, prototype) { var withoutcon = function () {}; withoutcon.prototype = superclass.prototype; constructor.prototype = new withoutcon(); for (var k in prototype) { constructor.prototype[k] = prototype[k]; } return constructor; } function extend(superclass, constructor_extend, prototype) { return general_extend(superclass, function () { superclass.apply(this); constructor_extend.apply(this, arguments); }, prototype); }
Now if we want to call the constructor with arguments, we use “general_extend” instead of “extend”:
var Child = general_extend(Parent, function () { Parent.apply(this, ["an argument"]); this.somethingelse = "hello Mum!"; }, { anotherMethod: function () { this.array.push(this.somethingelse); } });
I can’t find a convenient way to implement “super” – if you want to refer to the superclass, for example to defer to the superclass version of an overriden method, you’ll have to do so explicitly. The nearest I can get is this:
var Child = (function (uber) { return general_extend(uber, function() { uber.apply(this, ["an argument"]); this.somethingelse = "hello Mum!"; }, { printState: function() { uber.prototype.printState.apply(this); print("somethingelse:" + this.somethingelse); } }); })(Parent);
but that’s far from pretty. Until we get macros, though, I think this is the cleanest and most flexible way of doing subclassing in JavaScript I’ve seen so far.

Update: I’ve just realised that I have mistaken how the inheritance system presented in KevLinDev works, and in fact the key idea is the same as mine – that of creating a copy of the superclass that has a no-op constructor. Nonetheless I slightly prefer my expression of it, in particular the ease of adding methods and the way the separation between general_extend and extend gives you control over the superclass constructor if you need it. I’m not sure there’s much benefit to the definition of baseConstructor and superClass.

Paul Crowley

Sebastian Egner joins LShift

We are very pleased to announce that Sebastian Egner has just joined our team.

Sebastian, with his background in the development of consumer electronics, brings new skills and plenty of new ideas.


Fair messaging in Erlang

One of the new projects that I’m working on involves a messaging infrastructure in Erlang. Without boring you with the details, the basic idea is that there are two types of messages, A and B and these are both sent to a thread (or a *process* in Erlang). One A must be paired with one B before the A and B can be discarded. Performance is an issue so this pairing must be fast. Several approaches were developed in trying to make this go very fast:

1. The first approach is also the dumbest. All A and B messages are sent to the same process. The process deals with these messages in a fifo order and thus must internally maintain queues of As and Bs, matching them as becomes possible. The problem with this is that flooding the process with As or Bs cripples performance because of processing the queue in fifo order. So a DoS attack is very possible and would often accidentally occur (most traffic is bursty).
1. The next idea is to not process the messages in a fifo order. This is possible because of Erlang’s *receive* statement which can do pattern matching on the messages in the queue. The problem is that the process must scan every message in the queue in a fifo order each time through the receive block until it finds one that matches. So whilst the DoS attack can’t happen in the same way, it can still cripple performance as the message queue must be scanned until you reach the end and find the first message of the *other* type so that the pairing can occur. So this cripples performance again.
1. Use three processes. Send messages of one type to one process and of the other type to another process. These processes are then *buffers*. They process messages in a fifo order (which is fast) and put all the messages they receive into an internal queue. The third process (the *pairer*) then sends *fetch* messages to these buffers which then send a predetermined number of messages in their internal queue to the pairer. The pairer then scans its messages in a non-fifo way but because the message queue is effectively bounded, there is no significant performance drop. The points are these:
* There is potentially a DoS attack possible which prevents the buffers from receiving the *fetch* message. However, firstly by processing the message queue in a fifo order it *will* be received, just possibly not immediately; and secondly, the buffers simply remove A or B messages from their message queues and place them in an internal queue, thus there is very little processing done. So getting through several thousand A or B messages does not take and real amount of time.
* The pairer only asks for more messages (via *fetch* messages sent to the buffers) iff it has exhausted its quota of messages: i.e. it asks for *n* messages from each buffer and will only ask for more if it has received *n* messages from each buffer. This means that its queue will never be more than 2 *n* messages long.
* Careful testing reveals that *n* = 7 is the best performance on the hardware I have available. This balances the cost of the pairer processing messages out of order with the extra round trip to the buffers. Lower *n* means that there are too many round trips sending *fetch* messages to the buffers but the non-fifo processing is really cheap. Higher values of *n* mean the pairer’s message queue gets too big so the non-fifo processing costs too much.
* By using three processes, it can make good use of thread level parallelism hardware. The two buffers can get through their message queues in parallel as can the pairer. There’s also therefore the possibility of distribution across multiple machines.
* It’s very possible to alter the pairer so that it sends the *fetch* messages after it has achived *m* pairings where *m* < *n*. This means the pairer's message queue would be a maximum of 2 *n* + 2(*n* - *m*) messages long but would reduce the delay between the pairer sending the *fetch* messages and the buffers being able to receive the *fetch* message and send messages on to the pairer - effectively the *fetch* message ends up higher up in the buffers' message queues.

It’s quite ironic that a language that seems, at least on the face of it, ideal for messaging-type applications turns out to have implicit semantics that, if not hindering, at least make you think much harder when trying to implement certain requirements…




You are currently browsing the LShift Ltd. blog archives for August, 2006.



2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us