technology from back to front

Archive for September, 2006

LShift and NMA Awards

New Media Age have just released their guide to the Top 100 Interactive Agencies for 2006. LShift ranks as the UK’s 8th best technical agency and the 4th most efficient.

To celebrate, LShift’s MD, Andy Wilson, attended the NMA Top 100 event at the Marx Lounge on Wednesday September 27.

Andy comments “It’s really good to get recognition. Concentrating on back-end projects with high-tech requirements means that we often do not get direct exposure to clients. This kind of recognition is important to us as we plan to expand our Advisory business, working directly with clients to advise them on issues ranging from the recruitment of software architects to venture capital due diligence”.


Hikij: Haskell and Javascript based Wiki

Taking Matthias’ introduction to AJAX and Haskell as inspiration, I decided to try to write a complete wiki using Haskell on the server side. This also grew from discussions we’ve had in recent weeks about distributed wikis and similar ideas, though Hikij has as yet none of those features. Because of my use of Javascript 1.6 features (and probably some other non-standard library functions), it’s Mozilla 1.8 or Firefox 1.5 or better only. But if you have such a browser, you can see the Hikij wiki.


Source and details available there.


DLAs, mashups and canapes

TV Chefs and blogs have had it — networking evenings and portals are back

Last Thursday I went to an event titled Digital Lifestyle Aggregators at the ever so posh BT Centre and learnt three things about this latest web 2.0 trend:

  • Digital Content Aggregation is when you present a load of different content sources in a single interface (yes, what used to be called a Portal, but is now, like most consumer marketing things, a lifestyle)
  • A Mashup is when you put this stuff together (integrating RSS feeds and using things like the Google Earth API or other web2-style APIs)
  • If you want to get funded, do a DCA Mashup – although it’s probably already too late, as pretty much everyone at the event seemed to have found VCs recently. Even BT themselves sounded committed to it. Their director of strategy suggested they would be embracing the new internet economy by promoting Skype, amongst other things, within their own portal.

Having integrated feeds and services from other sites for years, I find it strange to see businesses so conceptually and technically simple attracting significant funding. It has to be said that there are some great interface designs on some of these companies’ sites; Netvibes, for example, is probably the best of the crop. We have yet to see whether they have the marketing muscle to see it through.

Oh yes, I almost forgot, BT do great canapés.


EC2 Latency

We want to use Amazon’s Elastic Compute
Cloud (EC2) to run scalability tests, rather
than buying/leasing lots of kit that we only need for a few hours
every now and then. There are a few challenges though. The biggest is
probably translating the results from the EC2 tests
to the envisaged deployment scenarios. There are several factors to
consider here: hardware, operating system, network, and the
impact of virtualisation.

Amazon provide details of the *virtual* hardware:

> Each instance predictably provides the equivalent of a system with a
> 1.7Ghz Xeon CPU, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of
> network bandwidth.

We also know that the default operating system is Fedora Linux.

One figure that is notable for its absence is network latency, in
particular between several EC2 instances. Latency has a major impact
on scalability, so it is crucial that we know what it is.

A ping between two of my EC2 instances takes approximately 250
microseconds. That is about three times higher than I get between
machines on our local network, but it is roughly the same as when the
machines are separated by a firewall.

I have tried the ping test a few times, between different instances
created at different times. The latency was always the same. However,
given the EC2 setup, I suspect that latency may vary quite
considerably, depending on where the instances happen to be running -
the same location, the same rack or even the same machine. How much
variation is there in instance placement?

For example, say I deploy some app on four instances one day, then
tear it down and deploy it again some other day. If the performance of
my app is inter-instance latency-bound, could I end up with
performance differences of several orders of magnitude, i.e. if I was
lucky to get all four instances deployed on a single physical machine
in one case, and unlucky to get four machines on different continents
in the other?

Hopefully Amazon will release some more information on their EC2 setup that may answer some of the above questions. Users of the service can also help by running some tests and publishing the results.
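
One way to gather such measurements, beyond plain ping, is to time small TCP round trips between two instances. The following is only a rough sketch, not something taken from our tests: the class name `RttProbe`, its methods and the one-byte echo protocol are all assumptions made for illustration. It expects an echo server on the remote side (the same class can start one) and reports the minimum, median and maximum round-trip time in microseconds.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;

/** Hypothetical probe: times one-byte TCP round trips against an echo server. */
final class RttProbe {

    /** Starts a single-connection echo server on a free port and returns that port. */
    static int startEchoServer() {
        try {
            ServerSocket server = new ServerSocket(0);
            int port = server.getLocalPort();
            Thread echo = new Thread(() -> {
                try (ServerSocket s = server; Socket client = s.accept()) {
                    client.setTcpNoDelay(true);
                    InputStream in = client.getInputStream();
                    OutputStream out = client.getOutputStream();
                    int b;
                    while ((b = in.read()) != -1) {
                        out.write(b);
                        out.flush();
                    }
                } catch (IOException e) {
                    // The client went away; nothing left to echo.
                }
            });
            echo.setDaemon(true);
            echo.start();
            return port;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the round-trip times in microseconds, sorted ascending. */
    static long[] measure(String host, int port, int samples) {
        long[] rtts = new long[samples];
        try (Socket socket = new Socket(host, port)) {
            socket.setTcpNoDelay(true); // send each byte immediately
            InputStream in = socket.getInputStream();
            OutputStream out = socket.getOutputStream();
            for (int i = 0; i < samples; i++) {
                long start = System.nanoTime();
                out.write(1);
                out.flush();
                if (in.read() == -1) {
                    throw new IOException("echo server closed the connection");
                }
                rtts[i] = (System.nanoTime() - start) / 1000;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Arrays.sort(rtts);
        return rtts;
    }

    public static void main(String[] args) {
        // With fewer than two arguments, probe a local echo server as a smoke test.
        boolean remote = args.length >= 2;
        String host = remote ? args[0] : "127.0.0.1";
        int port = remote ? Integer.parseInt(args[1]) : startEchoServer();
        long[] rtts = measure(host, port, 1000);
        System.out.println("min=" + rtts[0] + "us median=" + rtts[rtts.length / 2]
                + "us max=" + rtts[rtts.length - 1] + "us");
    }
}
```

Running the same probe against instances created on different days, and comparing the medians, would give a first indication of how much instance placement matters.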


Busy busy

There’s been nothin’ doin’ blog-wise at LShift for the past week or so; we all seem to be wrapped up in projects.

Briefly though, collectively we’re

- Tickling Erlang
- Thinking about what direction to lead Icing in
- Finding all the best documentation the Web has to offer on ASP.NET 2.0
- Getting intimate with Director
- Staging shoot-outs between Ruby-on-Rails, Django, and cohorts

… so we’re not bored.


Managing CSS: part 1, Factoring

In these days of semantic markup, liquid three column layouts and image replacement it’s quite evident that using CSS is just not as simple as it promises to be. There’s not only the flow and box models to internalise, but the numerous quirks in how browsers implement them, and the constraints imposed by accessibility guidelines, and—well, special cases galore. As usual the solution is care, attention, and better tools.

Grouping rules

To start, there is the matter of how to arrange CSS code: One reason why CSS can be trouble to maintain is that there’s no obvious best factoring of rules. Do you clump like properties together under a single selector

h1, h2, h3, div.title  {color: #f3c;}

on the basis that there’s then one place to change that property of these related elements; or, group like selectors

h1 { color: #f3c; font-size: xx-large; font-weight: bold; }

so that the entire style of a particular set of related elements can be changed at once?

There is at least one program that will automagically group rules for you, but it does it dumbly, collecting rules by common properties, and as it’s been pointed out, that’s not always what you want.

The real answer is that it depends on what each property is intended for. If a property is part of the overall theme, it should be in one place. Colours often fall into this category, as do fonts. If the property is a consequence of some extra semantics you’re asserting with your selector, it belongs with all the other such properties under one rule. For example, if I create a distinction between hyperlinks to elsewhere in my site and hyperlinks leading away from my site by using the class external, I would put all the properties related to external links in one rule.

These rules may have related selectors:

/* Thematic property: this colour is part of my colour scheme */
h1, h2, h3, div.title {color: #0f0;}

/* Semantic properties: I'm asserting a set of things that are titles */
*.title {
  margin-left: 32px;
  background-image: url('images/title-marker.gif');
  background-repeat: no-repeat;
}

(Note to CSS standards authors: some extra sugar for selectors would be helpful, like distributing combinators over parentheses; for example thead (th, td) as shorthand for thead th, thead td)

Cascade Simply, Stupid

The typical practice is to put hooks in the HTML upon which to hang styles. An id attribute can be used for unique (per page) elements, and the class attribute can be used to create arbitrary sets of elements. This coupling between the HTML and the CSS introduces a maintenance burden, since when the HTML changes the CSS must change, and vice versa.

There are many ways to select a particular element, and there are many ways of getting a style to apply to an element. A question Mark Bernstein poses is how do you make sure your style applies to precisely the right set of elements, now and forever?

We want to minimise three things: the knock-on effect of changing the CSS or the HTML, the possibility of unintended matches, and dependencies between rules. We can go a long way with all three by keeping selectors simple and predictable.

Use either very general selectors with thematic properties (“set all body text to green”), or very specific selectors (“the element with this ID has a blue background”). This makes it easier to predict which elements will be affected by a rule, and reduces the chance that undesired properties will be inherited.

If you use relative units on elements that can be nested, you can end up with effects like shrinking text. This applies not only to single rules, but to combinations of rules:

{ font-size: 0.8em; }
/* OK, since it won't be nested */

fieldset#login { font-size: 0.9em; }
/* but what if we put another relatively sized element in that fieldset? */

To avoid this situation, keep relative units away from generic containers. Let paragraphs default to whatever the body text size is, and adjust only for non-nesting elements.
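
The shrinking effect is just multiplication: each nested element with a relative font size scales whatever size it inherits. As a rough illustration (the `NestedEm` class and the 16px base size are assumptions for the example, not something any browser exposes), this sketch computes the effective size for a chain of nested `em` factors such as the 0.9em and 0.8em used above.

```java
/** Hypothetical helper: effective font size for nested relative (em) sizes. */
final class NestedEm {

    /** Multiplies the base size by each nested em factor, outermost first. */
    static double effectiveSize(double basePx, double... emFactors) {
        double size = basePx;
        for (double factor : emFactors) {
            size *= factor;
        }
        return size;
    }

    public static void main(String[] args) {
        double base = 16.0; // assumed body text size in pixels

        // fieldset#login on its own: 0.9 of the base size.
        System.out.println(effectiveSize(base, 0.9));

        // A 0.8em element nested inside that fieldset: 0.9 * 0.8 of the base size.
        System.out.println(effectiveSize(base, 0.9, 0.8));

        // Three levels of a generic container at 0.8em each shrink quickly.
        System.out.println(effectiveSize(base, 0.8, 0.8, 0.8));
    }
}
```

Two levels at 0.9em and 0.8em already bring the text down to 72% of the body size, which is why generic containers are best left without relative sizes.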

Use selector connectors (descendant, child, sibling) very sparingly: they appear specific without being very predictable. Qualify sub-selectors with a class, or better an ID, to help avoid matching undesired elements.


Refactoring is the process of removing duplication and other bad smells from code. We can do the same for CSS.

What are the bad smells?

Sets of repeated properties in several rules often signal that there’s a common class to be extracted. You have to be careful though: it may be coincidence. A rule of thumb is that if you can consider the set of selected elements semantically alike, then you can replace the rules with a class. The semantic similarity means there’s a reasonable chance that the class, and the elements to which it is attributed, aren’t going to change.
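
Spotting repeated properties can be mechanised, even if deciding whether they are a coincidence cannot. The sketch below is a hypothetical helper, not an existing tool: it assumes the stylesheet has already been parsed into a map from selector to declarations, with each declaration listed once per rule, and it reports every declaration that appears under two or more selectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical smell detector: declarations repeated across several rules. */
final class RepeatedDeclarations {

    /** Maps each declaration used by two or more selectors to those selectors. */
    static Map<String, List<String>> find(Map<String, List<String>> rules) {
        Map<String, List<String>> byDeclaration = new TreeMap<>();
        for (Map.Entry<String, List<String>> rule : rules.entrySet()) {
            for (String declaration : rule.getValue()) {
                byDeclaration
                        .computeIfAbsent(declaration, d -> new ArrayList<>())
                        .add(rule.getKey());
            }
        }
        // Keep only the declarations that more than one selector shares.
        byDeclaration.values().removeIf(selectors -> selectors.size() < 2);
        return byDeclaration;
    }

    public static void main(String[] args) {
        // Made-up rules, purely for illustration.
        Map<String, List<String>> rules = new LinkedHashMap<>();
        rules.put("div.error", Arrays.asList("border: 1px solid", "padding: 4px", "color: red"));
        rules.put("p.notice", Arrays.asList("border: 1px solid", "padding: 4px"));
        rules.put("h1", Arrays.asList("font-weight: bold"));

        System.out.println(find(rules));
    }
}
```

The output is only a list of candidates; whether the selected elements are semantically alike, and so deserve a shared class, is still a judgement call.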

Class names or IDs that mention styles, for example .redBox, suggest that thematic properties are being mixed with semantic properties (or possibly that you were running short on imagination that day). Move the thematic properties into their own rule and think of a new name that reflects why you’re saying the elements with this class are different to other elements; e.g., .warning.

Long chains of combined selectors, for example body#about-page div.sidebar p a {...}, may mean you’re trying too hard to narrow a selection; use a class or ID further down the chain to simplify the rule and make the selection clearer: p.about-page-links a {...}.

The hypothetical refactoring browser

What’s really needed is a refactoring browser. Not only would this browser let you view and edit CSS and HTML (and HTML-like code) side-by-side, it would understand how CSS applies to HTML, so you could

- see which rules apply to an element
- see which elements a selector matches
- see how the effective properties of an element are derived (inherited properties, relative units)
- see dependencies between rules (which rules will cascade)

It would also offer refactoring transformations:

- move common properties into another rule
- merge similar rules (and put the remainder into another rule)
- move a class into or out of a nested element
- collapse (redundantly) cascading rules

By definition, refactoring transformations should not alter the observable output of the code; however, you will also want to know when something you’ve done has failed to preserve that, so the refactoring browser should tell you.

Xylescope comes very close: it does an excellent job of showing you both which rules apply to an element and to which elements a rule applies, and has some handy visualisations of CSS properties (especially the box model). Sadly it doesn’t have a stellar editing facility, or any refactoring, but we can hope that culturedcode are planning to extend it.

Care and attention

While tools go an awfully long way, the most important thing is to give CSS your care and attention. Keep it from getting out of hand by refactoring constantly; when you update the HTML, update the CSS too and remove rules that are no longer needed; and confirm that each new rule does what you think it does. This last activity is the essence of testing, which I’ll talk about next time.


Erlang processes vs. Java threads

Earlier today I ran a simple test of Erlang’s process creation and teardown code, resulting in a rough figure of 350,000 process creations and teardowns per second. Attempting a similar workload in Java gives a figure of around 11,000 thread creations and teardowns per second – to my mind, a clear demonstration of one of the main advantages of Erlang’s extremely lightweight processes.

Here’s the Java code I used – see the earlier post for the Erlang code, to compare:

// Java 5 - uses a BlockingQueue.
import java.util.concurrent.*;

public class SpawnTest {
    public static void main(String[] args) {
        // M chains run in parallel; N thread creations in total.
        int M = Integer.parseInt(args.length > 0 ? args[0] : "1");
        int N = Integer.parseInt(args.length > 1 ? args[1] : "1000000");
        int NpM = N / M;
        BlockingQueue<Body> queue = new LinkedBlockingQueue<Body>();
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < M; i++) { new Body(queue, NpM).start(); }
        // Wait for the last thread of each chain to report in.
        for (int i = 0; i < M; i++) { try { queue.take(); } catch (InterruptedException ie) {} }
        long stopTime = System.currentTimeMillis();
        System.out.println((NpM * M) / ((stopTime - startTime) / 1000.0));
    }

    public static class Body extends Thread {
        BlockingQueue<Body> queue;
        int count;
        public Body(BlockingQueue<Body> queue, int count) {
            this.queue = queue;
            this.count = count;
        }
        public void run() {
            if (count == 0) {
                try { queue.put(this); } catch (InterruptedException ie) {}
            } else {
                // Each thread spawns its successor and then exits.
                new Body(queue, count - 1).start();
            }
        }
    }
}


How fast can Erlang send messages?

My previous post examined Erlang’s speed of process setup and teardown. Here I’m looking at how quickly messages can be sent and received within a single Erlang node. Roughly speaking, I’m seeing 3.4 million deliveries per second one-way, and 1.4 million roundtrips per second (2.8 million deliveries per second) in a ping-pong setup in the same environment as previously – a 2.8GHz Pentium 4 with 1MB cache.

Here’s the code I’m using – time_diff and dotimes aren’t shown, because they’re the same as the code in the previous post:

-module(ipctest).
-export([oneway/0, consumer/0, pingpong/0]).

oneway() ->
    N = 10000000,
    Pid = spawn(ipctest, consumer, []),
    Start = erlang:now(),
    dotimes(N - 1, fun () -> Pid ! message end),
    Pid ! {done, self()},
    receive ok -> ok end,
    Stop = erlang:now(),
    N / time_diff(Start, Stop).

pingpong() ->
    N = 10000000,
    Pid = spawn(ipctest, consumer, []),
    Start = erlang:now(),
    Message = {ping, self()},
    dotimes(N, fun () ->
        Pid ! Message,
        receive pong -> ok end
    end),
    Stop = erlang:now(),
    N / time_diff(Start, Stop).

consumer() ->
    receive
        message -> consumer();
        {done, Pid} -> Pid ! ok;
        {ping, Pid} ->
            Pid ! pong,
            consumer()
    end.

%% time_diff/2 and dotimes/2 omitted; see previous post
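
For comparison with the Java thread test, the ping-pong setup could be approximated in Java with two threads and a pair of blocking queues. This is only a sketch under assumptions, not a benchmark that was run for this post: the class `PingPong`, the choice of `ArrayBlockingQueue` and the string messages are all made up for illustration, and no figures are claimed for it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical Java counterpart to pingpong/0: two threads exchange messages. */
final class PingPong {

    /** Runs the given number of ping-pong exchanges; returns round trips per second. */
    static double roundTripsPerSecond(int rounds) {
        BlockingQueue<String> pings = new ArrayBlockingQueue<>(1);
        BlockingQueue<String> pongs = new ArrayBlockingQueue<>(1);

        // Plays the role of consumer/0: answer every ping with a pong.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    pings.take();
                    pongs.put("pong");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                pings.put("ping");
                pongs.take(); // wait for the reply before sending the next ping
            }
            long elapsedNanos = System.nanoTime() - start;
            consumer.join();
            return rounds / (elapsedNanos / 1e9);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during ping-pong", e);
        }
    }

    public static void main(String[] args) {
        int rounds = args.length > 0 ? Integer.parseInt(args[0]) : 1000000;
        System.out.println(roundTripsPerSecond(rounds));
    }
}
```

Unlike an Erlang mailbox, each queue here holds at most one message, which is enough for strict ping-pong but would not do for the one-way test.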


How fast can Erlang create processes?

Very fast indeed.

1> spawntest:serial_spawn(1).

That’s telling me that Erlang can create and tear down processes at a
rate of roughly 350,000 Hz. The numbers change slightly – things slow
down – if I’m running the test in parallel:

2> spawntest:serial_spawn(10).
3> spawntest:serial_spawn(10).

4> spawntest:serial_spawn(100).
5> spawntest:serial_spawn(100).

[Update: I forgot to mention earlier that the system seems to spend 50% CPU in user and 50% in system time. Very odd! I wonder what the Erlang runtime is doing to spend so much system time?]

Here’s the code for what I’m doing:


-module(spawntest).
-export([serial_spawn/1]).

serial_spawn(M) ->
    N = 1000000,
    NpM = N div M,
    Start = erlang:now(),
    dotimes(M, fun () -> serial_spawn(self(), NpM) end),
    dotimes(M, fun () -> receive X -> X end end),
    Stop = erlang:now(),
    (NpM * M) / time_diff(Start, Stop).

%% Each process spawns its successor; the last one reports back.
serial_spawn(Who, 0) -> Who ! done;
serial_spawn(Who, Count) ->
    spawn(fun () ->
        serial_spawn(Who, Count - 1)
    end).

dotimes(0, _) -> done;
dotimes(N, F) ->
    F(),
    dotimes(N - 1, F).

%% erlang:now() returns {MegaSecs, Secs, MicroSecs}; the result is in seconds.
time_diff({A1,A2,A3}, {B1,B2,B3}) ->
    (B1 - A1) * 1000000 + (B2 - A2) + (B3 - A3) / 1000000.0 .

This is all on an Intel Pentium 4 running at 2.8GHz, with 1MB cache, on Debian Linux, with erlang_11.b.0-3_all.deb.


London 2.0

I went to London 2.0 last night, a short bus ride and walk away in Fleet Street. It’s a gathering of the kind becoming more popular (there were XP enthusiasts on the other side of the pub), involving techy chat and a few pints.

The main event was Jason Huggins demonstrating some work-in-progress involving Selenium. He’s intending to show it at the Google Automated Testing conference later this week. If he achieves his ambition, it’ll be impressive as well as very useful.

The nominal topic of the evening was Web 2.0, but the general chat was pretty eclectic:

- Functional programming (lots of “I’ve been wanting to try ..”), and especially functional programming for the Web (I was able to mention our efforts with Icing and AJAXy Haskell)
- Languages on the CLR: SML.NET, IronPython, F#
- Experiences with virtualisation software
- Books: “Freedom Evolves”, by Daniel C. Dennett, and those perennial favourites, “Gödel, Escher, Bach”, and “The Emperor’s New Mind”

If you’re inclined to geekery and optionally beer, it’s a great way to spend a weekday evening.



