technology from back to front

Archive for April, 2008

STOMP adapter updated for RabbitMQ 1.3.0

I’ve updated our STOMP adapter for RabbitMQ to fix a bug reported by Carl Bourne. In the process, I updated the code to work with the latest snapshots of RabbitMQ, including the currently-released version, v1.3.0.

You can get the code by checking it out from our repository with

hg clone http://hg.rabbitmq.com/rabbitmq-stomp/
hg update rabbitmq_v1_3_0_branch

UPDATE: use the default branch these days, unless you’re still running 1.3.0!

or you can instead download a snapshot of the adapter’s current state[1], at revision 90dd1726fe0b.

(Update: I forgot to mention that the Mercurial repository has two branches in it: default, which tracks our internal RabbitMQ server repository, and rabbitmq_v1_3_0_branch, which should stay compatible with the 1.3.0 server release. Thanks to Aman Gupta, who pointed out the problem in a comment below!)

Here’s a summary of how to build and run a STOMP-enabled RabbitMQ broker – for more details, see the original post on the topic:

1. First, retrieve the RabbitMQ server 1.3.0 source code, and unpack it:
curl http://www.rabbitmq.com/releases/source/rabbitmq-1.3.0.tar.gz | tar -zxvf -

2. Next, grab the latest STOMP adapter (here we download a copy of the rabbitmq_v1_3_0_branch rather than the main trunk):
curl http://hg.rabbitmq.com/rabbitmq-stomp/archive/rabbitmq_v1_3_0_branch.tar.gz | tar -zxvf -

3. Compile the server itself:
make -C rabbitmq-1.3.0/erlang/rabbit

4. Finally, compile the adapter, and start the server with extra options that cause the adapter to start too:
make -C rabbitmq-stomp-rabbitmq_v1_3_0_branch run

If this is successful, you should end up with “starting STOMP-listeners …done” and “broker running” in your terminal. At this point you can try out the service – for instance, you can run Carl’s test cases if you have ruby and rubygems handy:

sudo apt-get install ruby
sudo apt-get install rubygems
sudo gem install stomp
ruby rabbitmq-stomp-rabbitmq_v1_3_0_branch/priv/tests-ruby/cb-receiver.rb

and in another window

ruby rabbitmq-stomp-rabbitmq_v1_3_0_branch/priv/tests-ruby/cb-sender.rb

It will transfer 10,000 short messages, and end up displaying

...
Test Message number 9998
Test Message number 9999
All Done!

in the receiver-side terminal.

If you’re interested in the gory details of the bug-fix itself, you can see the relevant patch here. The problem was that the code that handled abrupt socket closure wasn’t handshaking with enough of the internals of the server to ensure that the last few work items were being processed successfully. Trapping socket closure in the STOMP adapter code, and politely handshaking, turned out to be all that was required. An alternative workaround would be to use STOMP’s DISCONNECT method before closing the socket on the client side.
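To illustrate that client-side workaround, here is a minimal sketch in Python that speaks raw STOMP over a socket and sends a DISCONNECT frame before closing. The host, port, credentials and destination are assumptions for a default local broker, and the hand-rolled framing is for illustration only, not a substitute for a real STOMP client library:

import socket

def frame(command, headers=None, body=''):
    # A STOMP frame is the command, one header per line, a blank line,
    # the body, and a trailing NUL byte.
    lines = [command] + ['%s:%s' % (k, v) for k, v in (headers or {}).items()]
    return ('\n'.join(lines) + '\n\n' + body + '\x00').encode('utf-8')

# Assumed defaults: STOMP adapter listening on localhost:61613, the guest
# account, and an arbitrary example destination.
sock = socket.create_connection(('localhost', 61613))
sock.sendall(frame('CONNECT', {'login': 'guest', 'passcode': 'guest'}))
sock.recv(4096)  # expect a CONNECTED frame in reply

sock.sendall(frame('SEND', {'destination': '/queue/test'}, 'hello, world'))

# Say goodbye politely instead of just dropping the socket, so the broker
# finishes its outstanding work for this connection before it goes away.
sock.sendall(frame('DISCONNECT'))
sock.close()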


Footnote 1: Note that despite the misleading URL, the snapshot download really is of the STOMP adapter, and not of the broker itself! I’m making use of hgwebdir’s archive-download feature here.

by tonyg on 30/04/08

Abstraction in CSS

I’ve written before (http://www.lshift.net/blog/2006/09/13/managing-css-part-1-factoring), to no acclamation, about the difficulty in factoring CSS. After more talking to and working with people (http://www.48th.co.uk/) who use CSS a lot more than I do (and are commensurately more skillful), I think the difficulty is the level of abstraction: CSS is declarative, but it is not very abstract.

Usually the idea with declarative languages (http://en.wikipedia.org/wiki/Declarative_language) is to describe the desired outcome and let the computer figure out how. CSS only deals with mechanism. It does abstract from the how of layout and rendering, but, I would argue, not in a very useful way: I want to say “make sure this image lines up”, but what I can express is “nudge the image down by ten pixels”.

It is like navigating in a rocket ship by manually controlling the
thrusters, when a computer is perfectly capable of working out the
whole thing ahead of time. (Of course you may want to pilot the
rocket ship manually for fun’s sake – Jef Raskin noted that
lack of expressiveness was what made games fun and most other
human-computer interfaces rubbish.)

However: CSS is what we have. What can we do to make it a better tool?

We can add the ability to express intent. The simplest example is
with constants: if I could write

  @let COLUMN_WIDTH: 200px;

  #foo {
    width: COLUMN_WIDTH;
  }

  #bar {
    margin-left: COLUMN_WIDTH;
  }

it makes it obvious that the margin and the width are deliberately the
same. It also means that a value only needs to be
given in one place – handy for colour schemes.

Once there are symbolic values, it follows naturally to allow expressions in value position. This is useful for layouts that involve margins and widths in some combination: margin-left: COLUMN_ONE_WIDTH + COLUMN_TWO_WIDTH.
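Neither @let nor expressions like this exist in CSS, of course, but translating them into plain CSS is mechanical. As a rough sketch of the idea (the @let syntax is the made-up one above, and only whole-pixel addition is handled), a few lines of Python suffice:

  import re

  def expand_lets(source):
      """Expand the hypothetical @let declarations into plain CSS."""
      constants = {}
      out = []
      for line in source.splitlines():
          m = re.match(r'\s*@let\s+(\w+)\s*:\s*(\S+)\s*;', line)
          if m:
              constants[m.group(1)] = m.group(2)    # record it, emit nothing
              continue
          for name, value in constants.items():     # substitute known names
              line = re.sub(r'\b%s\b' % name, value, line)
          # fold "Npx + Mpx" so simple sums work in value position
          line = re.sub(r'(\d+)px\s*\+\s*(\d+)px',
                        lambda e: '%dpx' % (int(e.group(1)) + int(e.group(2))),
                        line)
          out.append(line)
      return '\n'.join(out)

  print(expand_lets("@let COLUMN_WIDTH: 200px;\n#bar { margin-left: COLUMN_WIDTH + 10px; }"))

For that input it prints #bar { margin-left: 210px; }, which is exactly the plain CSS a browser expects, while the source keeps the intent visible.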

Another way to increase bang for syntactic buck is to add in the
ability to abstract idiom: instead of

  .box .c,
  .box .t,
  .box .b,
  .box .b div {
    background: transparent url(../img/extra-box-bg.gif) no-repeat top right;
  }

  #glass .box .c,
  #glass .box .t,
  #glass .box .b,
  #glass .box .b div {
    background: transparent url(../img/timeline-box-bg.png) no-repeat top right;
  }

  .side .box .c,
  .side .box .t,
  .side .box .b,
  .side .box .b div {
    background: transparent url(../img/side-box-bg.gif) no-repeat top right;
  }

we could have

  @def rounded(SELECTOR, BG) {
    SELECTOR .c,
    SELECTOR .t,
    SELECTOR .b,
    SELECTOR .b div {
      background: transparent url(BG) no-repeat top right;
    }
  }

  rounded(.box, ../img/extra-box-bg.gif);
  rounded(#glass .box, ../img/timeline-box-bg.png);
  rounded(.side .box, ../img/side-box-bg.gif);
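
Again, @def and rounded() are invented syntax; nothing like them exists in CSS. But the expansion back to the verbose rules above is purely mechanical, as this naive Python sketch shows (a real implementation would want a proper parser rather than line-by-line brace counting):

  import re

  def expand_defs(source):
      """Expand hypothetical @def macros into plain CSS by textual substitution."""
      defs = {}                                    # name -> (params, body lines)
      out = []
      lines = iter(source.splitlines())
      for line in lines:
          d = re.match(r'\s*@def\s+(\w+)\(([^)]*)\)\s*{\s*$', line)
          if d:
              params = [p.strip() for p in d.group(2).split(',')]
              body, depth = [], 1
              for body_line in lines:              # collect until braces balance
                  depth += body_line.count('{') - body_line.count('}')
                  if depth == 0:
                      break
                  body.append(body_line)
              defs[d.group(1)] = (params, body)
              continue
          c = re.match(r'\s*(\w+)\(([^)]*)\);\s*$', line)
          if c and c.group(1) in defs:             # a call to a known macro
              params, body = defs[c.group(1)]
              args = [a.strip() for a in c.group(2).split(',')]
              for body_line in body:
                  for param, arg in zip(params, args):
                      body_line = body_line.replace(param, arg)
                  out.append(body_line)
              continue
          out.append(line)
      return '\n'.join(out)

Run over the @def listing above, the three rounded() calls should expand back into the three verbose rule blocks from the previous listing.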

Of course, it is easy enough to write a program to generate CSS
using some general-purpose programming language, or even a
templating language; but I think it will be more fruitful to grow
the established language outwards. Doing so maintains hygiene
– avoiding pitfalls of the “building SQL strings” kind, for
example – and allows for analysis.

This is all wishfulware until someone makes it happen. Since I can’t
much influence the specifications or browser implementations, I’m
working on a compiler that targets plain-old CSS. Elsewhere, there is a
proposal for adding constants to CSS, and a proposed mechanism for
rule reuse (http://alex.dojotoolkit.org/?p=625).

by mikeb
Choosing a new version control system

(Continued from Moving away from CVS)

The wealth of options for a replacement for CVS presents us with a problem. We can’t choose a version control system by comparing feature lists: what seems perverse when presented in the manual may become natural in real use (which is the reaction many have to CVS’s “merge-don’t-lock” way of working at first), and contrarily what seems attractive on paper may prove problematic in real use (the system may claim sophisticated merging, but will it actually do what you want given your version history?). Equally, however, trying to use every system in anger would impose a very serious cost: unless we write the infrastructure for every system we test, some live project will have to do without it while they try out the shiny new system, and for every system someone will have to undergo the considerable expense of really learning how to use it and make it behave well. So we have to find ways to at least thin the candidate list.

We first narrow the list to the six candidates mentioned in the previous post: Subversion, Monotone, darcs, Git, Bazaar, and Mercurial. All of these have a sizeable community behind them and are used by popular projects. This means they have demonstrated themselves fit for purpose, and that there is a community who will provide help if we encounter problems, and code to support integration with other pieces of software. Other candidates may have interesting properties, but to choose them would leave us relatively out on our own; their lack of popularity also increases the risk that they will simply be abandoned after we have invested in them. In particular, this eliminates Codeville, the innovative DVCS designed by BitTorrent inventor Bram Cohen, though there seems little reason to adopt it in any case now that its main selling point, a smart text merging algorithm, has been picked up by Bazaar and could later be supported by some of the other systems if it is found to be usefully superior.

Of the six, the non-distributed Subversion is the first to be thrown out. This isn’t because we expect to benefit greatly from the possibility of disconnected operation, though it may prove useful sometimes; it is because we would like the other features of DVCSes described in the last article, in particular history-aware merging, and the general cleanliness of the underlying model. It’s a difficult decision, because Subversion has by far the best tool support of all of our candidates, including a mature Eclipse plugin; however, this is a decision we need to make based on the long-term future, and we anticipate that if we can pick a system that will remain popular then such support is just a matter of waiting for the tools to catch up.

The remaining five are very hard to choose between; I’ve had a hard time even finding discussion of how to choose one, because most articles focus on how each one is better than CVS or Subversion rather than comparing them to their DVCS peers. All are licensed under the GPL.

Monotone is the oldest of the five remaining candidates, and the first that I took an interest in. It has an attractively clean model of how a DVCS should work, and is in many ways the “most decentralized” of the five, because of the way it handles authentication. In any other DVCS, if I pull from your repository or allow you to push to mine, I am implicitly trusting you as a source of good revisions that I might like to build on. In Monotone, revisions are cryptographically signed, and it is these signatures that decide which revisions I will pay attention to; as a result, Monotone servers exchange not assertions but facts, and you don’t have to go to a particular server to get “authoritative” information on which is the right revision.

However, these signatures represent an unsolved management headache: how do you decide which keys to trust? As things stand, everyone has to update their keyring when a new developer joins the project. In February of last year, I attended a week-long Monotone developers’ summit in San Francisco hosted by Google, and my sole personal goal while there was to find a better solution; I met a great many very very smart people and we had some fascinating discussions around the idea of “policy branches” to solve this problem, but we were never able to agree on exactly how such branches should work and as far as I know the problem is still unsolved.

Experiments with using Monotone internally showed other problems. Monotone repositories have a single global lock, so if, for example, a repository is made available in a web interface, you can’t commit to it at the same time, a problem we were able to work around only with some very nasty hacks using multiple repositories. The same problem makes email notification hooks difficult to write, with the additional constraint that they must be written in an obscure interpreted language called Lua, and if more than one hook is to be run for the same event, the programmer must handle this themselves. Monotone itself is written in an eclectic style of C++ that makes it very hard to hack on or even understand what is happening internally. Finally, Monotone tends to be slow in normal use. Overall, we didn’t find working with Monotone to be an enjoyable experience, and we started looking at other candidates.

darcs has its supporters in this office. It’s written in Haskell, the statically typed pure functional programming language which had a place on our “Language du jour” whiteboard for much more than a day. It has by far the best support for “cherry-picking” (pulling in a change to a branch without pulling in all the changes that led to it) thanks to its “algebra of patches” that underlies its operation. However, this model is also what puts me off about it: it is very hard for darcs to cleanly support binary files, for example, because they aren’t well expressed by patches, and patches underlie every part of darcs including the storage and network formats; the other DVCSs have binary storage and network formats and consider the line-oriented nature of files only at merge time. To embed the assumption that all files are line-oriented text files so deeply into the architecture of a DVCS seems to me like a wrong turn that it would be very hard to back out of, so I kept looking.

That leaves three: Git, Bazaar, and Mercurial. All three date from around 2005, when Larry McVoy withdrew the limited license grant on his proprietary BitKeeper DVCS and the Linux kernel had to find a replacement in a hurry, a disaster for kernel development that vividly demonstrated the short-sightedness of Linus’s policy of trying to pretend that software licences don’t matter. All three have been chosen by major projects: Git is used most famously by the Linux kernel, Bazaar by Ubuntu’s Launchpad development centre, and Mercurial by the Java and Mozilla projects. A full evaluation of all three would be a fantastically costly exercise, so we had to use more superficial characteristics to decide which one to explore next.

Git is Linus’s own creation, started (I’m told) when Linus learned that the lead Monotone developer was on holiday and wasn’t about to start hacking on Monotone to improve its performance until his return. To be sure, Git has very impressive performance, but there are several areas of concern: Git has over a hundred subcommands, betraying a lack of focus in interface design, and Win32 support (essential for us) is poor. In the end I felt I didn’t have faith in Git’s technical direction; I got the feeling that it was too wedded to a worse-is-better philosophy in which performance is more important than a clean model. To us this meant that it would take reports of crippling performance problems from the other systems before we’d reassess Git.

The choice between Bazaar and Mercurial was in some ways the most arbitrary. Both are written in Python, and both have a strong supporting community with lots of extensions – these two facts are not unrelated, as the choice of Python as implementation language lowers the barriers to getting involved. Each has a comparison page about the other, cross-linked, indicating their relative strengths, and updated as each draws features and ideas from the other or shoots ahead in an area where it was formerly behind. There have even been joint Bazaar/Mercurial summit meetings hosted by Canonical, which resulted not in either project subsuming the other but in a rapid cross-fertilization of ideas. In the end I chose based on my feel for which had the clearest architectural vision, and on the choices other projects have made, in particular projects which I felt would be likely to choose well, such as Java and Coyotos; other LShift developers agreed, and the choice was Mercurial.

Since then we’ve used Mercurial in anger for several projects, and done quite a bit of infrastructure work, integrating Mercurial with other tools that we use and otherwise making it more useful to us. So how’s it been working out for us? We’ll cover that in Part Three…

by Paul Crowley

Moving away from CVS

When LShift first started off in 2000, the only real option for mature, open source version control was CVS. We’ve used CVS for most of our projects since then, and gone on to develop a strong infrastructure for managing CVS-backed projects, including a web interface for viewing versions, a web-based searchable database for related CVS commits (“CVSzilla”) which infers transactions from multiple simultaneous commits, and integration with the Bugzilla bug tracker.

Today, there are many other options, and I’ll discuss six major alternatives here: Subversion, Monotone, darcs, Git, Bazaar, and Mercurial. They all aim to do better than CVS in a variety of ways; these include:

- Entire tree versioning: a version consists of a snapshot of an entire source tree, and a single change may affect many files. This is something our CVSzilla tool simulates for CVS, but it’s built in to modern systems, and it makes it easy to ask questions like “what was the most recent change to the source?”.

- Support for renames: you can rename a file without losing the connection with the history of the file before the rename. This is made conceptually much simpler by entire tree versioning.

- Cheap branching: creating branches is a low-cost operation even for large source trees. Not all version control systems offer this; I once worked with one where branching was so expensive that the downtime for creating a branch had to be a part of the project schedule.

- Explicit merging: the system can create a version which is the merge of two branches, including the changes in both and deferring conflicts to the user, and mark it as such in the version metadata.

- General removal of cruft: CVS is now over 20 years old and was one of the first open source systems of its kind, and the experience of two decades allows many opportunities to streamline and modernise.

These advantages hold for all of the major alternatives to CVS, and in particular they hold for Subversion, the oldest and one of the most popular. Subversion aims to be a “better CVS” and its CVS lineage is clearly shown in the way it thinks. In particular, it is based around a centralized development model – when you want to use the version control system, you connect to the central version control server. This once seemed like the only obvious way such a thing could work, but the future of version control is taking a very different direction.

In a distributed version control system (DVCS), a developer may have a local copy not only of their current version of the sources, but of the entire version database (the “repository”), and this local copy supports not only examining the history but also adding to it. When you wish to share your changes with others, you connect to a remote repository and push your changes to it, and you can similarly pull other people’s changes into yours. Why is this useful?

- Speed: everything except the push/pull operations is local, and with no network latency or bandwidth issues to contend with, operations can be much faster.

- Disconnected operation: you can do most version control operations while disconnected from the network, such as during a flight. This is one of the main original motivations of distributed version control, though today the typical developer spends so little time disconnected (even flights are getting wired now) that for many this isn’t the compelling advantage it once was.

- Open source branching: if I want to create a branch of an open source project hosted using a DVCS, I don’t have to either persuade the lead developers to give me commit access to their project or aggressively “fork” the project: I can create my own public repository which includes both their changes and mine, and any developers that are interested can pull from both of us.

A DVCS has to have excellent support for branching and merging. This is because in a distributed system, if you and I both check out the same repository and check in changes locally, there is no way to ensure that one of our versions is a successor of the other; we will have created a fork in the version tree. If it’s not easy to re-unite the version history and create a new version that includes both our changes, the project will quickly fall apart. That’s why all five popular DVCSes offer these features:

- A version DAG: a version may have more than one parent, and the metadata explicitly includes the DAG that relates versions to each other.

- History-aware merging: when two versions are to be merged, the history of both and the way they are related is taken into account.

By contrast, if you merge a development branch into the trunk in Subversion and then make further changes on the branch, you will not be able to merge these further changes into the trunk unless you calculate which changes are new and which the trunk has already seen, and merge in by hand only the novel changes. Actually, this may no longer be true; the Subversion developers were aware of this problem when I last checked over a year ago and may now have found a fix.

In a DVCS, history-aware merging must form a part of the design right from the start if the system is to be at all useful. Once you get used to working with a system that supports a version DAG and history-aware merging, all other ways of expressing the problem of version control seem like poor approximations to this more fundamental idea of what version control really is. In particular, this supports new ways of working that allow the VCS to do more work for the developers:

- Commit-first: if you have made a change and so has another developer, you commit your change before you merge with theirs. Merges can go wrong – either the system or the human it asks can make the wrong choices – and having a permanent record of the pre-merge state can be a lifesaver.

- Branch-per-bug development: create a new branch for every bug/feature you are working on, and merge with the trunk at the last minute. This means that if work on some features is complete while others are half-done, a new version can be produced that includes only the completed features.

This last feature in particular offers an invaluable boost to our agile way of working, making it far more likely that we can produce a working version of the software at the end of the timebox, even if development on some nice-to-have features is still incomplete and would, if merged, leave the software in a broken state.
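
To make that concrete, here is roughly what branch-per-bug (combined with commit-first) looks like in Mercurial, one of the candidates listed above; the branch name and commit messages are just examples, and the other DVCSes can express the same workflow:

hg branch bug-1234                   # start a named branch for this piece of work
                                     # ... edit files, run tests ...
hg commit -m "Fix widget overflow"   # commit on the branch before any merging
hg update default                    # switch back to the main line of development
hg merge bug-1234                    # history-aware merge of just that branch
hg commit -m "Merge bug-1234"
hg push                              # share the result with a remote repository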

This article lays out more than enough reasons to migrate away from CVS to a more advanced system. But such a migration imposes costs: we have to gain experience using the system, and build for it the same infrastructure support we’ve already written for CVS. Unless we want to incur that expense six more times, in order to move away from CVS we must choose which system we will migrate to and use for future projects. That wasn’t an easy decision, as I’ll get into in my next blog post on the subject.

Continued in Part 2: Choosing a new version control system.

by Paul Crowley on 24/04/08
