technology from back to front

PubSub-over-Webhooks with RabbitHub

RabbitHub is our implementation of PubSubHubBub, a straightforward pubsub layer on top of plain old HTTP POST — pubsub over Webhooks. It’s not well documented yet (understatement), but that will change.

It gives every AMQP exchange and queue hosted by a RabbitMQ broker a couple of URLs: one to use for delivering messages to the exchange or queue, and one to use to subscribe to messages forwarded on by the exchange or queue. You subscribe with a callback URL, so when messages arrive, RabbitHub POSTs them on to your callback. For example,

(The symmetrical …/subscribe/x/… and …/endpoint/q/… also exist.)

The PubSubHubBub protocol specifies some RESTful(ish) operations for establishing subscriptions between message sources (a.k.a “topics”) and message sinks. RabbitHub implements these operations as well as a few more for RESTfully creating and deleting exchanges and queues.
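As a rough illustration of what a subscription request might look like, here is a minimal Python sketch. The hub.mode, hub.callback, hub.topic and hub.verify parameter names come from the PubSubHubBub specification; the helper names (rabbithub_url, subscription_form) are hypothetical, the /subscribe/ path layout is inferred by symmetry from the endpoint URL quoted in this post, and which parameters RabbitHub itself requires is an assumption rather than something the post states.

```python
from urllib.parse import quote, urlencode

# Base URL of the public instance mentioned in this post (assumed still valid).
BASE = "http://dev.rabbitmq.com/rabbithub"


def rabbithub_url(facet, kind, name, base=BASE):
    """Build a RabbitHub-style URL (hypothetical helper).

    facet: "endpoint" (deliver messages) or "subscribe" (receive them)
    kind:  "x" for an exchange, "q" for a queue
    """
    if facet not in ("endpoint", "subscribe"):
        raise ValueError("facet must be 'endpoint' or 'subscribe'")
    if kind not in ("x", "q"):
        raise ValueError("kind must be 'x' or 'q'")
    return f"{base}/{facet}/{kind}/{quote(name, safe='')}"


def subscription_form(callback, topic, mode="subscribe", verify="sync"):
    """Form-encode the standard PubSubHubBub subscription parameters."""
    return urlencode({
        "hub.mode": mode,          # "subscribe" or "unsubscribe"
        "hub.callback": callback,  # where the hub should POST messages
        "hub.topic": topic,        # the message source being subscribed to
        "hub.verify": verify,      # "sync" or "async" verification
    })


if __name__ == "__main__":
    # The body would be POSTed to the hub as application/x-www-form-urlencoded.
    print(rabbithub_url("subscribe", "q", "myqueue"))
    print(subscription_form("http://example.org/callback", "some.topic"))
```

The sketch only builds the URL and the request body; actually sending the POST, and handling the hub's verification request to the callback, is left out.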

Combining RabbitHub with the AMQP protocol implemented by RabbitMQ itself and with the other adapters and gateways that form part of the RabbitMQ universe lets you send messages across different kinds of message networks — for example, our public RabbitMQ instance, dev.rabbitmq.com, has RabbitHub running as well as the standard AMQP adapter, the rabbitmq-xmpp plugin, and a bunch of our other experimental stuff, so you can do things like this:

RabbitHub example configuration

  • become XMPP friends with pshb@dev.rabbitmq.com (the XMPP adapter gives each exchange a JID of its own)

  • use PubSubHubBub to subscribe the sink http://dev.rabbitmq.com/rabbithub/endpoint/x/pshb to some PubSubHubBub source — perhaps one on the public Google PSHB instance. (Note how the given URL ends in “x/pshb”, meaning the “pshb” exchange — which lines up with the JID we just became XMPP friends with.)

  • wait for changes to be signalled by Google’s PSHB hub to RabbitHub

  • when they are, you get an XMPP IM from pshb@dev.rabbitmq.com with the Atom XML that the hub sent out as the body

RabbitHub is content-agnostic — you don’t have to send Atom around — so the fact that Atom appears is an artifact of what Google’s public PSHB instance is mailing out, rather than anything intrinsic in pubsub-over-webhooks.

We’ve also been experimenting with using http://www.reversehttp.net/ to run a PubSubHubBub endpoint in a webpage — see for instance http://www.reversehttp.net/demos/endpoint.html and its associated Javascript for a simple prototype of the idea. I’m playing with building a simple PSHB hub in Javascript using the same tools.

by tonyg on 30/06/09

ICFP Contest 2009

What is fast becoming a regular fixture in my diary is my entry with a few friends into the ICFP Programming Contest each year. This is a three day programming competition in which you can write in any language to solve the problems given. The competition is still in progress, though my team’s decided to stop — we’ve had enough fun for one year, and there’s only so far you can go with very little sleep and shockingly poor maths — but we’ve done much better than last year, learning from our previous mistakes… Read more…

by matthew on 29/06/09

Python Queue interface for AMQP

Here at LShift we’re often discussing RabbitMQ. We’re keen on complicated deployment scenarios, redundancy of the broker and other complex use cases. While these problems are extremely interesting, some believe they are irrelevant to the great majority of RabbitMQ users.

People keep asking how to get started with Rabbit. There are some very good sources; however, understanding the AMQP abstractions takes some time.

With that in mind, I was astonished when I saw this code, where Brian wraps AMQP code with a very simple Queue-like interface. This reminded me that messaging can be trivial and intuitive. In some environments, a queue is exactly what you need from messaging.

Read more…

by marek on 11/06/09

Memcached protocol is not enough

A few months ago I was wondering whether it’s feasible to build a scalable realtime search engine using a shared-nothing architecture. One of the essential project decisions I need to make is choosing a decent protocol for communicating with the storage nodes. Recently, the memcached protocol has been becoming a standard key-value protocol. It’s not only used by the memcached cache server, but has also been adopted by persistent key-value databases like Tokyo-Tyrant, LightCloud or MemcacheDB. However, there are several things that make this protocol a very bad choice for a persistent database.
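For readers who haven’t seen it, the memcached text protocol is simple enough to encode by hand. The sketch below shows only how the standard "set" and "get" commands are laid out on the wire; the function names are hypothetical, and it does not attempt to illustrate the specific objections raised in the full post.

```python
def _check_key(key: str) -> None:
    # Memcached keys are limited to 250 characters and may not contain
    # whitespace or control characters.
    if not key or len(key) > 250 or any(c.isspace() or ord(c) < 32 for c in key):
        raise ValueError("invalid memcached key")


def encode_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a storage command: set <key> <flags> <exptime> <bytes>, then the data."""
    _check_key(key)
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def encode_get(key: str) -> bytes:
    """Encode a retrieval command: get <key>."""
    _check_key(key)
    return f"get {key}\r\n".encode("ascii")


if __name__ == "__main__":
    print(encode_set("greeting", b"hello"))
    print(encode_get("greeting"))
```

Note that the exptime field is part of every storage command, which reflects the protocol’s origins as a cache rather than a database.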

Read more…

by marek on 21/05/09

Untangling the BBC’s data feeds

Recently, Alan Ogilvie from A&Mi at the BBC announced that they were developing a “Feeds Hub”, and outlined their ambitions for it.

He also mentioned LShift, RabbitMQ and open source, and I would like to explain, from our point of view, what this project is and how we’re working with the BBC.

What is a “Feeds Hub”?

Alan describes the central problem they want to solve:

The number of new projects across the BBC starting to use feeds in creative ways is growing very quickly - just think of spaghetti… on a massive scale. So what do we do? What are the options? We could go down the route of gathering together a centralised ‘Feed Usage’ committee with members across the BBC, to ‘federate’ feeds so that they are all produced in the same way but, in practice, this never truly works and is likely to stifle creativity. Often it is quite difficult to convince people to work together when they have already experienced the freedom of doing what they want - often they are concerned that their projects will be delayed. Not all feeds sources that we use or want to use are under our control, things like Twitter, Flickr, blogs, etc. Federation will never solve all our problems anyway - for example, it can’t help when a source feed is turned off, it doesn’t monitor failures.

The idea, then, is to bring the spaghetti under control; not by mandating that things be done a certain way, but by overlaying a bunch of management and monitoring tools that would otherwise be ad hoc or not exist at all.

We also want to enable people to discover, reuse and adapt existing feeds, rather than reinvent them. Again, not by enforcement, but by making it easier to do so than to not.

And we’re not just talking about RSS — there are (at the BBC and in general) many different protocols and formats flying about.

Technically speaking, this adds up to a couple of pieces of kit: a platform for relaying feeds through, supporting routing, transformation and distribution by a number of different means; and a user interface for discovering, creating, managing and monitoring these feeds.

How are LShift involved?

In short: LShift are developing the core technology, helping the BBC shepherd the various strands of the project along, and helping engage with developers to build the open source aspect of the project (about which more in a bit).

LShift are the progenitors of RabbitMQ, a message broker implementing AMQP. Over the last few years we’ve been thinking about and experimenting with different applications of messaging (and not just AMQP); for example, Rabbiter, which puts a Twitter-like spin on XMPP.

In the meantime, RabbitMQ itself has gained client libraries, gateways, adapters, and a smart, active community, to the point where it’s no longer just an AMQP message broker — it’s becoming more like a universal messaging adapter.

So we were very enthused when we heard that the BBC wanted a feeds hub, because it seemed to bring together lots of what we’d been thinking about abstractly, as well as new ideas and problems to solve, and give it all a concrete purpose.

When and how will it be open source?

We’re working on a prototype, and our plan is to make the source public as soon as it’s fit for consumption. We hope this will be in the next month.

In the meantime, I may talk about some of the core technical ideas, and our plans, here on our blog; and, of course, you can follow LShift on Twitter and the Radiolabs blog.

by mikeb on 08/05/09

Cranial Surgery: Giving Rabbit more Memory

Many users of Rabbit have been asking us about how Rabbit copes with many large messages in queues, to the extent that the total size of these messages exhausts the available physical memory (RAM). As things stand at the moment, the answer is not very well. Although we have a persistence mechanism, that is not quite an answer either because whilst it does ensure that messages are written to disk, it does not remove messages from RAM. So, we’ve been looking at writing a disk-based queue so that should RAM become tight, we can start to push messages out to disk and collect them later from there.

However, there is this thing called swap, and it seems wise to test how Rabbit copes when we just allow it to expand into swap. Read more…

by matthew on 02/04/09

LShift FPGA club

A bunch of us at LShift recently discovered a shared interest in FPGAs. These devices are reconfigurable hardware: chips that can be programmed to act like any arrangement of digital logic gates, including designs as large as general purpose processors. High-end FPGAs still cost thousands of dollars, but low-end development boards are relatively affordable, and due to the march of Moore’s law, they are now capable enough for potentially interesting applications. This means that the same kind of economic structure that allows open source software development to thrive now applies to the world of hardware design. Students, enthusiasts and professional engineers publish their projects on the Internet, and communities have formed around sites such as OpenCores.

Obstacles remain for those from a software background who wish to learn about FPGAs. Many of these are cultural: There is an overlap between the concepts used in hardware design and those of software construction, but often those superficial similarities conceal significant differences in the basic approach. And some areas that are considered basic knowledge for electronic engineers will be unfamiliar to most software developers.

But there’s nothing we like more than jumping into a previously unfamiliar area. So we have acquired an FPGA development board to play with. This board has a Xilinx FPGA, together with other bits and pieces including a 32MB DRAM chip, various Flash chips, an Ethernet port, a couple of serial ports, and a small LCD display. One of the attractions of this board is that Xilinx makes a full version of their ISE development environment free to download, including the Linux version. It’s not open source, but in the world of FPGA development environments, this is as good as it currently gets; FPGA vendors consider some details of their devices to be confidential, and so steps in the FPGA programming process necessarily involve proprietary tools.

We are still finding our feet with VHDL and Verilog, and understanding what is and isn’t feasible in the scope of an evenings-and-weekends hardware design project. The Xilinx development environment is not as polished or easy to get started with as modern software development environments. But we have already made some modest progress:

by David Wragg on 18/03/09

Yahoo doesn’t know what an email address is

Many websites refuse to accept email addresses of the form myusername+sometext@gmail.com, despite the fact that the +sometext is perfectly legitimate [1] and is an advertised feature that Gmail offers for creating pseudo-single-use email addresses from a base email address.

My guess is that the developers of these sites think, because they’re either lazy or incompetent, that email addresses have more restrictions than they in fact have. It’s reasonable (and fairly easy) these days to check the syntax of the DNS part of an email address, because few people use non-DNS or non-SMTP transfer methods anymore, but the mailbox part is extremely flexible and hard to check accurately. A sane thing to do is just trust the user, and send a test mail to validate the address.
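A minimal Python sketch of that approach might look like the following. It checks only the syntax of the domain part and accepts any non-empty mailbox part; the function name and the exact rules (at least two DNS labels, the usual 63- and 253-character limits) are assumptions made for illustration, not a complete implementation of the RFCs.

```python
import re

# One DNS label: letters, digits and hyphens, 1 to 63 characters,
# with no leading or trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def plausible_address(address: str) -> bool:
    """Return True if the address is worth sending a test mail to.

    Only the domain part is checked. The mailbox part is deliberately
    left alone, so addresses like user+tag@example.com are accepted.
    """
    # Split at the last "@", since the mailbox part may itself contain one.
    local, sep, domain = address.rpartition("@")
    if not sep or not local:
        return False
    if len(domain) > 253:
        return False
    labels = domain.split(".")
    # Assumption: require at least two labels (e.g. "example.com").
    return len(labels) >= 2 and all(_LABEL.fullmatch(label) for label in labels)


if __name__ == "__main__":
    for candidate in ("myusername+sometext@gmail.com", "nobody", "user@-bad-.com"):
        print(candidate, plausible_address(candidate))
```

Passing this check says nothing about whether the mailbox exists; only the test mail can confirm that.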

I picked on Yahoo in the title of this post: Yahoo are by no means the only offender, but I just signed up for a Yahoo account, so they’re the most recent one for me. Their signup form also gave no guidance about why it was rejecting the submission: I had to use my previous experience of sites wrongly rejecting valid email addresses to guess what the problem might be. Fail.


Footnote 1: According to my best reading of the relevant RFCs, anyway. See the definition of dot-atom in section 3.2.4 of RFC 2822, referenced in this context by section 3.4.1.

by tonyg on 17/03/09

OpenAMQ’s JMS client with RabbitMQ server

OpenAMQ has released their JMS client for using JMS with AMQP-supporting brokers. This afternoon I experimented with getting it running with RabbitMQ.

After a simple, small patch to the JMS client code, to make it work with the AMQP 0-8 spec that RabbitMQ implements (rather than the 0-9 spec that OpenAMQ implements), the basic examples shipped with the JMS client library seemed to work fine. The devil is no doubt in the details, but no problems leapt out at me.

To get it going, I checked it out using Git (git clone git://github.com/pieterh/openamq-jms.git). Compilation was as simple as running ant. Kudos to the OpenAMQ team for making the build process so smooth! (Not to mention writing a great piece of software :-) )

The changes to make it work with AMQP 0-8 were:

  • retrieving the 0-8 specification XML

  • changing the JMS client library’s build.xml file to point to the downloaded file in its generate.spec variable

  • changing one line of code in src/org/openamq/client/AMQSession.java: in 0-8, the final null argument to BasicConsumeBody.createAMQFrame must be omitted

  • re-running the ant build

After this, and creating a /test virtual-host using RabbitMQ’s rabbitmqctl program, the OpenAMQ JMS client examples worked fine, as far as I could tell.

rabbitmqctl add_vhost /test
rabbitmqctl set_permissions -p /test guest '.*' '.*' '.*'

You can download the patch file I applied to try it yourself. Note that you’ll need to put the correct location to your downloaded amqp0-8.xml file into build.xml.

by tonyg on 16/03/09

LShift at QCon!

LShift will be attending QCon London!

Please come over and meet us at stand 20 during the conference, from March 11th to 13th.

I will also be presenting an Etherpad clone at the Skillsmatter stand (booth number 10). This will happen in the break between sessions on Wednesday at 4:45 pm.

by marek on 10/03/09


2000-9 LShift Ltd, 1st Floor Office, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK +44 (0)20 7729 7060