SpringSource, a division of VMware, Inc., today announced VMware's acquisition of Rabbit Technologies, Ltd, a company set up by LShift and partners Monadic and CohesiveFT.
A worker pool is a very common pattern, and implementations exist in the standard libraries of many languages. The idea is simple: you submit some sort of closure to a service which commits to running that closure, at some point in the future, in some thread. Normally the work is shared out among many different threads and, in the absence of anything fancier, one assumes a first-come-first-served queue of closures.
Erlang, with its lightweight process model, is not a language in which you would expect to need such an approach: processes are dirt cheap, and the scheduler maps processes onto threads when they are ready to run. In many ways, the Erlang VM is a glorified implementation of a worker pool, only one that does pre-emption and other fancy features, much as an OS kernel does. However, we recently found a need for a worker pool in RabbitMQ. Read more…
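To make the pattern concrete, here is a minimal worker pool sketched in Python (purely illustrative; it is not the RabbitMQ implementation): closures go onto a FIFO queue, and a fixed set of threads drains it first-come-first-served.

```python
import queue
import threading

class WorkerPool:
    """A minimal first-come-first-served worker pool."""

    def __init__(self, workers=4):
        self.tasks = queue.Queue()
        self.threads = [threading.Thread(target=self._run, daemon=True)
                        for _ in range(workers)]
        for t in self.threads:
            t.start()

    def _run(self):
        while True:
            task = self.tasks.get()
            if task is None:      # sentinel: shut this worker down
                return
            task()                # run the submitted closure

    def submit(self, closure):
        self.tasks.put(closure)

    def shutdown(self):
        for _ in self.threads:
            self.tasks.put(None)
        for t in self.threads:
            t.join()

# Usage: submit 100 small closures, then wait for them all to run.
results = []
lock = threading.Lock()
pool = WorkerPool(workers=4)
for i in range(100):
    def task(i=i):
        with lock:
            results.append(i)
    pool.submit(task)
pool.shutdown()
print(len(results))
```

The sentinel-per-worker shutdown is the usual trick: each `None` stops exactly one thread, so `shutdown` drains the pool cleanly after all previously submitted work has run.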
People tend to like software to be scalable. This can mean a number of different things, but mostly it means that as you throw more work at the program it may require more resources, in terms of memory or CPU, but it nevertheless just keeps on working. Strangely enough, this is fairly difficult to achieve with finite resources. With memory, the classical hierarchy applies: as you use up more and more of the faster memory, you start to spill to slower memory, ultimately to disk. The assumption tends to be that one always has enough disk space.
Other resources are even more limited, and are harder to manage. One of these is file descriptors. Read more…
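File descriptors make the problem concrete: unlike memory, there is nothing slower to spill to, just a hard per-process cap. As an illustration (in Python, on a Unix-like system), the limits can be inspected, and the soft limit raised, but only up to the hard limit:

```python
import resource

# Query the per-process file descriptor limits.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft, "hard limit:", hard)

# An unprivileged process may raise its soft limit, but only as far as
# the hard limit; beyond that, open() simply fails with EMFILE.
resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

A program that wants to keep working under load therefore has to budget descriptors much as it budgets memory, closing or pooling them rather than assuming more can always be had.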
In several applications, it’s very useful to be able to take messages out of one RabbitMQ broker and insert them into another. Many people on our mailing list have been asking for such a shovel, and we’ve recently been able to devote some time to writing one. It takes the form of a plugin for Rabbit, and whilst it hasn’t been through QA just yet, we’re announcing it now so that people who would like to play with it, or suggest further features for inclusion, can do so sooner rather than later.
The shovel is written on top of the Erlang client. It supports both direct and network connections to nodes, SSL, the ability to declare resources on the nodes it connects to, and basic round-robin balancing across both source and destination nodes; it also allows you to configure many parameters controlling how messages are consumed from the source and how they’re published to the destination. Multiple shovels can be specified, their statuses queried, and shovels will repeatedly reconnect to nodes in the event of failure.
The plugin is available from http://hg.rabbitmq.com/rabbitmq-shovel/, and is released under the MPL v1.1. There is a README included which contains full documentation. This is replicated below. Read more…
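To illustrate the shape of what a shovel does, here is a Python sketch of the core loop (not the plugin’s Erlang code; the “brokers” below are plain lists standing in for connections): sources and destinations are each cycled round-robin, and messages are moved one at a time until the sources are drained or a cap is reached.

```python
import itertools

def shovel(sources, destinations, max_messages):
    """Move messages from round-robin sources to round-robin destinations.
    Each 'broker' is just a list standing in for a queue on a node."""
    src_cycle = itertools.cycle(sources)
    dst_cycle = itertools.cycle(destinations)
    moved = 0
    while moved < max_messages:
        src = next(src_cycle)
        if not src:                      # nothing on this node
            if not any(sources):         # every source is drained: stop
                break
            continue                     # try the next source node
        msg = src.pop(0)                 # consume from the source
        next(dst_cycle).append(msg)      # publish to a destination
        moved += 1
    return moved

# Two source "brokers", two destination "brokers".
srcs = [[1, 2, 3], [4, 5]]
dsts = [[], []]
n = shovel(srcs, dsts, max_messages=10)
print(n, dsts)
```

In the real plugin the interesting work is, of course, in the connection handling (reconnecting to the next node on failure) and in the consume/publish parameters, which this sketch omits entirely.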
Tokyo Cabinet is a rather excellent key-value store, with the ability to write to disk in a sane way (i.e. not just repeatedly dumping the same data over and over again), operate in bounded memory, and go really fast. I like it a lot, and there’s a likelihood that there’ll be a RabbitMQ plugin fairly soon that’ll use Tokyo Cabinet to improve the new persister yet further. Toke is an Erlang linked-in driver that allows you to use Tokyo Cabinet from Erlang. Read more…
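For readers unfamiliar with this style of store, a disk-backed key-value database looks roughly like this (illustrated with Python’s standard-library `dbm`, not Tokyo Cabinet or Toke themselves): the working set lives on disk rather than in memory, and the data outlives the process that wrote it.

```python
import dbm
import os
import tempfile

# Open (and create) an on-disk hash database; reads and writes go
# through the file rather than holding the whole data set in memory.
path = os.path.join(tempfile.mkdtemp(), "store")
db = dbm.open(path, "c")
db[b"rabbit"] = b"mq"
db[b"tokyo"] = b"cabinet"
value = db[b"rabbit"]
db.close()

# Reopen: the data survives, which is the whole point.
db = dbm.open(path, "r")
print(db[b"tokyo"])
db.close()
```

Tokyo Cabinet offers the same get/put/delete surface but with much better write behaviour and speed, which is what makes it attractive for a persister backend.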
Today I was lucky enough to give a talk at the Skills Matter Functional Programming Exchange. I talked about resource management in RabbitMQ and how we’re improving this in upcoming versions of RabbitMQ. All the sessions were videotaped and it would seem that a podcast will be going up shortly. In the meantime you can have a look at the slides if you want to.
The attendance was really good and the talks well received. There was a good range of talks, from the very practical and pragmatic, such as my own, to the slightly more theoretical. It was great to see Haskell, Erlang and F# being discussed outside of a purely academic setting, and great to see so many companies and organisations getting really interested in functional programming and coming along to see how other people were making the most of it.
The Park Bench session was also good fun, with a good range of questions and experience being demonstrated by all. A good, fun atmosphere, and I’m sure all enjoyed the day.
The new persister that is being developed for RabbitMQ is nearing completion and is currently working its way through code review and QA. It’s being pretty thoroughly tested and generally stressed to see what could go wrong. One of the issues that we’ve come across in the past has to do with Erlang’s garbage collector: indeed there’s code in at least one area of RabbitMQ written in a specific (and non-obvious) way in order to work around issues with Erlang’s garbage collection of binary data.
We had noticed that the release notes for Erlang R13B03 mention improvements to the garbage collector, and today, testing with both R13B02 and R13B03, we saw substantial improvements with R13B03. The new persister is able to push partial queues out to disk. Thus a queue can hold a mix of messages: some just in RAM, some just on disk, and some somewhere in between. This is separate from whether or not a message is marked persistent. The proportion pushed out to disk varies smoothly with the amount of RAM left available to Erlang: the idea is to avoid flooding the disk with enormous numbers of write requests, which could stall the queue and cause blockages elsewhere in RabbitMQ.
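The smooth variation can be pictured as a simple function of memory pressure (a hypothetical sketch with made-up water marks; the persister’s actual policy is more involved): below a low-water mark nothing is pushed out, above a high-water mark everything is, and in between the fraction ramps linearly.

```python
def fraction_on_disk(ram_used, low_water, high_water):
    """Fraction of queue contents to push to disk, ramping smoothly
    from 0 (plenty of RAM) to 1 (RAM exhausted). Purely illustrative."""
    if ram_used <= low_water:
        return 0.0
    if ram_used >= high_water:
        return 1.0
    return (ram_used - low_water) / (high_water - low_water)

print(fraction_on_disk(50, 100, 200))   # under the low-water mark
print(fraction_on_disk(150, 100, 200))  # half way up the ramp
print(fraction_on_disk(250, 100, 200))  # over the high-water mark
```

The point of the ramp, as opposed to a single threshold, is that writes trickle out gradually instead of arriving at the disk in one enormous burst.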
The test I’d written used the Erlang experimental client. It had one channel; it created a queue, consumed from the queue, set the QoS prefetch count to 10, and then went into a loop. In each iteration it would publish two 1KB messages, then receive one message and acknowledge it. This way the queue always grows, and memory becomes fairly fragmented (the gap from the head of the queue to the tail increases steadily, with one end advancing at twice the rate of the other). With no memory limit, I saw the following (I manually killed this after the queue grew to just over 350,000 messages, i.e. 700,000 publishes and 350,000 acknowledgements):
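The loop can be sketched like this (in Python, with a plain list standing in for the broker’s queue, and scaled down to 10,000 iterations; the real test drove a RabbitMQ broker through the Erlang experimental client):

```python
def run_test(iterations):
    """Publish two 1KB messages, then receive and ack one, per iteration:
    the queue grows by exactly one message each time round the loop."""
    queue, published, acked = [], 0, 0
    for _ in range(iterations):
        for _ in range(2):              # publish two 1KB messages
            queue.append(b"x" * 1024)
            published += 1
        queue.pop(0)                    # receive one message...
        acked += 1                      # ...and acknowledge it
    return len(queue), published, acked

depth, published, acked = run_test(10_000)
print(depth, published, acked)
```

After N iterations the queue is N messages deep, with 2N publishes and N acknowledgements, which is why 350,000 messages in the queue corresponds to 700,000 publishes.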
Note that for R13B03 the garbage collector is much more active, and in general memory usage is certainly more fine-grained. In this test all the messages stayed in RAM; none were pushed out to disk. Flat size refers to the value returned by passing the queue state through erts_debug:flat_size/1, which reports the amount of memory used by the data structure.
Next, I imposed a limit of about 200MB and ran the same test. With R13B02, it got stuck after just over 260,000 messages: it was no longer able to reclaim any further space, so flow control kicked in and stopped the publisher. Game over. With R13B03 it soldiered merrily on; I ended up manually killing it somewhere past the 1 million message mark as I was getting bored. It’s also very clear how, with R13B03, it successfully kicks down to pushing all the messages out to disk (which is why the size of the state suddenly gets very small; the memory growth from there on is due to an ets table). That is still possible with R13B02, and I have seen it happen, but there is a much greater risk, as seen here, of getting stuck before it does.
In short, the garbage collector in R13B03 seems a solid improvement. Even if you’re not using the experimental new persister, I suspect you’ll gain from upgrading to R13B03. And yes, that really is 1 million 1KB messages successfully sent into a queue using under 200MB of RAM.
I have recently been modifying the WireIt code to allow collapsing multiple containers down into a single composite container.
A quick summary of WireIt (from their site):
I got started on this when I was following on from Jonathan Lister’s work on using WireIt to create a non-technical user interface for Rabbit Streams. It soon became apparent that if you wanted to use this in practice you would end up with huge graphs, with many containers and many wires going all over the place. Once above about 10 containers, with a couple of wires each, it gets hard to see what is happening. To solve this, some kind of abstraction is needed; in this case, collapsing multiple containers down into a single composite container.
A collapsed group of containers essentially boils down to a single container representing a subgraph. So I started by looking at the JSON object produced when you save a complete WireIt graph, and at how that JSON is expanded back into the complete graph on loading. Once these were identified, it was relatively easy to implement the same functions for just a subset of the whole graph.
The next challenge was maintaining the wires across a collapse/expand. The first step was to show the terminals from grouped containers on the composite container. Naming conflicts had to be handled (in the case of duplicate terminal names on different containers), which led to the creation of a map from external terminals to internal containers and their terminals. With this map I could iterate over the wires attached to terminals in the group and create copies of those wires going to the external terminals. Then, on expansion, any wires connected to the composite container were mapped back to their counterparts. The same was done for fields.
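The terminal-mapping step can be sketched as follows (Python pseudocode rather than the WireIt JavaScript, with made-up data structures): duplicate terminal names are disambiguated by prefixing the container name, a map records where each external terminal came from, and wires are rewritten on collapse and restored on expand.

```python
def collapse(containers, wires, group):
    """Collapse the containers in `group` into one composite container.
    containers: {name: [terminal, ...]}
    wires: set of ((container, terminal), (container, terminal)) pairs."""
    # Map each disambiguated external terminal back to its origin.
    terminal_map = {}
    for cont in group:
        for term in containers[cont]:
            terminal_map[f"{cont}.{term}"] = (cont, term)

    def rewrite(end):
        cont, term = end
        return ("composite", f"{cont}.{term}") if cont in group else end

    new_wires = {(rewrite(a), rewrite(b)) for a, b in wires}
    return terminal_map, new_wires

def expand(terminal_map, wires):
    """Map wires attached to the composite back to internal terminals."""
    def restore(end):
        cont, term = end
        return terminal_map[term] if cont == "composite" else end
    return {(restore(a), restore(b)) for a, b in wires}

containers = {"A": ["out"], "B": ["out"], "C": ["in"]}
wires = {(("A", "out"), ("C", "in")), (("B", "out"), ("C", "in"))}
tmap, collapsed = collapse(containers, wires, group={"A", "B"})
print(collapsed)
print(expand(tmap, collapsed) == wires)
```

The round-trip property, expand(collapse(g)) == g, is exactly what makes the composite safe to use as an abstraction: no wiring information is lost.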
In combination with the ability to manually select which fields and terminals are visible, and to set their names manually, this makes it easy to represent encapsulation.