Some time ago we got an interesting bug report for RabbitMQ. Surprisingly, unlike other complex bugs, this one is easy to describe:
At some point basic.get suddenly starts being very slow - about 9 times slower!
In several applications, it’s very useful to be able to take messages out of one RabbitMQ broker, and insert them into another. Many people on our mailing list have been asking for such a shovel, and we’ve recently been able to devote some time to writing one. This takes the form of a plugin for Rabbit, and whilst it hasn’t been through QA just yet, we’re announcing it so people who would like to play and even suggest further features for inclusion can do so sooner rather than later.
The shovel is written on top of the Erlang client. It supports both direct and network connections to nodes, SSL, the ability to declare resources on the nodes it connects to, and basic round-robin balancing of both source and destination nodes. It also lets you configure many parameters controlling how messages are consumed from the source, and how they’re published to the destination. Multiple shovels can be specified, their statuses queried, and shovels can repeatedly reconnect to nodes in the event of failure.
The plugin is available from http://hg.rabbitmq.com/rabbitmq-shovel/, and is released under the MPL v1.1. There is a README included which contains full documentation. This is replicated below.
An obvious extension point for an AMQP broker is the addition of new types of exchange. An exchange type essentially represents an algorithm for dispatching messages to queues, usually based on the message’s routing key, given how the queues are bound to the exchange — it’s a message routing algorithm.
At a minimum, supporting new exchange types requires only some scaffolding to plug in to (an exchange type registry) and a hook for routing messages. However, this wouldn’t support some more interesting use cases, and in particular it didn’t support our motivating use case. Exchange types that want to keep their own state need to be initialised, and be notified about other lifecycle events.
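As a rough illustration of the pieces described above, here is a minimal Python sketch of an exchange type registry, a routing hook, and lifecycle callbacks. None of these names come from RabbitMQ: `register_exchange_type`, `ExchangeType`, `on_create`, `on_bind`, `on_unbind`, `route` and `declare_exchange` are all hypothetical, and the real plugin interface is written in Erlang.

```python
# Conceptual sketch only: every name here is hypothetical, not RabbitMQ's API.

REGISTRY = {}  # exchange type name -> class implementing it


def register_exchange_type(name):
    """Class decorator that adds an exchange type to the registry."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


class ExchangeType:
    """Base class: lifecycle callbacks plus the routing hook."""

    def on_create(self):
        # Called once when the exchange is declared; sets up private state.
        self.bindings = []  # list of (binding_key, queue_name)

    def on_bind(self, binding_key, queue_name):
        self.bindings.append((binding_key, queue_name))

    def on_unbind(self, binding_key, queue_name):
        self.bindings.remove((binding_key, queue_name))

    def route(self, routing_key):
        raise NotImplementedError


@register_exchange_type("direct")
class DirectExchange(ExchangeType):
    def route(self, routing_key):
        # Deliver to queues whose binding key equals the routing key.
        return [q for key, q in self.bindings if key == routing_key]


@register_exchange_type("fanout")
class FanoutExchange(ExchangeType):
    def route(self, routing_key):
        # Ignore the routing key and deliver to every bound queue.
        return [q for _, q in self.bindings]


def declare_exchange(type_name):
    """Look up the type in the registry and run its creation callback."""
    exchange = REGISTRY[type_name]()
    exchange.on_create()
    return exchange


direct = declare_exchange("direct")
direct.on_bind("orders", "q1")
direct.on_bind("invoices", "q2")
print(direct.route("orders"))  # ['q1']
```

The registry and `route` alone cover the minimal case; the `on_*` callbacks are what let an exchange type keep its own state.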
Tokyo Cabinet is a rather excellent key-value store, with the ability to write to disk in a sane way (i.e. not just repeatedly dumping the same data over and over again), operate in bounded memory, and go really fast. I like it a lot, and there’s a likelihood that there’ll be a RabbitMQ plugin fairly soon that’ll use Tokyo Cabinet to improve the new persister yet further. Toke is an Erlang linked-in driver that allows you to use Tokyo Cabinet from Erlang.
Today I was lucky enough to give a talk at the Skills Matter Functional Programming Exchange. I talked about resource management in RabbitMQ and how we’re improving this in upcoming versions of RabbitMQ. All the sessions were videotaped and it would seem that a podcast will be going up shortly. In the meantime you can have a look at the slides if you want to.
The attendance was really good and the talks well received. There was a good range of talks, from some very practical and pragmatic such as my own, to slightly more theoretical talks. It was great to see Haskell, Erlang and F# being discussed outside of a purely academic setting and great to see so many companies and organisations getting really interested in functional programming and coming along to see how other people were making the most of it.
The Park Bench session was also good fun, with a good range of questions and experience being demonstrated by all. A good, fun atmosphere, and I’m sure all enjoyed the day.
mercurial-server gives your developers remote read/write access to centralized Mercurial repositories using SSH public key authentication; it provides convenient and fine-grained key management and access control.
If you’ve never programmed a computer, you should. There’s nothing like it in the whole world. When you program a computer, it does exactly what you tell it to do. It’s like designing a machine — any machine, like a car, like a faucet, like a gas-hinge for a door — using math and instructions. It’s awesome in the truest sense: it can fill you with awe.
–Cory Doctorow, Little Brother
A computer is the most complicated machine you’ll ever use. It’s made of billions of micro-miniaturized transistors that can be configured to run any program you can imagine. But when you sit down at the keyboard and write a line of code, those transistors do what you tell them to.
Most of us will never build a car. Pretty much none of us will ever create an aviation system. Design a building. Lay out a city.
Those are complicated machines, those things, and they’re off-limits to the likes of you and me. But a computer is like, ten times more complicated, and it will dance to any tune you play. You can learn to write simple code in an afternoon. Start with a language like Python, which was written to give non-programmers an easier way to make the machine dance to their tune. Even if you only write code for one day, one afternoon, you have to do it. Computers can control you or they can lighten your work — if you want to be in charge of your machines, you have to learn to write code.
Partly inspired by that paragraph, this summer’s craze for me is teaching my friends to program; so far five different people have let me start them off on the basics. I don’t necessarily hope to make professional programmers or dedicated hobbyists out of them all; I have a couple of more modest goals:
I started when a friend who is doing a creative writing degree said something like “But don’t you have to learn a lot of theory before you can start to program?” I grabbed the laptop, opened an editor window and typed
print "Hello world"
“That is a complete computer program” I said. “Can you guess what it does?”
From there it was easy to move on to a succession of slightly more complicated programs, demonstrating each of the major building blocks of programming languages in turn. I’m not sure she’s going to ask for a second lesson, but I think she left the first with the feeling that what I do for a living, or where the software she uses comes from, is that bit less mysterious.
However, programs that print things on a terminal aren’t necessarily the most fun or engaging things to play with - I tried relying on that for a second, more hands-on lesson with another friend, and I think the fun vs frustration ratio left something to be desired. The usual solution to this is turtle graphics as pioneered in the programming language Logo, but I found it hard to think of any interesting programs to write with the turtle - it seems as though to get a lot out of it, you have to really like geometry, which few of us do.
Much more appealing is the grid-based world of Karel the robot. While the turtle wanders around the barren infinite plane with only its own lines for company, Karel’s world includes impassable walls aligned with the grid lines and objects called “beepers” in the grid squares that the robot can detect, pick up and put down. This makes it natural to write programs that do more than follow a predetermined sequence of steps - programs that actually discover the state of the world they find themselves in and adjust how they act appropriately.
These ideas have been picked up in other tools, most notably the teaching environment Guido van Robot. However, like Karel before them, these environments teach their own custom language: once a student is ready for greater things, they must start learning a new programming language. I’m no pro at these things, but I’m inclined to see this as a waste; the syntax of Python is easily simple enough for a beginner to start with right away, and I think it’s a powerful selling point and confidence booster for the student if you can honestly say that the language that they are now learning is used by choice at Google and all over the world by real programmers with real jobs to do.
So this finally led me to my favourite tool so far: RUR-PLE.
RUR-PLE is exactly this synthesis: it provides the maze-like environment of Karel and Guido van Robot, with the syntax of the real Python programming language. RUR-PLE ships with a range of pre-designed mazes for various programming challenges that introduce the key ideas of the Python programming language in a fun way, as well as a complete HTML manual for learning programming.
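To give a flavour of the kind of program this encourages, here is a small sketch in plain Python. The command names follow the Karel tradition (`move`, `front_is_clear`, `on_beeper`, `pick_beeper`), but the `World` class is a hypothetical stand-in written only so the example runs on its own; it is not taken from RUR-PLE's code, and a real lesson would use the environment's own robot.

```python
# A tiny stand-in for a Karel-style world so the program runs on its own.
# The World class and its layout are hypothetical, not RUR-PLE's internals.

class World:
    def __init__(self, length, beepers):
        self.length = length          # corridor of squares 0 .. length-1
        self.beepers = set(beepers)   # squares that hold a beeper
        self.position = 0             # robot starts at the west end
        self.carried = 0              # beepers picked up so far

    def front_is_clear(self):
        # A wall closes off the east end of the corridor.
        return self.position < self.length - 1

    def move(self):
        if not self.front_is_clear():
            raise RuntimeError("robot hit a wall")
        self.position += 1

    def on_beeper(self):
        return self.position in self.beepers

    def pick_beeper(self):
        self.beepers.remove(self.position)
        self.carried += 1


def collect_all(world):
    """Walk east until the wall, picking up any beeper found on the way."""
    while True:
        if world.on_beeper():
            world.pick_beeper()
        if not world.front_is_clear():
            break
        world.move()


world = World(length=6, beepers=[1, 4, 5])
collect_all(world)
print(world.carried, world.position)  # 3 5
```

The program never hard-codes where the beepers or the wall are: it asks the world at each step, which is exactly what a fixed sequence of turtle commands cannot do.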
And it really works - people found it intuitive and rewarding to move the robot, and it was easy to come up with maze-based challenges that illustrated programming concepts. The simple integrated debugger meant that whenever it did something unexpected, it was easy to take it through step by step to see why it was doing what it did, and the robot graphics provided satisfying instant results. I knew I was doing something right when one friend practically ran back to the computer after dinner to finish debugging her maze solving application.
However, there are a few problems with RUR-PLE. The user interface is in many ways unnecessarily complicated and unintuitive. The program provides four “tabs” with four different functions that to me would be better provided as four separate applications, especially since one is a browser window with a lesson plan where most users would prefer to use the browser they normally use. There is no menu bar, so everything has to be done with buttons on the toolbar. This toolbar is written in a non-standard way that is less attractive than platform standard ways of doing these things. The robot is drawn in elevation even though the maze is seen in plan, which makes life harder for users who are already tripping up on telling their left from their right, and the squares of the grid are at right angles to the walls instead of lined up with them, just one of the unnecessary visual complications in the grid; another complication is that the Python editor is a full folding editor, useful for experienced programmers but another potential source of confusion for beginners. Users are instructed to call a “turn_off()” instruction at the end of every program, a rather unPythonic way of doing things that makes the simplest working program twice as long as it needs to be, and which turns out to be a hangover from its Karel-based roots - as is the rather confusing name of “beepers” for the objects that the robot can pick up and put down.
Since it’s in Python, I thought these problems (and others, such as that you can’t change the speed at which the program runs once it’s running) would be easy to fix, but as I started digging into the code I discovered that the structure made fixing these things hard. As I learned later, the author wrote it as part of an exercise in teaching himself Python programming, and it shows: the code is in many places more verbose than it needs to be, and the abstraction that would allow me to restructure the UI while keeping the core functions in place was absent.
So I did what one should never do: I started a complete rewrite, now called “Rurple NG” for “Next Generation”. And I’m pleased to report that so far I’m getting very nice results. There’s still a certain amount of work left to do, as the TODO file in the sources shows. But it works, and it’s good enough to use for teaching purposes now. I encourage you all to download the sources and help me hack on it or help write documentation, or simply try it and give me whatever advice you can on how this can be made better.
I think most people imagine that programming is far beyond their abilities, but I believe learning to program is one of those things that nearly everyone can and should do; it’s directly beneficial even if you only ever write tiny programs to do what your existing tools can’t quite do, but it also has tremendous indirect benefits in understanding not only computers but mathematics and logic, as well as the special discipline of debugging. I hope that I can create an attractive and friendly tool that could bring programming to a new audience, and help open up our mysterious world just a little bit.
Many users of Rabbit have been asking us about how Rabbit copes with many large messages in queues, to the extent that the total size of these messages exhausts the available physical memory (RAM). As things stand at the moment, the answer is not very well. Although we have a persistence mechanism, that is not quite an answer either because whilst it does ensure that messages are written to disk, it does not remove messages from RAM. So, we’ve been looking at writing a disk-based queue so that should RAM become tight, we can start to push messages out to disk and collect them later from there.
However, there is this thing called swap, and it seems wise to test how Rabbit copes when we just allow it to expand into swap.
Erlang/OTP’s global module helps with atomic assignment of names for processes in a distributed Erlang cluster. It makes sure that only a single process at a time holds any given name, across all connected nodes. Unlike the local name registration function, names aren’t limited to being atoms: with global, they can be any term at all.
To see global’s conflict-resolution in action, we need to register a name on two nodes not initially connected, and then make them aware of each other. The system will pick one registration to survive, and will terminate the other registration.
First, register the name “a” on each of two nodes (started with erl -sname one and erl -sname two, respectively). On node one:
Eshell V5.6.2 (abort with ^G)
(one@walk)1> global:register_name(a, self()).
yes
(one@walk)2> global:whereis_name(a).
<0.37.0>
We see that the name was registered successfully (the call to register_name returned yes), and that when looked up, a pid (the pid of the shell process) is returned, as we would expect. Now, the same on node two:
Eshell V5.6.2 (abort with ^G)
(two@walk)1> global:register_name(a, self()).
yes
(two@walk)2> global:whereis_name(a).
<0.37.0>
Again, we see it succeeding. Note that each node has successfully registered the “global” name “a”. This is because they are unaware of each other. Once they’re connected, Erlang/OTP will automatically resolve the situation. By default, it does this by terminating one of the two contending processes.
Let’s see what happens. Connect the two nodes together, by pinging one from the other — here, pinging node two from node one:
(one@walk)3> net_adm:ping(two@walk).
pong
(one@walk)4>
=INFO REPORT==== 13-Feb-2009::03:05:22 ===
global: Name conflict terminating {a,<5744.37.0>}
(one@walk)4> global:whereis_name(a).
<0.37.0>
(one@walk)5>
See that the termination of one of the contenders is reported with a message in the system log. It was the registration on node two that was terminated, and the registration on node one that survived. Here’s what we see on node two:
** exception error: killed
(two@walk)3> global:whereis_name(a).
<5768.37.0>
(two@walk)4> node(global:whereis_name(a)).
one@walk
Node two’s registered process has been killed. When we then ask about the registration for the name “a”, we see a pid from node one.
Finally, we’ll try registering the name for a second time:
(two@walk)5> global:register_name(a, self()).
no
(two@walk)6>
It answers no because there’s already a registration that it knows about in the system. The same no answer would have been returned if we’d tried the same thing on node one instead.
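The behaviour seen in this session can be mimicked with a toy model in Python. This is only a sketch of the observable behaviour for two nodes, not how `global` is implemented: the `Node` class, the `connect` function and the pid strings are invented for illustration, and choosing the survivor at random is an assumption of the sketch.

```python
import random

# Toy two-node model of name registration. All names are hypothetical.

class Node:
    def __init__(self, name):
        self.name = name
        self.names = {}      # registered name -> pid, as this node sees it
        self.killed = []     # local pids terminated by conflict resolution
        self.peers = [self]  # nodes this node is connected to (incl. itself)

    def register_name(self, name, pid):
        # Refuse if any connected node already knows a registration.
        if any(name in peer.names for peer in self.peers):
            return "no"
        for peer in self.peers:
            peer.names[name] = pid
        return "yes"

    def whereis_name(self, name):
        return self.names.get(name)


def connect(a, b, rng=random):
    """Connect two nodes and resolve any conflicting registrations."""
    for name in set(a.names) & set(b.names):
        if a.names[name] == b.names[name]:
            continue
        # Pick one registration to survive; terminate the other process.
        winner, loser = rng.sample([a, b], 2)
        loser.killed.append(loser.names[name])
        loser.names[name] = winner.names[name]
    # Share the registrations that did not conflict.
    merged = {**a.names, **b.names}
    a.names, b.names = dict(merged), dict(merged)
    a.peers = b.peers = [a, b]


one, two = Node("one"), Node("two")
print(one.register_name("a", "pid@one"))  # yes
print(two.register_name("a", "pid@two"))  # yes
connect(one, two)
print(one.whereis_name("a") == two.whereis_name("a"))  # True
print(two.register_name("a", "other"))  # no
```

Which of the two pids survives varies from run to run in this model, but after `connect` both nodes always agree on a single owner and further registrations of the same name are refused.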
Recently, as part of RabbitMQ server development, we ran into an interesting issue regarding Erlang’s per-process garbage collection. If a process is idle — not doing any work at all, simply waiting for an external event — then its garbage-collector will not run until it starts working again. The solution is to hibernate idle processes, which causes a very aggressive garbage-collection run and puts the process into a suspended state, from which it will wake when a message next arrives.
We were implementing a form of flow control in RabbitMQ server, essentially equivalent to XON/XOFF, and when a channel was paused, we found that its memory usage wasn’t decreasing in proportion to its decreased activity. As soon as we started hibernating processes, the memory usage dropped right back.