Many users of Rabbit have been asking us how Rabbit copes when queues hold so many large messages that their total size exhausts the available physical memory (RAM). As things stand at the moment, the answer is “not very well”. We do have a persistence mechanism, but that is not quite an answer either: whilst it does ensure that messages are written to disk, it does not remove them from RAM. So, we’ve been looking at writing a disk-based queue so that, should RAM become tight, we can start to push messages out to disk and collect them later from there.
However, there is this thing called swap, and it seems wise to test how Rabbit copes when we just allow it to expand into swap.
Erlang/OTP’s global module helps with atomic assignment of names for processes in a distributed Erlang cluster. It makes sure that only a single process at a time holds any given name, across all connected nodes. Unlike the local name registration function, names aren’t limited to being atoms: with global, they can be any term at all.
To see global’s conflict-resolution in action, we need to register a name on two nodes not initially connected, and then make them aware of each other. The system will pick one registration to survive, and will terminate the other registration.
First, register the name “a” on each of two nodes (started with erl -sname one and erl -sname two, respectively). On node one:
Eshell V5.6.2 (abort with ^G)
(one@walk)1> global:register_name(a, self()).
yes
(one@walk)2> global:whereis_name(a).
<0.37.0>
We see that the name was registered successfully (the call to register_name returned yes), and that when looked up, a pid (the pid of the shell process) is returned, as we would expect. Now, the same on node two:
Eshell V5.6.2 (abort with ^G)
(two@walk)1> global:register_name(a, self()).
yes
(two@walk)2> global:whereis_name(a).
<0.37.0>
Again, we see it succeeding. Note that each node has successfully registered the “global” name “a”. This is because they are unaware of each other. Once they’re connected, Erlang/OTP will automatically resolve the situation. By default, it does this by terminating one of the two contending processes.
Let’s see what happens. Connect the two nodes together, by pinging one from the other — here, pinging node two from node one:
(one@walk)3> net_adm:ping(two@walk).
pong
(one@walk)4>
=INFO REPORT==== 13-Feb-2009::03:05:22 ===
global: Name conflict terminating {a,<5744.37.0>}
(one@walk)4> global:whereis_name(a).
<0.37.0>
(one@walk)5>
See that the termination of one of the contenders is reported with a message in the system log. It was the registration on node two that was terminated, and the registration on node one that survived. Here’s what we see on node two:
** exception error: killed
(two@walk)3> global:whereis_name(a).
<5768.37.0>
(two@walk)4> node(global:whereis_name(a)).
one@walk
Node two’s registered process has been killed. When we then ask about the registration for the name “a”, we see a pid from node one.
Finally, we’ll try registering the name for a second time:
(two@walk)5> global:register_name(a, self()).
no
(two@walk)6>
It answers no because there’s already a registration that it knows about in the system. The same no answer would have been returned if we’d tried the same thing on node one instead.
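The kill-one-contender behaviour shown above is only the default. As a rough sketch, global:register_name/3 accepts a resolve function, and the global module ships one, notify_all_name/3, that unregisters both pids and sends each of them a {global_name_conflict, Name, OtherPid} message instead of killing either. The module name global_notify_demo and the function register_with_notify/1 are made up for illustration.

```erlang
-module(global_notify_demo).
-export([register_with_notify/1]).

%% Hypothetical helper: register the calling process under Name, asking
%% global to notify both contenders on a name clash rather than kill one.
%% Returns yes or no, exactly as global:register_name/2 does.
register_with_notify(Name) ->
    global:register_name(Name, self(), fun global:notify_all_name/3).
```

With this resolver, neither process holds the name after a conflict, so each contender has to decide for itself what to do when the notification arrives.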
Recently, as part of RabbitMQ server development, we ran into an interesting issue regarding Erlang’s per-process garbage collection. If a process is idle — not doing any work at all, simply waiting for an external event — then its garbage-collector will not run until it starts working again. The solution is to hibernate idle processes, which causes a very aggressive garbage-collection run and puts the process into a suspended state, from which it will wake when a message next arrives.
We were implementing a form of flow control in RabbitMQ server, essentially equivalent to XON/XOFF, and when a channel was paused, we found that its memory usage wasn’t decreasing in proportion to its decreased activity. As soon as we started hibernating processes, the memory usage dropped right back.
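A minimal sketch of the idea, assuming a hand-rolled receive loop rather than RabbitMQ’s actual channel code: when no message arrives for a while, the process calls erlang:hibernate/3, which discards its call stack, garbage-collects it, and re-enters the named function when the next message turns up. The module name idle_worker, the message shapes and the 1000 ms idle timeout are all illustrative assumptions.

```erlang
-module(idle_worker).
-export([start/0, loop/1]).

%% Spawn a worker whose only state is a count of handled requests.
start() ->
    spawn(?MODULE, loop, [0]).

%% loop/1 must be exported, because erlang:hibernate/3 re-enters the
%% process through a fully qualified Module:Function(Args) call.
loop(Count) ->
    receive
        {work, From} ->
            From ! {done, Count + 1},
            loop(Count + 1);
        stop ->
            ok
    after 1000 ->
        %% Idle for a second: shrink the heap and wait for the next message.
        erlang:hibernate(?MODULE, loop, [Count])
    end.
```

The state has to be passed explicitly in the argument list, since nothing on the stack survives hibernation.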
I’m pleased to announce that our XMPP gateway for exposing a RabbitMQ instance to the global XMPP network has been released (documentation, browse or check out code, download snapshot).
Update: Because it depends on a newer release of RabbitMQ than 1.3.0, you will also need to check out the server and codegen code from our public mercurial repositories, or download them as snapshots: server, codegen.

The mod_rabbitmq module implements an ejabberd extension module which gateways AMQP (as implemented by RabbitMQ) to XMPP.
By bridging between the two systems, we benefit from:
The current implementation is a very simple mapping between the two systems. Its simplicity keeps the code short, but only exposes a subset of AMQP features to the XMPP network, and vice versa.
Upon browsing the source to the excellent MochiWeb, I came across a call to a function that, when I looked, wasn’t defined anywhere. This, it turns out, was a clue: Erlang has undocumented syntactic support for late-bound method dispatch, i.e. lightweight object-oriented programming!
The following example, myclass.erl, is a parameterized module, a feature that arrived undocumented in a recent Erlang release. Parameterized modules are explored on the ‘net here and here. (The latter link is to a presentation that also covers an even more experimental module-based inheritance mechanism.)
-module(myclass, [Instvar1, Instvar2]).
-export([getInstvar1/0, getInstvar2/0]).
getInstvar1() -> Instvar1.
getInstvar2() -> Instvar2.
“Instances” of the “class” called myclass can be created with myclass:new(A, B) (which is automatically provided by the compiler, and does not appear in the source code), where A and B become values for the variables Instvar1 and Instvar2, which are implicitly scoped across the entirety of the myclass module body, available to all functions defined within it.
The result of a call to new is a simple tuple, much like a record, with the module name in the first position, and the instance variable values in order following it.
Eshell V5.6 (abort with ^G)
1> Handle = myclass:new(123, 234).
{myclass,123,234}
2> Handle:getInstvar1().
123
3> Handle:getInstvar2().
234
While this looks really similar to OO dispatch in other languages, it’s actually an extension to Erlang’s regular function call syntax, and works with other variations on that syntax, too:
4> {myclass,123,234}:getInstvar1().
123
The objects that this system provides are pure-functional objects, which is unusual: many object-oriented languages don’t clearly separate the two orthogonal features of late-binding and mutable state. A well-designed language should let you use one without the other, just as Erlang does here: in Erlang, using parameterized modules for method dispatch doesn’t change the way the usual mechanisms for managing mutable state are used. “Instance variables” of parameterized modules are always immutable, and regular state-threading has to be used to get the effects of mutable state.
I’d like to see this feature promoted to first-class, documented, supported status, and I’d also very much like to see it used to structure the standard library. Unfortunately, it’s not yet very well integrated with existing modules like gb_sets, ordsets and sets. For example, here’s what happens when you try it with a simple lists call:
5> lists:append([1, 2], [3, 4]).
[1,2,3,4]
6> {lists, [1, 2]}:append([3, 4]).
[3,4|{lists,[1,2]}]
Not exactly what we were after. (Although it does give brittle insight into the current internals of the rewrites the system performs: a {foo, ...}:bar(zot) call is translated into foo:bar(zot, {foo, ...}) - that is, the this parameter is placed last in the argument list.)
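To make that rewrite concrete, here is a hand-written sketch of roughly what a parameterized module amounts to once the translation has been applied: an ordinary module whose functions take the instance tuple as an extra, final argument. The module name myclass_plain is hypothetical, and this imitates the observed behaviour rather than reproducing what the compiler actually emits.

```erlang
-module(myclass_plain).
-export([new/2, getInstvar1/1, getInstvar2/1]).

%% Build the "instance": the module name followed by the instance variables.
new(Instvar1, Instvar2) ->
    {?MODULE, Instvar1, Instvar2}.

%% Each accessor receives the instance tuple as its last (here, only) argument.
getInstvar1({?MODULE, Instvar1, _Instvar2}) -> Instvar1.
getInstvar2({?MODULE, _Instvar1, Instvar2}) -> Instvar2.
```

Seen this way, the odd lists result is no surprise: the call becomes lists:append([3, 4], {lists, [1, 2]}), which builds an improper list with the tuple as its tail.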
We’ve been investigating the possibility of an XPath-based routing extension to RabbitMQ, where XPath would be used as binding patterns, and the message structure would be exposed as XML infoset. As part of this work, we’ve been looking at Erlang’s XPath implementation that comes as part of the built-in xmerl library.
Here are a couple of examples of Erlang’s XPath in action. First, let’s parse a document to be queried:
{ParsedDocumentRootElement, _RemainingText = ""} =
xmerl_scan:string("<foo>" ++
"<myelement myattribute=\"red\">x</myelement>" ++
"<myelement myattribute=\"blue\">x</myelement>" ++
"<myelement myattribute=\"blue\">y</myelement>" ++
"</foo>").
(We could have used xmerl_scan:file to read from an external file, instead of xmerl_scan:string, if we’d wanted to.)
Next, let’s retrieve the contents of every myelement node that contains text exactly matching “x”:
69> xmerl_xpath:string("//myelement[. = 'x']/text()",
ParsedDocumentRootElement).
[#xmlText{parents = [{myelement,1},{foo,1}],
pos = 1,
language = [],
value = "x",
type = text},
#xmlText{parents = [{myelement,2},{foo,1}],
pos = 1,
language = [],
value = "x",
type = text}]
Notice that it’s returned two XML text nodes, and that the “parents” elements differ, corresponding to the different paths through the source document to the matching nodes.
Next, let’s search for all myelements whose myattribute is exactly “red”:
72> xmerl_xpath:string("//myelement[@myattribute='red']",
ParsedDocumentRootElement).
[#xmlElement{
name = myelement,
expanded_name = myelement,
nsinfo = [],
namespace = #xmlNamespace{default = [],nodes = []},
parents = [{foo,1}],
pos = 1,
attributes =
[#xmlAttribute{
name = myattribute,
expanded_name = [],
nsinfo = [],
namespace = [],
parents = [],
pos = 1,
language = [],
value = "red",
normalized = false}],
content =
[#xmlText{
parents = [{myelement,1},{foo,1}],
pos = 1,
language = [],
value = "x",
type = text}],
language = [],
xmlbase = "/localhome/tonyg",
elementdef = undeclared}]
This time, there’s only the one match. Finally, a query that no nodes satisfy:
75> xmerl_xpath:string("//myelement[@myattribute='red' and . = 'y']",
ParsedDocumentRootElement).
[]
If we had replaced the 'y' with 'x', we’d have retrieved a non-empty nodeset.
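The record syntax in those results comes from xmerl’s header file, which a module has to include before it can pattern-match on the nodes. As a small sketch, assuming all we want is the text of each matching text node, the parse and query steps above could be wrapped like this. The module name xpath_values and the function text_values/2 are invented for illustration.

```erlang
-module(xpath_values).
-include_lib("xmerl/include/xmerl.hrl").
-export([text_values/2]).

%% Parse XmlString, run XPath against the root element, and return the
%% value of every #xmlText{} node in the result. Other node types
%% (elements, attributes) are silently skipped by the generator pattern.
text_values(XPath, XmlString) ->
    {Doc, _Rest} = xmerl_scan:string(XmlString),
    [Value || #xmlText{value = Value} <- xmerl_xpath:string(XPath, Doc)].
```

For a routing extension, a non-empty result would presumably count as a match, but that is a design decision this sketch does not make.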
A couple of months ago, I improved our Erlang SMTP server code.
Mon Oct 15: Support callbacks and more of the spec.
Support multiple forward paths. Support callbacks for verification and delivery. Pass domain as well as mailbox for reverse and forward paths. Cope with improper line termination. Log failures in delivery/verification callbacks.
Wed Oct 17: Split out smtp_util:strip_crlf; be RFC2821-strict about CRLF
The code is available by browsing or through mercurial:
hg clone http://hg.opensource.lshift.net/erlang-smtp/
Joe Armstrong, the inventor of Erlang, paid LShift a visit on Friday. He had kindly agreed to give a short talk to a few of my colleagues. We ended up cramming about twenty people into our meeting room, listening to Joe explain the implications of multicore CPU architectures for programming language design. There were lots of questions from the audience and some interesting discussions, keeping us all occupied for nearly two hours. Matthew Sackman has posted his thoughts on some of the key points.
We covered a range of other topics too, including Joe’s recent idea of an Erlang/OTP Service Pack. This will be on a shorter release cycle than the main Erlang/OTP distribution, allowing changes and new features to be brought to a wide audience more quickly, with the best bits, hopefully, eventually making it into OTP.
Sam Ruby examines support for astral-plane characters in various JSON implementations. His post prompted me to check my Erlang implementation of rfc4627. I found that for astral plane characters in utf-8, utf-16, or utf-32, everything worked properly, but the RFC4627-mandated surrogate-pair “\uXXXX” encodings broke. A few minutes hacking later, and:
Eshell V5.5.5 (abort with ^G)
1> {ok, Utf8Encoded, []} =
rfc4627:decode("\"\\u007a\\u6c34\\ud834\\udd1e\"").
{ok,<<122,230,176,180,240,157,132,158>>,[]}
2> xmerl_ucs:from_utf8(Utf8Encoded).
[122,27700,119070]
3> rfc4627:encode(Utf8Encoded).
[34,122,230,176,180,240,157,132,158,34]
4>
Much better.
You can get the updated code using mercurial:
hg clone http://hg.opensource.lshift.net/erlang-rfc4627/
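For reference, the arithmetic that turns a “\uXXXX” surrogate pair into a single code point is short. The following is a standalone sketch of the standard UTF-16 calculation, not an excerpt from the rfc4627 module; the module name surrogate_demo and the function combine/2 are made up here.

```erlang
-module(surrogate_demo).
-export([combine/2]).

%% Combine a high surrogate (16#D800..16#DBFF) and a low surrogate
%% (16#DC00..16#DFFF) into the astral-plane code point they encode.
%% Any other input fails with function_clause.
combine(High, Low) when High >= 16#D800, High =< 16#DBFF,
                        Low >= 16#DC00, Low =< 16#DFFF ->
    16#10000 + ((High - 16#D800) bsl 10) + (Low - 16#DC00).
```

Applied to the pair in the transcript above, combine(16#D834, 16#DD1E) gives 119070, the last element of the list returned by xmerl_ucs:from_utf8.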
In a previous post I explored some of the options for supporting RFC4627 (JSON) Unicode-in-strings well when mapping to Erlang terms. In the end, I settled on keeping the interface almost unchanged: the only change is that binaries returned from rfc4627:decode are to be interpreted as UTF-8 encoded text now, whereas before their interpretation was less well defined.
The new module is Erlang-RFC4627 version 1.1.0, and is available as a tarball, a Debian package, or by browsing online here. You can also get the code using mercurial:
hg clone http://hg.opensource.lshift.net/erlang-rfc4627/
Here are some examples using the new module. First, let’s explore the autodetection of which encoding is being used. In the following example, we see UTF-16, both big- and little-endian, as well as ill-formed and well-formed examples of UTF-8 being passed through the autodetector. (It also supports UTF-32 big- and little-endian.)
Eshell V5.5.5 (abort with ^G)
1> rfc4627:unicode_decode([34,0,228,0,34,0]).
{'utf-16le',"\"ä\""}
2> rfc4627:unicode_decode([0,34,0,228,0,34]).
{'utf-16be',"\"ä\""}
3> rfc4627:unicode_decode([34,228,34]).
** exited: {ucs,{bad_utf8_character_code}} **
4> rfc4627:unicode_decode([34,195,164,34]).
{'utf-8',"\"ä\""}
5>
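RFC 4627 (section 3) suggests how such autodetection can work: because a JSON text starts with two ASCII characters, the pattern of zero bytes in the first four octets gives the encoding away. The sketch below follows that description only. It is an assumption that rfc4627:unicode_decode works along these lines, and the module name json_encoding_guess is hypothetical.

```erlang
-module(json_encoding_guess).
-export([guess/1]).

%% Guess the encoding of a JSON text, given as a list of bytes, from
%% where the zero bytes fall among the first four octets. This only
%% guesses; it does not check that the input is well-formed.
guess([0, 0, 0, _ | _]) -> 'utf-32be';
guess([0, _, 0, _ | _]) -> 'utf-16be';
guess([_, 0, 0, 0 | _]) -> 'utf-32le';
guess([_, 0, _, 0 | _]) -> 'utf-16le';
guess(_)                -> 'utf-8'.
```

Inputs shorter than four bytes fall through to 'utf-8', which is why an ill-formed sequence such as [34,228,34] still needs a separate validation pass.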
Now let’s look at decoding some UTF-8 encoded JSON text into Erlang terms, and vice versa.
5> rfc4627:decode([34,194,128,34]).
{ok,<<194,128>>,[]}
6> rfc4627:encode(<<194,128>>).
[34,194,128,34]
7> rfc4627:encode_noauto(<<194,128>>).
[34,128,34]
8> rfc4627:unicode_encode({'utf-32le',
rfc4627:encode_noauto(<<194,128>>)}).
[34,0,0,0,128,0,0,0,34,0,0,0]
9> rfc4627:encode_noauto({obj, [{[27700], 123}]}).
[123,34,27700,34,58,49,50,51,125]
10> rfc4627:encode({obj, [{[27700], 123}]}).
"{\"æ°´\":123}"
11>
Notice, on that final example, that Erlang is printing the final UTF-8 encoded JSON text as if it were Latin-1. This is nothing to worry about: the numbers in the returned list/string are the correct UTF-8 encoding for Unicode code point 27700.