Posts filed under 'Erlang'

Some simple examples of using Erlang’s XPath implementation

We’ve been investigating the possibility of an XPath-based routing extension to RabbitMQ, where XPath expressions would be used as binding patterns, and the message structure would be exposed as an XML infoset. As part of this work, we’ve been looking at the XPath implementation that ships with Erlang’s built-in xmerl library.

This post walks through a couple of simple examples of using Erlang’s XPath implementation to retrieve nodesets matching various criteria.
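
As a taste of what this looks like, here is a minimal sketch using xmerl_scan:string/1 and xmerl_xpath:string/2. The module name, the sample document and its element and attribute names are all made up for illustration; they are not RabbitMQ's message format.

```erlang
-module(xpath_demo).
-export([bodies_for_key/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% A made-up message document; the element and attribute names are
%% illustrative assumptions only.
sample() ->
    "<msgs>"
    "<msg key=\"a.b\"><body>one</body></msg>"
    "<msg key=\"c.d\"><body>two</body></msg>"
    "</msgs>".

%% Return the text of every <body> under a <msg> whose key attribute
%% equals Key.
bodies_for_key(Key) ->
    {Doc, _Rest} = xmerl_scan:string(sample()),
    Path = "/msgs/msg[@key='" ++ Key ++ "']/body/text()",
    Nodes = xmerl_xpath:string(Path, Doc),
    %% text() yields #xmlText{} records; keep just their values.
    [Value || #xmlText{value = Value} <- Nodes].
```

Calling xpath_demo:bodies_for_key("a.b") selects the first message's body text, and a key that matches nothing yields an empty nodeset.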

Continue Reading 4 comments January 31st, 2008 tonyg

Erlang SMTP code updated

A couple of months ago, I improved our Erlang SMTP server code.

  • Mon Oct 15: Support callbacks and more of the spec.

    Support multiple forward paths. Support callbacks for verification and delivery. Pass domain as well as mailbox for reverse and forward paths. Cope with improper line termination. Log failures in delivery/verification callbacks.

  • Wed Oct 17: Split out smtp_util:strip_crlf; be RFC2821-strict about CRLF

The code is available by browsing or through darcs:

darcs get http://www.lshift.net/~tonyg/erlang-smtp/

1 comment December 28th, 2007 tonyg

Joe Armstrong on multicore

Joe Armstrong, the inventor of Erlang, paid LShift a visit on Friday. He had kindly agreed to give a short talk to a few of my colleagues. We ended up cramming about twenty people into our meeting room, listening to Joe explain the implications of multicore CPU architectures for programming language design. There were lots of questions from the audience and some interesting discussions, keeping us all occupied for nearly two hours. Matthew Sackman has posted his thoughts on some of the key points.

We covered a range of other topics too, including Joe’s recent idea of an Erlang/OTP Service Pack. This will be on a shorter release cycle than the main Erlang/OTP distribution, allowing changes and new features to be brought to a wide audience more quickly, with the best bits, hopefully, eventually making it into OTP.

Add comment November 17th, 2007 matthias

Astral Plane characters in Erlang JSON/RFC4627 implementation

Sam Ruby examines support for astral-plane characters in various JSON implementations. His post prompted me to check my Erlang implementation of rfc4627. I found that for astral plane characters in utf-8, utf-16, or utf-32, everything worked properly, but the RFC4627-mandated surrogate-pair “\uXXXX” encodings broke. A few minutes hacking later, and:

Eshell V5.5.5  (abort with ^G)
1> {ok, Utf8Encoded, []} =
        rfc4627:decode("\"\\u007a\\u6c34\\ud834\\udd1e\"").
{ok,<<122,230,176,180,240,157,132,158>>,[]}
2> xmerl_ucs:from_utf8(Utf8Encoded).
[122,27700,119070]
3> rfc4627:encode(Utf8Encoded).
[34,122,230,176,180,240,157,132,158,34]
4> 

Much better.
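
For reference, the arithmetic behind a surrogate pair is small enough to show in full. This is a sketch of the standard UTF-16 calculation, not the code in the rfc4627 module; the module and function names are hypothetical.

```erlang
-module(surrogate_demo).
-export([combine/2]).

%% Combine a UTF-16 surrogate pair into a single Unicode code point.
%% The guards reject anything that is not a high surrogate followed by
%% a low surrogate.
combine(High, Low) when High >= 16#D800, High =< 16#DBFF,
                        Low >= 16#DC00, Low =< 16#DFFF ->
    16#10000 + ((High - 16#D800) bsl 10) + (Low - 16#DC00).
```

With the pair from the session above, surrogate_demo:combine(16#D834, 16#DD1E) gives 119070, the third code point in the decoded list.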

You can get the updated code using darcs:

darcs get http://www.lshift.net/~tonyg/erlang-rfc4627/

Add comment November 16th, 2007 tonyg

Proper Unicode support in Erlang RFC4627 (JSON) module

In a previous post I explored some of the options for supporting RFC4627 (JSON) Unicode-in-strings well when mapping to Erlang terms. In the end, I settled on keeping the interface almost unchanged: the only change is that binaries returned from rfc4627:decode are to be interpreted as UTF-8 encoded text now, whereas before their interpretation was less well defined.

The new module is Erlang-RFC4627 version 1.1.0, and is available as a tarball, a debian package, or by browsing online here. You can also get the code using darcs:

darcs get http://www.lshift.net/~tonyg/erlang-rfc4627/

Here are some examples using the new module. First, let’s explore the autodetection of which encoding is being used. In the following example, we see UTF-16, both big- and little-endian, as well as ill-formed and well-formed examples of UTF-8 being passed through the autodetector. (It also supports UTF-32 big- and little-endian.)

Eshell V5.5.5  (abort with ^G)
1> rfc4627:unicode_decode([34,0,228,0,34,0]).
{'utf-16le',"\"ä\""}
2> rfc4627:unicode_decode([0,34,0,228,0,34]).
{'utf-16be',"\"ä\""}
3> rfc4627:unicode_decode([34,228,34]).
** exited: {ucs,{bad_utf8_character_code}} **
4> rfc4627:unicode_decode([34,195,164,34]).
{'utf-8',"\"ä\""}
5> 
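
The idea behind the autodetection comes from section 3 of RFC 4627: a JSON text always starts with two ASCII characters, so the pattern of zero octets at the start gives away the encoding. The following is a rough sketch of that table only, under a hypothetical module name; it is not the module's actual implementation, and unlike rfc4627:unicode_decode it neither decodes nor validates anything.

```erlang
-module(detect_demo).
-export([detect/1]).

%% Guess the encoding of a JSON text from the zero octets among its
%% first four octets, following the table in RFC 4627 section 3.
%% The UTF-32 clauses must come first, because their patterns would
%% otherwise also match the UTF-16 clauses.
detect([0, 0, 0, _ | _]) -> 'utf-32be';
detect([_, 0, 0, 0 | _]) -> 'utf-32le';
detect([0, _, 0, _ | _]) -> 'utf-16be';
detect([_, 0, _, 0 | _]) -> 'utf-16le';
detect(_)                -> 'utf-8'.
```

Fed the inputs from the session above, this sketch picks the same encoding names, although it would also happily label the ill-formed third example as UTF-8 because it never looks past the first four octets.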

Now let’s look at decoding some UTF-8 encoded JSON text into Erlang terms, and vice versa.

5> rfc4627:decode([34,194,128,34]).
{ok,<<194,128>>,[]}
6> rfc4627:encode(<<194,128>>).
[34,194,128,34]
7> rfc4627:encode_noauto(<<194,128>>).
[34,128,34]
8> rfc4627:unicode_encode({'utf-32le',
        rfc4627:encode_noauto(<<194,128>>)}).
[34,0,0,0,128,0,0,0,34,0,0,0]
9> rfc4627:encode_noauto({obj, [{[27700], 123}]}).
[123,34,27700,34,58,49,50,51,125]
10> rfc4627:encode({obj, [{[27700], 123}]}).
"{\"æ°´\":123}"
11> 

Notice, on that final example, that Erlang is printing the final UTF-8 encoded JSON text as if it were Latin-1. This is nothing to worry about: the numbers in the returned list/string are the correct UTF-8 encoding for Unicode code point 27700.
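
One way to convince yourself is to decode those three bytes back into code points with xmerl_ucs:from_utf8, the same function used in the astral-plane session. A small sketch, with a hypothetical wrapper module:

```erlang
-module(utf8_view_demo).
-export([codepoints/1]).

%% Interpret a list (or binary) of bytes as UTF-8 and return the
%% Unicode code points it encodes.
codepoints(Utf8Bytes) ->
    xmerl_ucs:from_utf8(Utf8Bytes).
```

The bytes 230, 176 and 180 are what the shell shows as the Latin-1 characters between the quotes, and utf8_view_demo:codepoints([230,176,180]) turns them back into [27700].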

2 comments October 3rd, 2007 tonyg

Too much mail is bad for you

We received a few reports from users of our Erlang-based RabbitMQ message broker who saw sharp decreases in throughput when putting the broker under heavy load. We subsequently reproduced these results in our lab. This is not what we expected to see - while some performance degradation is inevitable when running a system at its limits, we had carefully designed RabbitMQ to make such degradation small and gradual. So clearly the system was behaving in ways we had not anticipated.

We eventually tracked down the problem. The lesson is: if you make synchronous calls inside an Erlang process you’d better make sure its message queue is short.
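
To make the lesson concrete, here is a sketch of a hand-rolled synchronous call. It is an illustration under assumptions, not RabbitMQ code: the module, the message shapes and the echo server are all hypothetical. The assumed mechanism is that the selective receive waiting for the reply has to step over whatever is already queued in the caller's mailbox, so each call gets slower as that mailbox grows. Later Erlang/OTP releases optimise some forms of this pattern, so the effect depends on the runtime in use.

```erlang
-module(mailbox_demo).
-export([queue_len/0, call/2, echo_server/0]).

%% Number of messages currently waiting in the calling process's mailbox.
queue_len() ->
    {message_queue_len, N} = erlang:process_info(self(), message_queue_len),
    N.

%% A hand-rolled synchronous call. The receive is selective: it only
%% accepts the reply tagged with Ref and leaves every other queued
%% message where it is.
call(Server, Request) ->
    Ref = make_ref(),
    Server ! {call, self(), Ref, Request},
    receive
        {reply, Ref, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.

%% A trivial server that replies with whatever it was sent.
echo_server() ->
    receive
        {call, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_server()
    end.
```

Checking mailbox_demo:queue_len() before making such calls is a cheap way to spot a process that is falling behind.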

Continue Reading 8 comments October 1st, 2007 matthias

Minimal Erlang SMTP, POP3 server code

Some seven months ago, I built simple Erlang modules for generic SMTP and POP3 services. The idea is that the programmer should instantiate a service, providing callbacks for user authentication and for service-specific operations like handling deliveries, and scanning and locking mailboxes. Originally, I was planning on providing SMTP-to-AMQP and AMQP-to-POP3 gateways as part of RabbitMQ, but I haven’t had the time to seriously pursue this yet.

A snapshot of the code is available as a zip file, or you can browse the code online or retrieve it using darcs:

darcs get http://www.lshift.net/~tonyg/erlang-smtp/

The current status of the code is:

  • SMTP deliveries from Thunderbird work
  • POP3 retrieval from Thunderbird works, but isn’t very solid, because I haven’t implemented the stateful part of mailboxes yet.
  • The SMTP implementation is somewhat loosely based on RFC 2821. It’s what you might generously call minimally conformant (improving this situation is tedious but not difficult). It doesn’t address RFC 2822 in any serious way (yet)
  • The POP3 implementation is based on RFC 1939.
  • SMTP AUTH is not yet implemented (but is not difficult)
  • I can’t recall the details (seven months!), but I think I might have skimped on something relating to POP3 UIDL.
  • Neither module has pluggable callbacks: the SMTP delivery-handler is currently io:format, and the POP3 mailbox and user authentication database are similarly hard-coded.

Patches, bugfixes, contributions, comments and feedback are all very welcome!

Update: a new post summarises changes since this post, including pluggable callbacks etc.

Add comment September 20th, 2007 tonyg

Erlang on Neo1973 cellphone

This evening, after fighting bitbake (in the form of the capricious “insane.bbclass” class definition) for a good few hours, I managed to get Erlang version R11B-5 running on my new cellphone.

Running the interactive Erlang shell on a cellphone is pretty cool. Erlang’s built-in clustering support works fine: I’ve successfully connected an Erlang node on my PC to a node on the phone using the USB ethernet support the phone provides.

The base package compiles down to a bit less than 7MB, which is a bit large. The full suite of libraries is another 22MB or so. It’s certainly possible to fine-tune the packaging process to get a smaller distribution, but for now I’m happy developing against what I have.

Update: I’ve posted my changed build scripts to OpenEmbedded’s bug tracker at bug 3014. Here’s a direct link to the tarball, if anyone would like to try it themselves.

4 comments September 16th, 2007 tonyg

How should JSON strings be represented in Erlang?

Erlang represents strings as lists of (ASCII, or possibly iso8859-1) codepoints. In this regard, it’s weakly typed - there’s no hard distinction between a string, “ABC”, and a list of small integers, [65,66,67]. For example:

Eshell V5.5.4  (abort with ^G)
1> "ABC".
"ABC"
2> [65,66,67].
"ABC"
3> 

Erlang also has a binary type, a simple vector of bytes. In the rfc4627/JSON codec I made for Erlang, I chose to use binaries to represent decoded strings, as suggested by Joe Armstrong.

All was well - until I came to implement UTF8 support after Sam Ruby got the ball rolling. Binaries will no longer work as the chosen mapping for JSON strings, since strings may contain arbitrary characters, including those with codepoints greater than 255.
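
The limitation is easy to demonstrate: list_to_binary/1 only accepts byte values, so a code point above 255 cannot be stored as a single element. A minimal sketch, with a hypothetical module name and return values:

```erlang
-module(binary_limit_demo).
-export([try_pack/1]).

%% Try to store a list of code points directly in a binary, one byte
%% per code point. list_to_binary/1 raises badarg as soon as an element
%% does not fit in a byte.
try_pack(Codepoints) ->
    try list_to_binary(Codepoints) of
        Bin -> {ok, Bin}
    catch
        error:badarg -> {error, not_bytes}
    end.
```

binary_limit_demo:try_pack("ABC") succeeds, while binary_limit_demo:try_pack([27700]) returns {error,not_bytes}.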

It has always been the case that the ideal representation for a JSON string is an Erlang string, a list of codepoints. Binaries are really a bit of a compromise. But choosing strings-for-strings puts us straight back in a weakly-typed position: it’s possible in JSON to distinguish between “ABC” and [65,66,67], but it’s not possible to make the same distinction in Erlang. We’d need to alter the way JSON arrays are represented to compensate.

Possible solutions:

  • Map strings to lists of codepoints. Map arrays to tuples rather than lists. Objects remain {obj,[…]}.
    • Pros: Terse syntax for strings and arrays, no worse than the Unicode-ignorant mapping
    • Cons: Awkward recursion over arrays, either using a counter and the element/2 BIF, or converting to a real list

  • Map strings to binaries containing UTF-8 encoded characters. Keep arrays as lists. Objects remain {obj,[…]}.

    • Pros: Keep terse syntax for strings, with the understanding that the binaries concerned must hold UTF8-encoded text. Keeps the interface largely unchanged.
    • Cons: Codec needs to perform possibly-redundant Unicode encoding/decoding steps to ensure that the binaries hold UTF8 even if, say, UTF32 were the format to be used on the wire

  • Map strings to lists of codepoints. Map arrays to {arr,[…]}, as other JSON codecs do. Objects remain {obj,[…]}.

    • Pros: Natural operations on strings, natural operations on arrays (once you strip the outer {arr,…}).
    • Cons: Converting terms to JSON-encodable form is a pain, since you need to wrap each array in your term with the explicit marker atom.
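
To compare the three options side by side, here is the JSON text ["ABC",[65,66,67]] written out under each mapping. This is only a sketch of the term shapes described in the list above; the module and function names are hypothetical and none of it is taken from the codec itself.

```erlang
-module(json_mapping_demo).
-export([option1/0, option2/0, option3/0, array_items/1]).

%% The JSON text ["ABC",[65,66,67]] under each candidate mapping.

%% Option 1: strings are code point lists, arrays are tuples.
option1() -> {"ABC", {65, 66, 67}}.

%% Option 2: strings are UTF-8 binaries, arrays are plain lists.
option2() -> [<<"ABC">>, [65, 66, 67]].

%% Option 3: strings are code point lists, arrays are tagged lists.
option3() -> {arr, ["ABC", {arr, [65, 66, 67]}]}.

%% Turn a term already known to be an array into a plain list of its
%% items, whichever mapping produced it. It must not be given objects
%% or strings: the shapes alone cannot tell those apart.
array_items({arr, Items}) when is_list(Items) -> Items;          % option 3
array_items(Tuple) when is_tuple(Tuple) -> tuple_to_list(Tuple); % option 1
array_items(List) when is_list(List) -> List.                    % option 2
```

Only the second option lets the two elements be told apart by type alone: in the other two, the inner string "ABC" and the list [65,66,67] are the same Erlang term, which is why the arrays need a tuple or a tag around them.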

All in all, I can’t decide which is the least distasteful option. I think I prefer the middle option, keeping strings mapped to binaries and viewing them as UTF-8 encoded text, but I really need to get some feedback on the issue.

8 comments September 13th, 2007 tonyg

Invitation to AMQP and RabbitMQ Birds of a Feather session

I am guest blogging here on behalf of CohesiveFT. We work with the excellent LShift team on our joint venture, RabbitMQ.

I’m here to invite you to a Birds of a Feather session this coming Thursday, August 30th, at 8pm, in central London. It is FREE and will last for 45 minutes, followed by the traditional breakout discussions over a beer. Please do take a look at RabbitMQ if you have not yet done so. It’s a commercial open source product, available under the MPL 1.1 and implementing the Advanced Message Queuing Protocol. AMQP is a new way to do business messaging (i.e. “what goes in, must come out”). What’s really cool is that, like HTTP, it is a protocol instead of a language-specific API. This should make interoperability between platforms much easier and less painful (business readers: “systems integration projects take less time and success can be predicted more accurately”). For more information, please see my list of links here.

What is the BOF about - and why come? It’s an informal session about RabbitMQ and AMQP, and how they apply within popular environments such as Spring, Mule, Ruby, AJAX, and other messaging protocols such as FIX.

“Informal” means we’ll be encouraging a conversation between people interested in any of these things. We want to hear from you, and from each other, rather than pushing slideware at people.

Come if you want to:

You can find out details of the BOF here. Ideally we ask you to register via the web site, but late arrivals are very welcome - if you turn up, we shall get you in. The BOF is offered as part of the popular EJUG series of tech talks and as a tie-in with the most excellent No Fluff Just Stuff conference.

If you cannot come but want to know more about any of these things then you can email us at info@rabbitmq.com.

Thank you very much - and we hope to see you on Thursday :-)

Posted by Chris on behalf of Alexis Richardson, CohesiveFT.

2 comments August 28th, 2007 chris
