Posts filed under 'Programming'

Erlang SMTP code updated

A couple of months ago, I improved our Erlang SMTP server code.

  • Mon Oct 15: Support callbacks and more of the spec.

    Support multiple forward paths. Support callbacks for verification and delivery. Pass domain as well as mailbox for reverse and forward paths. Cope with improper line termination. Log failures in delivery/verification callbacks.

  • Wed Oct 17: Split out smtp_util:strip_crlf; be RFC2821-strict about CRLF

The code is available by browsing or through darcs:

darcs get http://www.lshift.net/~tonyg/erlang-smtp/

1 comment December 28th, 2007 tonyg

Joe Armstrong on multicore

Joe Armstrong, the inventor of Erlang, paid LShift a visit on Friday. He had kindly agreed to give a short talk to a few of my colleagues. We ended up cramming about twenty people into our meeting room, listening to Joe explain the implications of multicore CPU architectures for programming language design. There were lots of questions from the audience and some interesting discussions, keeping us all occupied for nearly two hours. Matthew Sackman has posted his thoughts on some of the key points.

We covered a range of other topics too, including Joe’s recent idea of an Erlang/OTP Service Pack. This will be on a shorter release cycle than the main Erlang/OTP distribution, allowing changes and new features to be brought to a wide audience more quickly, with the best bits, hopefully, eventually making it into OTP.

Add comment November 17th, 2007 matthias

Astral Plane characters in Erlang JSON/RFC4627 implementation

Sam Ruby examines support for astral-plane characters in various JSON implementations. His post prompted me to check my Erlang implementation of rfc4627. I found that for astral plane characters in utf-8, utf-16, or utf-32, everything worked properly, but the RFC4627-mandated surrogate-pair “\uXXXX” encodings broke. A few minutes' hacking later, and:

Eshell V5.5.5  (abort with ^G)
1> {ok, Utf8Encoded, []} =
        rfc4627:decode("\"\\u007a\\u6c34\\ud834\\udd1e\"").
{ok,<<122,230,176,180,240,157,132,158>>,[]}
2> xmerl_ucs:from_utf8(Utf8Encoded).
[122,27700,119070]
3> rfc4627:encode(Utf8Encoded).
[34,122,230,176,180,240,157,132,158,34]
4> 

Much better.
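To make the surrogate-pair arithmetic concrete, here is a minimal Python sketch of how the pair `\ud834\udd1e` from the session above combines into a single code point and then into UTF-8 octets. The function name `combine_surrogates` is purely illustrative and is not part of the rfc4627 module; the formula itself is the standard UTF-16 one.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xD834, 0xDD1E)
utf8_octets = list(chr(code_point).encode("utf-8"))

print(code_point)   # 119070
print(utf8_octets)  # [240, 157, 132, 158]
```

Both values agree with the shell output above: 119070 is the last element of the decoded code point list, and the four octets are the tail of the returned binary.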

You can get the updated code using mercurial:

hg clone http://hg.opensource.lshift.net/erlang-rfc4627/

Add comment November 16th, 2007 tonyg

Your very own 32-way SIMD machine

What’s a good way of counting the number of bits set in a word? The obvious answer, adding the low bit to an accumulator, shifting right, and repeating, is O(n) in the number of bits in the word. This is a sequential approach - and we can do better, complexity-wise, by using a parallel algorithm. Let’s assume we are using 32-bit words, and that Xn is just such a 32-bit word:

X0 = input word
X1 = (X0 & 0x55555555) + ((X0 >>  1) & 0x55555555)
X2 = (X1 & 0x33333333) + ((X1 >>  2) & 0x33333333)
X3 = (X2 & 0x0F0F0F0F) + ((X2 >>  4) & 0x0F0F0F0F)
X4 = (X3 & 0x00FF00FF) + ((X3 >>  8) & 0x00FF00FF)
X5 = (X4 & 0x0000FFFF) + ((X4 >> 16) & 0x0000FFFF)
total number of set bits = X5

This algorithm is O(log2 n) in the number of bits in a word.
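As a runnable illustration, the five steps can be transcribed directly into Python. This is only a sketch: `popcount32` is an illustrative name, and the initial mask is there because Python integers are unbounded, so the input has to be truncated to a 32-bit word first.

```python
def popcount32(x: int) -> int:
    """Count the set bits of a 32-bit word by summing adjacent fields in parallel."""
    x &= 0xFFFFFFFF  # treat the input as a 32-bit word
    x = (x & 0x55555555) + ((x >> 1) & 0x55555555)   # sixteen 2-bit sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # eight 4-bit sums
    x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F)   # four 8-bit sums
    x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF)   # two 16-bit sums
    x = (x & 0x0000FFFF) + ((x >> 16) & 0x0000FFFF)  # one 32-bit sum
    return x


print(popcount32(0b1011))      # 3
print(popcount32(0xFFFFFFFF))  # 32
```

Each line doubles the width of the fields being summed, which is where the five steps for 32 bits come from.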

Every ordinary N-bit-word based sequential machine is a disguised N-way, 1-bit SIMD machine with a slightly odd instruction set. Lots more on data-parallel algorithms here.

What about finding which is the highest bit set in a word?

X0 = input word
X1 = X0 | (X0 >> 1)
X2 = X1 | (X1 >> 2)
X3 = X2 | (X2 >> 4)
X4 = X3 | (X3 >> 8)
X5 = X4 | (X4 >> 16)

… and feed X5 through the parallel counter-of-set-bits algorithm above. The resulting number is the position of the highest set bit in the original word, counting from one (that is, one more than its zero-based index), with zero meaning that no bits were set.
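Here is a matching Python sketch of the smear-then-count approach, with the set-bit counter repeated so that the block stands alone. Both function names are illustrative. For a nonzero word the result is the zero-based index of the top bit plus one, which is the same value Python's built-in `int.bit_length()` returns.

```python
def popcount32(x: int) -> int:
    """Parallel count of set bits in a 32-bit word."""
    x = (x & 0x55555555) + ((x >> 1) & 0x55555555)
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)
    x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F)
    x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF)
    x = (x & 0x0000FFFF) + ((x >> 16) & 0x0000FFFF)
    return x


def highest_bit_position(x: int) -> int:
    """Return 0 for a zero word, otherwise 1 + the index of the highest set bit."""
    x &= 0xFFFFFFFF  # treat the input as a 32-bit word
    # Copy the highest set bit into every lower position.
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16
    # The word is now a solid run of ones, so counting them gives the position.
    return popcount32(x)


print(highest_bit_position(1))           # 1
print(highest_bit_position(0x80000000))  # 32
```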

7 comments October 15th, 2007 tonyg

JONESFORTH ported to PowerPC and Mac OS X

A couple of weeks ago, Richard W. M. Jones released JONESFORTH, which I thought was pretty exciting. Today I spent a few hours porting the assembly-language part to PowerPC on Mac OS X 10.3.9. It ended up being 600 non-comment lines of code, and took me about eleven hours in total to write and debug. It runs the standard JONESFORTH prelude, up to and including SEE.

You can download the code here: ppcforth.S.m4.

(It’s also available via darcs: darcs get http://www.eighty-twenty.org/~tonyg/Darcs/jonesforth.)

The assembler-macro tricks that the original i386 version uses are sadly unavailable with the default OS X assembler, so I’ve had to resort to using m4 instead; other than that, it’s more-or-less a direct translation of Richard’s original program. To compile it,

m4 ppcforth.S.m4 > ppcforth.S
gcc -nostdlib -o ppcforth ppcforth.S
rm ppcforth.S

To run it, download the JONESFORTH prelude (save it as jonesforth.f), and

$ cat jonesforth.f - | ./ppcforth 
JONESFORTH VERSION 14641 
OK 

Here’s an example session, decompiling the “ELSE” word:

SEE ELSE
: ELSE IMMEDIATE ' BRANCH , HERE @ 0 , SWAP DUP HERE @ SWAP - SWAP ! ;

I’d like to thank Richard for such an amazingly well-written program: not only is JONESFORTH itself a beautiful piece of software, it’s also an incredibly lucid essay that does a wonderful job of introducing the reader to the concepts and techniques involved in implementing a FORTH.

2 comments October 4th, 2007 tonyg

Proper Unicode support in Erlang RFC4627 (JSON) module

In a previous post I explored some of the options for supporting RFC4627 (JSON) Unicode-in-strings well when mapping to Erlang terms. In the end, I settled on keeping the interface almost unchanged: the only change is that binaries returned from rfc4627:decode are to be interpreted as UTF-8 encoded text now, whereas before their interpretation was less well defined.

The new module is Erlang-RFC4627 version 1.1.0, and is available as a tarball, a debian package, or by browsing online here. You can also get the code using mercurial:

hg clone http://hg.opensource.lshift.net/erlang-rfc4627/

Here are some examples using the new module. First, let’s explore the autodetection of which encoding is being used. In the following example, we see UTF-16, both big- and little-endian, as well as ill-formed and well-formed examples of UTF-8 being passed through the autodetector. (It also supports UTF-32 big- and little-endian.)

Eshell V5.5.5  (abort with ^G)
1> rfc4627:unicode_decode([34,0,228,0,34,0]).
{'utf-16le',"\"ä\""}
2> rfc4627:unicode_decode([0,34,0,228,0,34]).
{'utf-16be',"\"ä\""}
3> rfc4627:unicode_decode([34,228,34]).
** exited: {ucs,{bad_utf8_character_code}} **
4> rfc4627:unicode_decode([34,195,164,34]).
{'utf-8',"\"ä\""}
5> 
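For readers curious how such autodetection can work, RFC 4627 (section 3) observes that the first two characters of a JSON text are always ASCII, so the encoding can be inferred from the pattern of zero octets in the first four octets. The Python sketch below follows that rule. Whether the Erlang module detects encodings in exactly this way is an assumption, the names `detect_json_encoding` and `decode_autodetect` are illustrative, and the fallback to UTF-8 for inputs shorter than four octets is a simplification of this sketch.

```python
# Labels follow the post's output; the values are Python codec names.
CODECS = {
    "utf-32be": "utf-32-be",
    "utf-32le": "utf-32-le",
    "utf-16be": "utf-16-be",
    "utf-16le": "utf-16-le",
    "utf-8": "utf-8",
}

# Zero-octet patterns from RFC 4627 section 3 (True marks a zero octet).
PATTERNS = {
    (True, True, True, False): "utf-32be",
    (True, False, True, False): "utf-16be",
    (False, True, True, True): "utf-32le",
    (False, True, False, True): "utf-16le",
}


def detect_json_encoding(octets) -> str:
    """Guess the encoding of a JSON text from its first four octets."""
    head = bytes(octets[:4])
    if len(head) < 4:
        return "utf-8"  # assumption of this sketch: short inputs are UTF-8
    return PATTERNS.get(tuple(b == 0 for b in head), "utf-8")


def decode_autodetect(octets):
    """Return (encoding label, decoded text); raises on ill-formed input."""
    label = detect_json_encoding(octets)
    return label, bytes(octets).decode(CODECS[label])


print(decode_autodetect([34, 0, 228, 0, 34, 0]))  # ('utf-16le', '"ä"')
print(decode_autodetect([34, 195, 164, 34]))      # ('utf-8', '"ä"')
```

As in the third shell example, the ill-formed input `[34, 228, 34]` is treated as UTF-8 and rejected, here with a `UnicodeDecodeError`.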

Now let’s look at decoding some UTF-8 encoded JSON text into Erlang terms, and vice versa.

5> rfc4627:decode([34,194,128,34]).
{ok,<<194,128>>,[]}
6> rfc4627:encode(<<194,128>>).
[34,194,128,34]
7> rfc4627:encode_noauto(<<194,128>>).
[34,128,34]
8> rfc4627:unicode_encode({'utf-32le',
        rfc4627:encode_noauto(<<194,128>>)}).
[34,0,0,0,128,0,0,0,34,0,0,0]
9> rfc4627:encode_noauto({obj, [{[27700], 123}]}).
[123,34,27700,34,58,49,50,51,125]
10> rfc4627:encode({obj, [{[27700], 123}]}).
"{\"æ°´\":123}"
11> 

Notice, on that final example, that Erlang is printing the final UTF-8 encoded JSON text as if it were Latin-1. This is nothing to worry about: the numbers in the returned list/string are the correct UTF-8 encoding for Unicode code point 27700.
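The same effect can be reproduced outside Erlang. The short Python sketch below (the helper name `utf8_octets` is illustrative) encodes code point 27700 as UTF-8 and then misreads those octets as Latin-1, which yields the same three characters the shell printed.

```python
def utf8_octets(code_point: int) -> list:
    """Return the UTF-8 encoding of one code point as a list of octets."""
    return list(chr(code_point).encode("utf-8"))


octets = utf8_octets(27700)
print(octets)                          # [230, 176, 180]
print(bytes(octets).decode("latin-1")) # æ°´
print(bytes(octets).decode("utf-8"))   # 水
```

The octets are the same in both cases; only the interpretation applied when displaying them differs.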

2 comments October 3rd, 2007 tonyg

Too much mail is bad for you

We received a few reports from users of our Erlang-based RabbitMQ message broker who saw sharp decreases in throughput when putting the broker under heavy load. We subsequently reproduced these results in our lab. This is not what we expected to see: while some performance degradation is inevitable when running a system at its limits, we had carefully designed RabbitMQ to keep such degradation small and gradual. So clearly the system was behaving in ways we had not anticipated.

We eventually tracked down the problem. The lesson is: if you make synchronous calls inside an Erlang process you’d better make sure its message queue is short.

Continue Reading 8 comments October 1st, 2007 matthias

Minimal Erlang SMTP, POP3 server code

Some seven months ago, I built simple Erlang modules for generic SMTP and POP3 services. The idea is that the programmer should instantiate a service, providing callbacks for user authentication and for service-specific operations like handling deliveries, and scanning and locking mailboxes. Originally, I was planning on providing SMTP-to-AMQP and AMQP-to-POP3 gateways as part of RabbitMQ, but I haven’t had the time to seriously pursue this yet.

A snapshot of the code is available as a zip file, or you can browse the code online or retrieve it using darcs:

darcs get http://www.lshift.net/~tonyg/erlang-smtp/

The current status of the code is:

  • SMTP deliveries from Thunderbird work
  • POP3 retrieval from Thunderbird works, but isn’t very solid, because I haven’t implemented the stateful part of mailboxes yet.
  • The SMTP implementation is somewhat loosely based on RFC 2821. It’s what you might generously call minimally conformant (improving this situation is tedious but not difficult). It doesn’t address RFC 2822 in any serious way (yet)
  • The POP3 implementation is based on RFC 1939.
  • SMTP AUTH is not yet implemented (but is not difficult)
  • I can’t recall the details (seven months!), but I think I might have skimped on something relating to POP3 UIDL.
  • Neither module has pluggable callbacks: the SMTP delivery-handler is currently io:format, and the POP3 mailbox and user authentication database are similarly hard-coded.

Patches, bugfixes, contributions, comments and feedback are all very welcome!

Update: a new post summarises changes since this post, including pluggable callbacks etc.

Add comment September 20th, 2007 tonyg

Erlang on Neo1973 cellphone

This evening, after fighting bitbake (in the form of the capricious “insane.bbclass” class definition) for a good few hours, I managed to get Erlang version R11B-5 running on my new cellphone.

Running the interactive Erlang shell on a cellphone is pretty cool. Erlang’s built-in clustering support works fine: I’ve successfully connected an Erlang node on my PC to a node on the phone using the USB ethernet support the phone provides.

The base package compiles down to a bit less than 7MB, which is a bit large. The full suite of libraries is another 22MB or so. It’s certainly possible to fine-tune the packaging process to get a smaller distribution, but for now I’m happy developing against what I have.

Update: I’ve posted my changed build scripts to OpenEmbedded’s bug tracker at bug 3014. Here’s a direct link to the tarball, if anyone would like to try it themselves.

4 comments September 16th, 2007 tonyg

Most exciting programming language I’ve seen in months

This (LTU discussion here) is the most exciting programming language implementation I’ve seen in months. Time to learn more about Forth!

Add comment September 14th, 2007 tonyg
