Posts filed under 'Programming'
Last weekend I finally revisited the diff-in-javascript code I’d written a couple of years back, adding (very simple) patch-like and diff3-like functionality.
On the way, not only did I discover Khanna, Kunal and Pierce’s excellent paper “A Formal Investigation of Diff3“, but I found revctrl.org, the revision-control wiki, which I’m just starting to get my teeth into. I’m looking forward to learning more about merge algorithms.
The code I wrote last weekend is available: just download diff.js. The tools included:
Diff.diff_comm - works like a simple Unix comm(1)
Diff.diff_patch - works like a simple Unix diff(1)
Diff.patch - works like a (very) simple Unix patch(1) (it’s not a patch on Wall’s patch)
Diff.diff3_merge - works like a couple of the variations on GNU’s diff3(1)
Read on for some examples showing the library in action.
Continue Reading May 9th, 2008
tonyg
When LShift first started off in 2000, the only real option for mature, open source version control was CVS. We’ve used CVS for most of our projects since then, and gone on to develop a strong infrastructure for managing CVS-backed projects, including a web interface for viewing versions, a web-based searchable database for related CVS commits (”CVSzilla”) which infers transactions from multiple simultaneous commits, and integration with the Bugzilla bug tracker.
Today, there are many other options, and I’ll discuss six major alternatives here: Subversion, Monotone, darcs, Git, Bazaar, and Mercurial.
Continue Reading April 24th, 2008
Paul Crowley
Barry Pederson’s excellent py-amqplib Python AMQP client comes with a very cute little demo, demonstrating how easy it is to do sophisticated cron-like things with AMQP and RabbitMQ.
As Barry writes in the example, the trick is to “[fire] off simple messages at one-minute intervals to a topic exchange named ‘clock’, with the topic of the message being the local time as ‘year.month.date.dow.hour.minute‘, for example: ‘2007.11.26.1.12.33′, where the dow (day of week) is 0 for Sunday, 1 for Monday, and so on (similar to Unix crontab). A consumer could then bind a queue to the routing key ‘#.0′ for example to get a message at the beginning of each hour.”
Lovely!
February 8th, 2008
tonyg
We’ve been investigating the possibility of an XPath-based routing extension to RabbitMQ, where XPath would be used as binding patterns, and the message structure would be exposed as XML infoset. As part of this work, we’ve been looking at Erlang’s XPath implementation that comes as part of the built-in xmerl library.
This post walks through a couple of simple examples of using Erlang’s XPath implementation to retrieve nodesets matching various criteria.
Continue Reading January 31st, 2008
tonyg
I’ve taken Mike Anderson’s Cairo clock applet demo and updated it ever so slightly for GNU Smalltalk version 2.95h.
I’ve also written a small patch, available here, intended to apply atop fixes-2.95h.patch, which gets the code into a position where it will run without change on both my stock Debian x86 desktop as well as my OS X 10.3.9 box with a prehistoric Fink 0.7.2 and a hand-compiled libcairo.
January 1st, 2008
tonyg
A couple of months ago, I improved our erlang SMTP server code.
Mon Oct 15: Support callbacks and more of the spec.
Support multiple forward paths. Support callbacks for verification and
delivery. Pass domain as well as mailbox for reverse and forward
paths. Cope with improper line termination. Log failures in delivery/verification
callbacks.
Wed Oct 17: Split out smtp_util:strip_crlf; be RFC2821-strict about CRLF
The code is available by browsing or through darcs:
darcs get http://www.lshift.net/~tonyg/erlang-smtp/
December 28th, 2007
tonyg
Joe Armstrong, the inventor of Erlang, paid LShift a visit on Friday. He had kindly agreed to give a short talk to a few of my colleagues. We ended up cramming about twenty people into our meeting room, listening to Joe explain the implications of multicore CPU architectures for programming language design. There were lots of questions from the audience and some interesting discussions, keeping us a all occupied for nearly two hours. Matthew Sackman has posted his thoughts on some of the key points.
We covered a range of other topics too, including Joe’s recent idea of an Erlang/OTP Service Pack. This will be on a shorter release cycle than the main Erlang/OTP distribution, allowing changes and new features to be brought to a wide audience more quickly, with the best bits, hopefully, eventually making it into OTP.
November 17th, 2007
matthias
Sam Ruby examines support for astral-plane characters in various JSON implementations. His post prompted me to check my Erlang implementation of rfc4627. I found that for astral plane characters in utf-8, utf-16, or utf-32, everything worked properly, but the RFC4627-mandated surrogate-pair “\uXXXX” encodings broke. A few minutes hacking later, and:
Eshell V5.5.5 (abort with ^G)
1> {ok, Utf8Encoded, []} =
rfc4627:decode("\"\\u007a\\u6c34\\ud834\\udd1e\"").
{ok,<<122,230,176,180,240,157,132,158>>,[]}
2> xmerl_ucs:from_utf8(Utf8Encoded).
[122,27700,119070]
3> rfc4627:encode(Utf8Encoded).
[34,122,230,176,180,240,157,132,158,34]
4>
Much better.
You can get the updated code using darcs:
darcs get http://www.lshift.net/~tonyg/erlang-rfc4627/
November 16th, 2007
tonyg
What’s a good way of counting the number of bits set in a word? The obvious answer, adding the low bit to an accumulator, shifting right, and repeating, is O(n) in the number of bits in the word. This is a sequential approach - and we can do better, complexity-wise, by using a parallel algorithm. Let’s assume we are using 32-bit words, and that Xn is just such a 32-bit word:
X0 = input word
X1 = (X0 & 0x55555555) + ((X0 >> 1) & 0x55555555)
X2 = (X1 & 0x33333333) + ((X1 >> 2) & 0x33333333)
X3 = (X2 & 0x0F0F0F0F) + ((X2 >> 4) & 0x0F0F0F0F)
X4 = (X3 & 0x00FF00FF) + ((X3 >> 8) & 0x00FF00FF)
X5 = (X4 & 0x0000FFFF) + ((X4 >> 16) & 0x0000FFFF)
total number of set bits = X5
This algorithm is O(log2 n) in the number of bits in a word.
Every ordinary N-bit-word based sequential machine is a disguised N-way, 1-bit SIMD machine with a slightly odd instruction set. Lots more on data-parallel algorithms here.
What about finding which is the highest bit set in a word?
X0 = input word
X1 = X0 or (X0 >> 1)
X2 = X1 or (X1 >> 2)
X3 = X2 or (X2 >> 4)
X4 = X3 or (X3 >> 8)
X5 = X4 or (X4 >> 16)
… and feed X5 through the parallel counter-of-set-bits algorithm above. The resulting number is the index of the highest set bit in the original word, starting from zero.
October 15th, 2007
tonyg
A couple of weeks ago, Richard W. M. Jones released JONESFORTH, which I thought was pretty exciting. Today I spent a few hours porting the assembly-language part to PowerPC on Mac OS X 10.3.9. It ended up being 600 non-comment lines of code in total, and took me about eleven hours in total to write and debug. It runs the standard JONESFORTH prelude, up to and including SEE.
You can download the code here: ppcforth.S.m4.
(It’s also available via darcs: darcs get http://www.eighty-twenty.org/~tonyg/Darcs/jonesforth.)
The assembler-macro tricks that the original i386 version uses are sadly unavailable with the default OS X assembler, so I’ve had to resort to using m4 instead; other than that, it’s more-or-less a direct translation of Richard’s original program. To compile it,
m4 ppcforth.S.m4 > ppcforth.S
gcc -nostdlib -o ppcforth ppcforth.S
rm ppcforth.S
To run it, download the JONESFORTH prelude (save it as jonesforth.f), and
$ cat jonesforth.f - | ./ppcforth
JONESFORTH VERSION 14641
OK
Here’s an example session, decompiling the “ELSE” word:
SEE ELSE
: ELSE IMMEDIATE ‘ BRANCH , HERE @ 0 , SWAP DUP HERE @ SWAP - SWAP ! ;
I’d like to thank Richard for such an amazingly well-written program: not only is JONESFORTH itself a beautiful piece of software, it’s also an incredibly lucid essay that does a wonderful job of introducing the reader to the concepts and techniques involved in implementing a FORTH.
October 4th, 2007
tonyg
Previous Posts