technology from back to front

Rant: The unreasoned Javan

I really hate null!

Reflect on that statement. Apparently Tim has a strong dislike for a concept found in lots of programming languages (even brainiac languages like Haskell) and successfully used in millions of programs. He must be crazy; I wouldn’t like to have a discussion with him about something contentious like tabs versus spaces.


by
tim
on
17/01/12

Rant: King Kong! Misadventures in Ruby meta-programming

Sometimes after a particularly fraught bug stomping session you make a frivolous offhand remark to a colleague, for example “I will write a macro that converts lisp definitions in prefix form so that arithmetic looks like how it was taught to you in school” or “I won’t let my unit test really be an integration test ever again”. Today after wrestling with Ruby I mentioned a mild dislike for monkey patching as often used in Ruby and the idea for King Kong was born.

If you don’t like monkey patching in Ruby, then King Kong is the solution. He is a very large primate who likes to have his own way. If you define a class and then monkey patch it, King Kong will eat it: it will be gone, no longer available for your use. Fortunately King Kong can’t use source code control systems (yet!) so you can correct your error and ponder his subtle ways.

We shall start by writing a spec:

require "rspec"
require "king-kong"

class X
    def x()
        "x"
    end
end

describe "King Kong" do

    it "does not allow monkey patching" do
        Object.const_defined?("X").should == true

        class X
            def x()
                "not x"
            end
        end

        Object.const_defined?("X").should == false
    end
end

Now we define King Kong!

class Class
    @@method_history = {}

    def method_added(method_name)
        @@method_history[self] ||= {}
        if @@method_history[self][method_name] then
            puts "King kong is the top primate\nHe has eaten your class!"
            Object.send(:remove_const, self.name)
        else
            @@method_history[self][method_name] = true
        end
    end
end

Run the spec, spec passes, job done! This code is licensed as is and probably shouldn’t be run in production, or even your QA environment, but you are welcome to run it on your development machine for minimal comedic effect.
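For anyone curious how the hook works without sacrificing a class, here is a minimal sketch of a gentler cousin of King Kong. The names RedefinitionTracker and Gorilla are hypothetical and do not appear in the original code; the only thing relied on is Ruby's method_added hook, which Ruby calls on a class each time an instance method is defined in it. Instead of eating the class, this version just keeps a list of what was redefined.

```ruby
# Hypothetical module (not part of King Kong): records redefinitions
# instead of removing the offending class.
module RedefinitionTracker
  # Shared log of [class name, method name] pairs.
  def self.redefinitions
    @redefinitions ||= []
  end

  # Ruby calls this on the extending class whenever an instance method
  # is defined in it, including when an existing one is redefined.
  def method_added(method_name)
    @seen_methods ||= {}
    if @seen_methods[method_name]
      RedefinitionTracker.redefinitions << [name, method_name]
    else
      @seen_methods[method_name] = true
    end
    super
  end
end

class Gorilla
  extend RedefinitionTracker

  def roar
    "roar"
  end
end

# Monkey patch: reopening the class and redefining roar is logged.
class Gorilla
  def roar
    "squeak"
  end
end

patched = RedefinitionTracker.redefinitions
```

Unlike the version above, this only watches classes that opt in with extend, so it is a sketch of the mechanism rather than a replacement for the primate.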

King Kong is a fine example of Ruby meta-programming and an excellent addition to your arsenal of offensive programming tools, taking a large amount of inspiration from tools like Guantanamo. Marvel at how easy it is to completely annihilate a class from existence in Ruby; if this were Java you would have to give a class loader or something a walloping with a GreasySpannerFactory to achieve the same thing.

Remember, sometimes monkey patching is a solution and sometimes it is the cause of the problem!

If you want to take a more serious approach to this problem you might want to look here.

by
tim
on
16/11/11

Rant: Mustache for your mail merge

At LShift we like to program on blackboards using untyped lambda calculus, and we enter code into a computer only once we have a truly generic solution to a problem. However, most of the time we need to earn money so that we can eat and wear clothes other than LShift t-shirts - this usually involves compromising our principled approach and using "real" programming languages and libraries.

This can be quite painful! In my experience all real world projects need to produce output for a customer and this usually uses a technology called "mail merge" but as we are technical people we use the more technical name of "templating" (If you are elderly and not a computer scientist it is best to think of it as mail merge, this will have been an activity you read about in PCW magazines when you were a spotty teenager and hoped to never carry out and now looking back over your illustrious career you have realised that you have spent most of your time performing mail merge poorly). Templating sucks!

As we are rapidly moving into Movember I will briefly describe templating with mustache (this is how colonial people spell moustache, but searching for moustache on github will produce an excellent Clojure library not a templating system). I propose that if you are programming in Ruby, Mustache sucks less than ERB and will support my proposal with some examples.

As I said previously, templating sucks! It especially sucks if you care about the formatting of your output, which is why I call templating mail merge. If you received a printed letter from your electricity supplier and it went,

Dear    Valued 
     Customer, 
Your request for
 a further three phase supply to your premises
is denied
  as we fear that you have upset the villagers with your reanimation experiments.

you would probably be wondering about the extra spaces and carriage returns in the text. Most of the time in HTML you wouldn’t notice because browsers eat whitespace; however, if you were templating printed material or code it can be quite painful.

Let’s look at an example of this using ERB. We will define a simple Ruby class to hold sections that have a name and may contain subsections.

class Section
  attr_reader :name, :subsections
  def initialize(name, subsections = nil)
    @name  = name
    @subsections = subsections
  end
end

Here is the ERB to render the output:

<% s.sections.each do |section| %>
This is section <%= section.name %>
<% if section.subsections %>
  <% section.subsections.each do |subsection| %>
  This is subsection <%= subsection.name %>
  <% end %>
<% end %>
<% end %>

Which I render like this:

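require "erb"

# The template expects an object `s` that responds to `sections` to be in scope.
template1 = ERB.new(File.read("simple1.erb"), trim_mode: '<>')
puts "======================================"
puts template1.result(binding)
puts "======================================"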

To produce:

======================================
This is section one
This is section two

  This is subsection two-a

This is section three

======================================

That output doesn’t look brilliant to me! I can feel a bug being raised in the issue tracker already; it really shouldn’t have those empty lines in the output.

If I modify the ERB I can do better:

<% s.sections.each do |section| %>
This is section <%= section.name %>
<% if section.subsections %>
<% section.subsections.each do |subsection| %>
  This is subsection <%= subsection.name %>
<% end %>
<% end %>
<% end %>

Which produces:

======================================
This is section one
This is section two
    This is subsection two-a
This is section three
======================================

Perfect output, but the ERB template looks a bit fragile: three end tags all in a row with no indentation to guide you will probably lead to difficult maintenance. The ERB templates I am currently maintaining are much more complex than the one I have used here and indentation really does help to make them readable.
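To try the two ERB templates without separate template files, something like the following self-contained sketch should do. It is an approximation under a few assumptions: the templates are inlined as heredocs, a Struct stands in for the Section class, the template refers to a local called sections rather than s.sections, and the helper name render is made up for the example. It uses the '<>' trim mode, which drops the newline only for lines that start with <% and end with %>.

```ruby
require "erb"

# Stand-in for the Section class, so the sketch runs on its own.
Section = Struct.new(:name, :subsections)

# Control tags flush against the left margin: every tag line is trimmed.
FLAT_TEMPLATE = <<~ERB_SOURCE
  <% sections.each do |section| %>
  This is section <%= section.name %>
  <% if section.subsections %>
  <% section.subsections.each do |subsection| %>
    This is subsection <%= subsection.name %>
  <% end %>
  <% end %>
  <% end %>
ERB_SOURCE

# Indented control tags: the lines no longer start with <%, so the
# leading spaces and the newline leak into the output.
INDENTED_TEMPLATE = <<~ERB_SOURCE
  <% sections.each do |section| %>
  This is section <%= section.name %>
  <% if section.subsections %>
    <% section.subsections.each do |subsection| %>
    This is subsection <%= subsection.name %>
    <% end %>
  <% end %>
  <% end %>
ERB_SOURCE

# Hypothetical helper: renders a template with `sections` as a local.
def render(template_source, sections)
  ERB.new(template_source, trim_mode: "<>").result(binding)
end

sections = [
  Section.new("one"),
  Section.new("two", [Section.new("two-a")]),
  Section.new("three")
]

flat = render(FLAT_TEMPLATE, sections)
indented = render(INDENTED_TEMPLATE, sections)

puts flat
puts "======================================"
puts indented
```

The flat template renders four lines with nothing in between, while the indented one picks up extra whitespace-only lines around the subsection, which is the trade-off between readable templates and clean output described above.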

Now for mustache! The template looks like this:

{{#sections}}
This is section {{name}}
  {{#subsections}}
  This is subsection {{name}}
  {{/subsections}}
{{/sections}}

We have to do a small amount more work to generate the output like this:

require "mustache"

class Simple < Mustache
  self.template_path = File.dirname(__FILE__)

  def sections
    Array[
      Section.new("one"),
      Section.new("two", Array[Section.new("two-a")]),
      Section.new("three")]
  end
end

s = Simple.new
puts "======================================"
puts s.render
puts "======================================"

The code produces perfect output:

======================================
This is section one
This is section two
    This is subsection two-a
This is section three
======================================

The template also looks more maintainable than the equivalent ERB. However, you can still break the output quite easily with a very small change like this:

{{#sections}}
This is section {{name}}
  {{#subsections}}This is subsection {{name}}{{/subsections}}
{{/sections}}

Which produces this broken output:

======================================
This is section one

This is section two
    This is subsection two-a
This is section three

======================================

So it is best to keep your mustache sections on separate lines to avoid extraneous whitespace. In summary, mustache is probably easier and will produce cleaner output than ERB if you are performing a mail merge. Additionally, mustache is cross-platform and has support for a large range of esoteric languages if you are tired of programming in Ruby.

by
tim
on
31/10/11

Rant: “Nearby art”: using the V&A API and geolocation

A little while back, I was informed that the V&A had an API. To be honest, my first response to this was “why on earth?”. There have been a few similar APIs coming out recently from organisations, with some sort of “build it and they’ll come” expectations, i.e. expecting that all they have to do is provide the API and all us developers will automagically build them shiny apps for free. If you’re TfL, then this kinda works, but it’s not so true for a lot of places.

Having had this initial reaction, I still decided to dig through the documentation a bit, and spotted an interesting nugget - they’ll let you do geospatial searches. I’d been tinkering around with the idea of playing with this, especially for use with my shiny new Android phone, and I had an idea for a little app to show you “nearby art”, i.e. search with the V&A’s API for the nearest bit of art.

I did this mostly in Javascript, doing XMLHttpRequests for JSON chunks of the API. There’s also a block of Python code that needs to run on a server, but that’s entirely to get around the issue of XMLHttpRequest only allowing same-server requests. The app first uses navigator.geolocation (official spec, easier documentation) to get the user’s location, then does two V&A queries - the first to get a list of local objects, and the second to get more info on the first object.

One thing you have to be careful about is that this can break in various ways. The most obvious is a lack of navigator.geolocation (any version of IE, and all not-latest versions of most other browsers), and another is if the user denies access to their location data. This does make navigator.geolocation unsuitable for general use currently, but it’s a useful source of data when there is support.

The full app is over here and the source is here.

by
Tom Parker
on
16/07/10

Rant: Yahoo doesn’t know what an email address is

Many websites refuse to accept email addresses of the form myusername+sometext@gmail.com, despite the fact that the +sometext is perfectly legitimate [1] and is an advertised feature Gmail offers for creating pseudo-single-use email addresses from a base email address.

My guess is that the developers of these sites think, because they’re either lazy or incompetent, that email addresses have more restrictions than they in fact have. It’s reasonable (and fairly easy) these days to check the syntax of the DNS part of an email address, because few people use non-DNS or non-SMTP transfer methods anymore, but the mailbox part is extremely flexible and hard to check accurately. A sane thing to do is just trust the user, and send a test mail to validate the address.
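As a rough illustration of that approach, here is a minimal Ruby sketch that checks only the shape of the domain part and accepts any non-empty mailbox part. The helper name plausible_email? and the DOMAIN_LABEL pattern are made up for this example, and requiring at least two dot-separated labels is an assumption (it rejects bare hosts such as localhost). Passing this check says nothing about whether the mailbox exists; the test mail still does the real validation.

```ruby
# One DNS label: letters, digits and hyphens, not starting or ending
# with a hyphen, at most 63 characters.
DOMAIN_LABEL = /\A[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\z/i

# Hypothetical helper: lenient check that trusts the mailbox part.
def plausible_email?(address)
  # Split on the last "@" so the mailbox part may contain anything.
  local, at, domain = address.rpartition("@")
  return false if at.empty? || local.empty? || domain.empty?

  labels = domain.split(".", -1)
  labels.length >= 2 && labels.all? { |label| label.match?(DOMAIN_LABEL) }
end

accepted = plausible_email?("myusername+sometext@gmail.com")
rejected = plausible_email?("myusername+sometext@-gmail.com")
```

Anything that passes would then be sent a confirmation message before being treated as a working address.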

I picked on Yahoo in the title of this post: Yahoo are by no means the only offender, but I just signed up for a Yahoo account, so they’re for me the most recent. Their signup form also refused to provide any guidance about why they were rejecting the form submission: I had to use my previous experience of sites wrongly rejecting valid email addresses to guess what the problem might be. Fail.


Footnote 1: According to my best reading of the relevant RFCs, anyway. See the definition of dot-atom in section 3.2.4 of RFC 2822, referenced in this context by section 3.4.1.

by
tonyg
on
17/03/09

Rant: Adventures with the Fisher Price My First Firewall

I’m writing this blog entry for therapeutic reasons. Everything you need to know is in the link below. Readers are invited to share the worst anti-features they have found in network devices by posting a comment.

I had a strange problem sending email from a host. I first discovered that trac couldn’t send messages via a remote smtp server. It would just hang indefinitely. So I decided it was better to set up exim on the local box, and have trac send mail using that - at least it wouldn’t hang.

Unfortunately, exim wouldn’t send messages either.

At this stage, we were using the same smtp server - exim was configured to use it as a smart host.

We discounted any firewall problems immediately, because we could establish a connection. We didn’t immediately notice that we didn’t get an initial message from the server. When we did, we assumed it was because the server wasn’t sending it for some reason, and started checking on things like DNS.

This got us nowhere.

Then I noticed that if I typed HELO into the connection I did get a response. Eventually I noticed I could type anything into the connection, and get the initial 220 back from exim.

At this point, I decided I would use tshark to check on what the smtp server was doing, and discovered that actually, it was sending the 220, and resending it a good few times too; it just never turned up at our end.

This turned my attention to the Zyxel firewall we were using.

It turns out that a ‘feature’ of the firewall designed to prevent spam prevented us receiving anything from the server on the connection until we had sent something on the connection. This feature is particularly ridiculous, since most spam mail clients don’t bother to try and synchronize with the server, so only spam would get through while legitimate clients would not.
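A quick way to spot this kind of behaviour is to connect and wait for the greeting without sending a single byte. The sketch below is an assumption about how one might script that check in Ruby; the function name greeting_before_send is made up, and the host in the usage note is a placeholder. It returns the first line the server sends, or nil if nothing arrives within the timeout.

```ruby
require "socket"

# Hypothetical diagnostic: wait for the server to speak first.
# A well-behaved SMTP server sends a "220 ..." line on connect.
def greeting_before_send(host, port, wait_seconds = 5)
  socket = TCPSocket.new(host, port)
  begin
    # Block until the socket is readable or the timeout expires;
    # nothing is written to the connection at any point.
    ready = IO.select([socket], nil, nil, wait_seconds)
    ready ? socket.gets : nil
  ensure
    socket.close
  end
end
```

Calling greeting_before_send("mail.example.com", 25) from behind the firewall and again from outside it would show whether the 220 is being held back somewhere along the path.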

We gather a firmware upgrade has solved this problem, but letting a firewall release into the wild without checking you could send email through it is a spectacular screw up - enough to convince me never to buy from this brand again, anyway.

Thanks Simon, for dubbing this product the ‘Fisher Price My First Firewall’.

Thanks Lucas Beeler for blogging about it here.

Thanks Zyxel for wrecking my day.

by
david
on
10/09/08

Rant: Smalltalk vs. Javascript; Diff and Diff3 for Squeak Smalltalk

Many of my recent posts here have discussed the diff and diff3 code I wrote in Javascript. A couple of weekends ago I sat down and translated the code into Squeak Smalltalk. The experience of writing the “same code” for the two different environments let me compare them fairly directly.

To sum up, working with Smalltalk was much more pleasant than working with Javascript, and produced higher-quality code (in my opinion) in less time. It was nice to be reminded that there are some programming languages and environments that are actually pleasant to use.

The biggest win was Smalltalk’s collection objects. Where stock Javascript limits you to the non-polymorphic

for (var index = 0; index < someArray.length; index++) {
  var item = someArray[index];
  /* do something with item, and/or index */
}

Smalltalk permits

someCollection do: [:item | "do something with item"].

or, alternatively

someCollection withIndexDo:
    [:item :index | "do something with item and index"].

Smalltalk collections are properly object-oriented, meaning that the code above is fully polymorphic. The Javascript equivalent only works with the built-in, not-even-proper-object Arrays.

Of course, I could use one of the many, many, many, many Javascript support libraries that are out there; the nice thing about Smalltalk is that I don’t have to find and configure an ill-fitting third-party bolt-on collections library, and that because the standard library is simple yet rich, I don’t have to worry about potential incompatibilities between third-party libraries, such as can occur in Javascript if you’re mixing and matching code from several sources.

Other points that occurred to me as I was working:

  • Smalltalk has simple, sane syntax; Javascript… doesn’t. (The number of times I get caught out by the semantics of “this” alone…!)
  • Smalltalk has simple, sane scoping rules; Javascript doesn’t. (O, for lexical scope!)
  • Smalltalk’s uniform, integrated development tools (including automated refactorings and an excellent object explorer) helped keep the code clean and object-oriented.
  • The built-in SUnit test runner let me develop unit tests alongside the code.

The end result of a couple of hours’ hacking is an implementation of Hunt-McIlroy text diff (that works over arbitrary SequenceableCollections, and has room for alternative diff implementations) and a diff3 merge engine, with a few unit tests. You can read a fileout of the code, or use Monticello to load the DiffMerge module from my public Monticello repository. [Update: Use the DiffMerge Monticello repository on SqueakSource.]
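To give a flavour of what a diff over arbitrary sequences involves, here is a small Ruby sketch of the longest-common-subsequence step that text diffs are built on. This is not the Hunt-McIlroy algorithm and not a translation of the Smalltalk code: it is the plain dynamic-programming table, chosen because it is short, and the function name lcs is just for this example. It works on any pair of arrays whose elements can be compared with ==.

```ruby
# Longest common subsequence of two arrays, using a full
# dynamic-programming table (simple, but quadratic in time and space).
def lcs(a, b)
  # table[i][j] holds the LCS length of a[0...i] and b[0...j].
  table = Array.new(a.length + 1) { Array.new(b.length + 1, 0) }
  a.each_index do |i|
    b.each_index do |j|
      table[i + 1][j + 1] =
        if a[i] == b[j]
          table[i][j] + 1
        else
          [table[i][j + 1], table[i + 1][j]].max
        end
    end
  end

  # Walk back from the bottom-right corner to recover the elements.
  result = []
  i = a.length
  j = b.length
  while i > 0 && j > 0
    if a[i - 1] == b[j - 1]
      result.unshift(a[i - 1])
      i -= 1
      j -= 1
    elsif table[i - 1][j] >= table[i][j - 1]
      i -= 1
    else
      j -= 1
    end
  end
  result
end

common = lcs(%w[a b c d], %w[a c d e])
```

Elements of the first sequence that are missing from the result count as deletions, and elements of the second that are missing count as insertions.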

If Monticello didn’t already exist, it’d be a very straightforward matter indeed to build a DVCS for Smalltalk from here. I wonder if Spoon could use something along these lines?

It also occurred to me it’d be a great thing to use OMeta/JS to support the use of

<script type="text/smalltalk">"<![CDATA["
  (document getElementById: 'someId') innerHTML: '<p>Hello, world!</p>'
"]]>"</script>

by compiling it to Javascript at load-time (or off-line). Smalltalk would make a much better language for AJAX client-side programming.

by
tonyg
on
01/07/08

Rant: E4X: Not as awful as I thought

Long, long ago, I complained about various warts and infelicities in E4X, the ECMAScript extensions for generating and pattern-matching XML documents. It turns out that two of my complaints were not well-founded: sequence-splicing is supported, and programmatic construction of tags is possible.

Firstly (and I’m amazed I didn’t realise this at the time, as I was using it elsewhere), it’s not a problem at all to splice in a sequence of items, in the manner of Scheme’s unquote-splicing; here’s a working solution to the problem I set myself:

function buildItems() {
  return <>
           <item>Hello</item>
           <item>World!</item>
         </>;
}
var doc = <mydocument>{buildItems()}</mydocument>;

You can even use real Arrays (which is what I tried and failed to do earlier), by guerilla-patching Array.prototype:

Array.prototype.toXMLList = function () {
    var x = <container/>;
    for (var i = 0; i < this.length; i++) {
        x.appendChild(this[i]);
    }
    return x.children();
}
function buildItems() {
    return [<item>Hello</item>,
            <item>World!</item>].toXMLList();
}
var doc = <mydocument>{buildItems()}</mydocument>;

Programmatic construction of tags is done by use of the syntax for plain old unquote, in an unusual position: inside the tag’s angle-brackets:

var tagName = "p";
var doc = <{tagName}>test</{tagName}>;

So in summary, my original expectation that E4X should turn out to be very quasiquote-like wasn’t so far off the mark. It’s enough to get the basics done (ignoring for the minute the problems with namespace prefixes), but it’s still a bit of a bolt-on afterthought; it would have been nice to see it better integrated with the rest of the language.

by
tonyg
on
07/05/08

Rant: .NET is an endless supply of fascinating puzzles

In C, size_t is unsigned. In Java, there are no unsigned fixed-width pseudointegral types, so it can perhaps be forgiven for having an array’s length field be signed. In .NET, however, which has unsigned ints, an array’s length field is also signed. What could it possibly mean to have a length less than zero?

by
tonyg
on
19/09/07

Rant: Closing over context still not easy in mainstream languages, Film at 11

I find it fascinating that after so many decades of support for closures, we’re still stuck in a C-style mentality of passing function-pointers that take an explicit context argument rather than a proper closure object. Witness the design of .NET’s Type.FindInterfaces method:

public virtual Type[] FindInterfaces (TypeFilter filter,
                                      Object filterCriteria);

The TypeFilter argument is a delegate. The Object argument is context that the delegate may require! This is pretty much exactly the old-school C-style way of implementing closures:

/* Yes, pretty crude translation, I know */
TypeArray find_interfaces(int (*type_filter)(Type*, void*),
                          void *argument);

Smalltalk (and Lisps) would do it in the natural way, with a block (a closure):

someType selectInterfaces: [:interface | ... ]
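The same contrast can be sketched in Ruby, which has both callable objects and blocks. Everything here is illustrative: the INTERFACES list and the two method names are hypothetical stand-ins, not a binding to the .NET API. The first method mimics the explicit-context style, the second lets the block close over whatever it needs.

```ruby
# Hypothetical stand-in for the interfaces of some type.
INTERFACES = %w[IDisposable IEnumerable IComparable].freeze

# Explicit-context style: the filter is handed its context as an
# extra argument, much like a function pointer plus void pointer.
def find_interfaces(filter, criteria)
  INTERFACES.select { |interface| filter.call(interface, criteria) }
end

# Closure style: the block captures its own context.
def select_interfaces(&block)
  INTERFACES.select(&block)
end

prefix = "IE"

starts_with = ->(interface, criteria) { interface.start_with?(criteria) }
explicit = find_interfaces(starts_with, prefix)

# No context parameter needed: the block simply refers to prefix.
closed_over = select_interfaces { |interface| interface.start_with?(prefix) }
```

Both calls select the same interfaces; the difference is only in who has to carry the context around.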

Lisp 1.5, complete with support for lexical closures, appeared in 1959. It’s 2007. That’s forty-eight years.

by
tonyg
on
11/09/07
2000-9 LShift Ltd, 1st Floor Office, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK +44 (0)20 7729 7060