Archive for June 24th, 2006

E4X: I want my S-expressions back

E4X (ECMAScript for XML) is a new ECMA standard (ECMA-357) specifying an extension to ECMAScript for streamlining work with XML documents.

It adds objects representing XML to ECMAScript, and extends the syntax to allow literal XML fragments to appear in code. It also supports a very XPath-like notation for use in extracting data from XML objects. So far, so good - all these things are somewhat useful. However, there are serious problems with the design of the extension.

If E4X objects were real objects, if there were a means of splicing a sequence of child nodes into XML literal syntax, and if E4X supported XML namespace prefixes properly, most of my objections would be dealt with. As it stands, the overall verdict is “clunky at best”.

These are my main complaints:

  • It doesn’t do anything like Scheme’s unquote-splicing, so producing XML objects with E4X means building them up by mutation, which is verbose, error-prone and dangerous in concurrent settings.

    There seems to be no way of splicing in a sequence of items - I’d like to do something like the following:

    function buildItems() {
      return [<item>Hello</item>,
              <item>World!</item>];
    }
    var doc = <mydocument>{buildItems()}</mydocument>;
    

    and have doc contain

    <mydocument>
      <item>Hello</item>
      <item>World!</item>
    </mydocument>
    

    What actually results is the more-or-less useless

    <mydocument>Hello,World!</mydocument>
    

    The closest I can get to the result I’m after is

    function buildItems(n) {
      n.mydocument += <item>Hello</item>;
      n.mydocument += <item>World!</item>;
    }
    var doc = <mydocument></mydocument>;
    buildItems(doc);
    
  • It’s full of redundant redundancy - it’s as verbose as XML, when you can do so much better.

  • There’s no toXML() method (or similar) for use in papering over the yawning chasm between the XML objects and the rest of the language: you can’t even make a Javascript object able to seamlessly render itself to XML.

  • The new types E4X introduces aren’t even proper objects - they’re a whole new class of primitive datum!

  • Because they’re not proper objects, you can’t extend the system. You ought to be able to implement an interface on your own objects and benefit from the language’s XPath-like searching and filtering operations. E4X is so close to offering a comprehension facility for Javascript, but it’s been short-sightedly restricted to a single class of non-extensible primitives.

  • You can’t even construct XML tags programmatically! If the name of the tag doesn’t appear literally in your Javascript code, you’re out of luck (unless you resort to eval…) [[Update: I was wrong about this - you can write <{expr}> and have the result of evaluating expr substituted into the tag.]]

  • E4X XML objects have no notion of namespace prefixes (which are required for quality implementations of XPath and anything to do with XML signatures). Prefixes only turn up in the API as a means of producing (namespaceURI,localname) pairs. In principle this is how it should be, but there’s already broken software out there that depends on prefixes being preserved. By not supporting prefixes properly, E4X precludes ECMAScript from being used for XML signatures or for ECMAScript-native XPath implementations.
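
To make the first and third complaints concrete, here is a minimal sketch, in plain modern JavaScript rather than E4X, of the behaviour being asked for: child sequences are spliced in place, and ordinary objects can render themselves. The helpers `el`, `escapeText` and `render`, and the `toXML` method name, are all hypothetical; nothing here is part of ECMA-357.

```javascript
// Hypothetical helpers, not part of E4X: build a plain-object tree and
// render it to an XML string.

// Escape the characters that are unsafe in XML text content.
function escapeText(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an element node. Array children are spliced in place (like
// Scheme's unquote-splicing) instead of being stringified.
function el(name, ...children) {
  return { name, children: children.flat(Infinity) };
}

// Render a node: objects with a toXML() method are asked to describe
// themselves, elements recurse, and everything else becomes escaped text.
function render(node) {
  if (node !== null && typeof node === "object") {
    if (typeof node.toXML === "function") return render(node.toXML());
    if (typeof node.name === "string" && Array.isArray(node.children)) {
      const body = node.children.map(render).join("");
      return "<" + node.name + ">" + body + "</" + node.name + ">";
    }
  }
  return escapeText(node);
}

function buildItems() {
  return [el("item", "Hello"), el("item", "World!")];
}

// An ordinary object that knows how to present itself as XML.
const point = {
  x: 1,
  y: 2,
  toXML() {
    return el("point", el("x", this.x), el("y", this.y));
  },
};

const doc = el("mydocument", buildItems(), point);
console.log(render(doc));
```

The array returned by `buildItems()` ends up as two `item` children of `mydocument` rather than as the string `Hello,World!`, and `point` is rendered through its own `toXML()` method.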

In my opinion, E4X violates several programming language design principles: most importantly, those of regularity, simplicity and orthogonality, but also preservation of information, automation and structure. SXML, perhaps in combination with eager comprehensions, provides a far superior model for producing and consuming XML. Sadly, there’s no real alternative for ECMAScript yet - we’re limited either to library extensions, or to using the DOM without any syntactic or library support at all.
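
For comparison, the SXML idea carries over to plain JavaScript data fairly directly: an element is just an array whose first entry is the tag name. The following is only a sketch under that assumption. `serialize` and `childrenNamed` are hypothetical helpers, and attributes and namespaces are left out for brevity.

```javascript
// SXML-style sketch: an element is [tagName, ...children]; anything that is
// not an array is treated as text.

function serialize(node) {
  if (!Array.isArray(node)) {
    // Text node: escape the characters that are special in XML content.
    return String(node)
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;");
  }
  const [tag, ...children] = node;
  return "<" + tag + ">" + children.map(serialize).join("") + "</" + tag + ">";
}

// Select the direct child elements with a given tag name, using an
// ordinary array filter over ordinary data.
function childrenNamed(node, tag) {
  return node.slice(1).filter((c) => Array.isArray(c) && c[0] === tag);
}

const words = ["Hello", "World!"];

// The spread operator splices a computed sequence of children into place.
const sxmlDoc = ["mydocument", ...words.map((w) => ["item", w]), ["note", "done"]];

console.log(serialize(sxmlDoc));
console.log(childrenNamed(sxmlDoc, "item").map((item) => item[1]));
```

Because the document is ordinary data, producing it with `map` and consuming it with `filter` needs no special syntax or new primitive types.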

5 comments June 24th, 2006 tonyg

Bruce J. MacLennan’s Programming Language Design Principles

As far as I can tell, no-one on the web has yet summarised in a single page all of the design principles MacLennan develops in his excellent Principles of Programming Languages (2nd edition, 1986, ISBN 0-03-005163-0), so here they are. I hope they’re as useful to others as they have been to me.

  • Abstraction: Avoid requiring something to be stated more than once; factor out the recurring pattern.
  • Automation: Automate mechanical, tedious, or error-prone activities.
  • Defense in Depth: Have a series of defences so that if an error isn’t caught by one, it will probably be caught by another.
  • Information Hiding: The language should permit modules designed so that (1) the user has all of the information needed to use the module correctly, and nothing more; and (2) the implementor has all of the information needed to implement the module correctly, and nothing more.
  • Labeling: Avoid arbitrary sequences more than a few items long. Do not require the user to know the absolute position of an item in a list. Instead, associate a meaningful label with each item and allow the items to occur in any order.
  • Localised Cost: Users should only pay for what they use; avoid distributed costs.
  • Manifest Interface: All interfaces should be apparent (manifest) in the syntax.
  • Orthogonality: Independent functions should be controlled by independent mechanisms.
  • Portability: Avoid features or facilities that are dependent on a particular machine or a small class of machines.
  • Preservation of Information: The language should allow the representation of information that the user might know and that the compiler might need.
  • Regularity: Regular rules, without exceptions, are easier to learn, use, describe, and implement.
  • Security: No program that violates the definition of the language, or its own intended structure, should escape detection.
  • Simplicity: A language should be as simple as possible. There should be a minimum number of concepts, with simple rules for their combination.
  • Structure: The static structure of the program should correspond in a simple way to the dynamic structure of the corresponding computations.
  • Syntactic Consistency: Similar things should look similar; different things different.
  • Zero-One-Infinity: The only reasonable numbers are zero, one, and infinity.
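
As one small illustration of how these principles read in practice, here is a sketch of the Labeling principle in JavaScript. The example and its function names (`makeRectPositional`, `makeRect`) are made up for this page and do not come from MacLennan’s book.

```javascript
// Positional version: the caller must remember which slot means what.
function makeRectPositional(x, y, width, height) {
  return { x, y, width, height };
}

// Labeled version: each item carries a meaningful name, and the items
// may be supplied in any order.
function makeRect({ x = 0, y = 0, width, height }) {
  return { x, y, width, height };
}

const positional = makeRectPositional(10, 20, 300, 150);
const labeled = makeRect({ height: 150, width: 300, y: 20, x: 10 });

// Both calls describe the same rectangle.
console.log(JSON.stringify(positional) === JSON.stringify(labeled));
```

The labeled call stays readable at the call site and keeps working if the caller supplies the fields in a different order.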

2 comments June 24th, 2006 tonyg

Overview of Javascript modes for Emacs

Emacsen.org has a nice roundup of the (apparently only) four javascript-mode implementations for Emacs. I went for number three, Karl Landström’s javascript.el, and it’s been working very well.

1 comment June 24th, 2006 tonyg
