
Transforming Ruby DSLs

Ruby excels at “embedded” DSLs – domain-specific languages that are simultaneously plain Ruby and yet distinctly their own. RSpec springs to mind as an excellent example. At any rate, I have a DSL that recently underwent a fairly invasive change, and I wanted to automate moving model descriptions from the old format to the new.

The DSL describes domain objects in a language-agnostic fashion. For example:

in_namespace('Outer') {
  in_namespace('Inner') {
    define('An Inner Object') {
      field 'name', :string
      field 'value', :integer
    }
  }
  define('An Outer Object') {
    field 'name', :string
    field 'wrapping', 'Inner.An Inner Object'
  }
}

defines a pair of objects, one designed to wrap around the other. Using this language, you can generate data objects in a language of your choice: Smalltalk, perhaps, or Haskell.

The DSL has one extra bit: because the language describes a wire contract between two services, it’s sensible to version the entities, so that one can quickly see that two services have gotten out of sync: one or the other needs upgrading, for instance. So our domain model might look like this:

in_namespace('Outer', :version => "1.0") {
  in_namespace('Inner') {
    define('An Inner Object') {
      field 'name', :string
      field 'value', :integer
    }
  }
  define('An Outer Object', :version => "2.0") {
    field 'name', :string
    field 'wrapping', 'Inner.An Inner Object'
  }
}

This works pretty well: the wrapping object has version “2.0” while the wrapped object inherits its version, “1.0”, from its namespace. But imagine a namespace with a large number of entities: it’s very easy to accidentally upgrade all the objects in the namespace when you only meant to upgrade some. So the per-namespace versioning needed to disappear, forcing the user to state the version explicitly on every definition. Clearly, in a large model, converting by hand would be an onerous task.

However, this is still plain Ruby, and there just happens to be a Ruby-to-AST translator lying around, so it makes sense to manipulate the AST directly rather than mess around with crazy nonsense like regular expressions. And given that I just happen to have a Ruby zipper library capable of traversing arbitrary hierarchical structures, a plan forms…

Getting the AST is simplicity itself:

require 'rubygems'
require 'sexp'
require 'parse_tree'
require 'zipr/zipper'


filename = "my-model.rb"
file = File.read(filename)
sexp = Sexp.from_array(ParseTree.new(false).parse_tree_for_string(file, filename).first)
# I use ParseTree here, but really you want to use ruby_parser:
# it works in Ruby 1.9, for starters.

which gives us a lovely S-expression:

s(:iter,
 s(:fcall,
  :in_namespace,
  s(:array, s(:str, "Outer"), s(:hash, s(:lit, :version), s(:str, "1.0")))),
 nil,
 s(:block,
  s(:iter,
   s(:fcall, :in_namespace, s(:array, s(:str, "Inner"))),
   nil,
   s(:iter,
    s(:fcall, :define, s(:array, s(:str, "An Inner Object"))),
    nil,
    s(:block,
     s(:fcall, :field, s(:array, s(:str, "name"), s(:lit, :string))),
     s(:fcall, :field, s(:array, s(:str, "value"), s(:lit, :integer)))))),
  s(:iter,
   s(:fcall,
    :define,
    s(:array,
     s(:str, "An Outer Object"),
     s(:hash, s(:lit, :version), s(:str, "2.0")))),
   nil,
   s(:block,
    s(:fcall, :field, s(:array, s(:str, "name"), s(:lit, :string))),
    s(:fcall,
     :field,
     s(:array, s(:str, "wrapping"), s(:str, "Inner.An Inner Object")))))))

All that remains is (a) transforming the AST and (b) writing the AST out again as Ruby source. The latter’s pretty boring, so let’s just stick with the former. We need to do two things: select certain nodes, and transform them. So given some helper methods added to Sexp and Enumerable, we need only say

class Zipr::Zipper
  def version_all_objects(default_version)
    current_version = default_version

    # Recall that Zipper's map returns a structure of the same
    # shape as that over which we iterate (as opposed to
    # Enumerable's map, which returns a list of mapped things).
    map {|n|
      if n.kind_of?(Enumerable) then
        if (n.first == :fcall) and (n[1] == :in_namespace)
          # Possibly change current_version here
          # Remove the per-namespace version, if present
          if (n.has_version) then
            current_version = n.version_from_ns
            next n.strip_version_from_ns
          end
        elsif (n.first == :fcall) and (n[1] == :define)
          # Inject the version, if there isn't one already
          if (not n.has_version) then
            next n.add_version_to_object(current_version)
          end
        end
      end
      n
    }
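  end
end

# z is a zipper over the sexp parsed above, and default_version is the
# version given to definitions outside any versioned namespace.
puts SexpWriter.new.visit(z.version_all_objects(default_version).root)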

which yields the desired, altered, model definition:

in_namespace('Outer') {
  in_namespace('Inner') {
    define('An Inner Object', :version => "1.0") {
      field 'name', :string
      field 'value', :integer
    }
  }
  define('An Outer Object', :version => "2.0") {
    field 'name', :string
    field 'wrapping', 'Inner.An Inner Object'
  }
}
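
The helper methods (has_version, version_from_ns and friends) aren't shown above. As a rough idea of what they have to do, here is a minimal sketch that works on plain nested Arrays shaped like the S-expression, rather than on Sexp objects, and uses ordinary recursion in place of the zipper traversal. Every name in it is hypothetical and not part of zipr or the original script. It also passes the current version down as an argument, so a namespace's version stops applying once the traversal leaves that namespace.

```ruby
# Hypothetical stand-ins for the helpers; plain Arrays stand in for Sexp.
VERSION_KEY = [:lit, :version].freeze

# The [:hash, ...] argument of an :fcall node, or nil when there is none.
def options_hash(fcall)
  (fcall[2] || []).drop(1).find { |arg| arg.is_a?(Array) && arg.first == :hash }
end

# Key/value pairs of the options hash, as [[key, value], ...].
def option_pairs(fcall)
  hash = options_hash(fcall)
  hash ? hash.drop(1).each_slice(2).to_a : []
end

def has_version?(fcall)
  option_pairs(fcall).any? { |key, _| key == VERSION_KEY }
end

def version_of(fcall)
  option_pairs(fcall).find { |key, _| key == VERSION_KEY }.last[1]
end

# Rebuild the call with the given option pairs (no hash at all if empty).
def with_options(fcall, pairs)
  hash = options_hash(fcall)
  args = (fcall[2] || [:array]).reject { |arg| hash && arg.equal?(hash) }
  args += [[:hash, *pairs.flatten(1)]] unless pairs.empty?
  [fcall[0], fcall[1], args]
end

def strip_version(fcall)
  with_options(fcall, option_pairs(fcall).reject { |key, _| key == VERSION_KEY })
end

def add_version(fcall, version)
  with_options(fcall, option_pairs(fcall) + [[VERSION_KEY, [:str, version]]])
end

# Returns a tree of the same shape with versions moved from namespaces
# onto definitions. Only calls that take a block (:iter nodes) are handled.
def version_all(node, current)
  return node unless node.is_a?(Array)
  if node[0] == :iter && node[1].is_a?(Array) && node[1][0] == :fcall
    call = node[1]
    case call[1]
    when :in_namespace
      if has_version?(call)
        current = version_of(call)
        call = strip_version(call)
      end
    when :define
      call = add_version(call, current) unless has_version?(call)
    end
    body = node.drop(3).map { |child| version_all(child, current) }
    return [:iter, call, node[2], *body]
  end
  node.map { |child| version_all(child, current) }
end

# A cut-down version of the model from the post.
TREE =
  [:iter,
   [:fcall, :in_namespace,
    [:array, [:str, "Outer"], [:hash, [:lit, :version], [:str, "1.0"]]]],
   nil,
   [:block,
    [:iter,
     [:fcall, :define, [:array, [:str, "An Inner Object"]]],
     nil,
     [:block, [:fcall, :field, [:array, [:str, "name"], [:lit, :string]]]]],
    [:iter,
     [:fcall, :define,
      [:array, [:str, "An Outer Object"],
       [:hash, [:lit, :version], [:str, "2.0"]]]],
     nil,
     [:block, [:fcall, :field, [:array, [:str, "name"], [:lit, :string]]]]]]]

require 'pp'
pp version_all(TREE, "0.1")
```

The second argument plays the role of default_version: it is only used for definitions that sit outside every versioned namespace.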

Addendum: what a context’s path means

In the process of writing the above, I stumbled across some bugs in the zipr library. First, I realised that a traversal of a complicated tree would have multiple downs and ups. That caused a problem because originally “rooting” the zipper – committing all the edits to build your new structure – would detect that an edit had occurred and invoke a special “root a changed structure” path. If one went up after an edit, this “something has changed” information got lost. The second issue turned out to be another instance of the same basic problem.

So we know a zipper has two parts, namely a one-hole context, and a value to plug that hole. We also know that a one-hole context is made up of a few parts: nodes to the left of the current focus point, nodes to the right, and so on. One of these pieces of information is the path from the current focus back to the root of the structure – the trail of breadcrumbs, as Learn You a Haskell calls it. Suppose we’ve just edited a node, and our path consists of a sequence of nodes a₀, a₁, …, aₙ. If we move up, to node aₙ, the context around that node has not changed. If we root the structure, the rooting process won’t know that something’s changed, and we’ve lost our edit. Suppose instead we move to the left or right. Our parent node is still aₙ, but note that that path does not include the changed node. Again, rooting the structure does not pass through any context that says “hey, wait a minute, something’s changed”, and again our edit is lost!
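
One way to avoid losing the edit is to keep a “changed” flag in the context and hand it on whenever the focus moves. The following is a minimal sketch of that idea on nested Arrays. It is not zipr's actual implementation, and TinyZipper, Context and all the method names are invented for illustration.

```ruby
# A deliberately tiny zipper over nested Arrays (not the zipr API).
# A Context holds the siblings either side of the focus, the parent node
# and its own context, and a flag saying whether anything was edited.
Context = Struct.new(:left, :right, :parent_value, :parent_context, :changed)

class TinyZipper
  attr_reader :value, :context

  def initialize(value, context = nil)
    @value = value
    @context = context
  end

  # Focus on the first child of the current node (assumed to be an Array).
  def down
    first, *rest = value
    TinyZipper.new(first, Context.new([], rest, value, context, false))
  end

  # Focus on the next sibling. The changed flag travels with us, so an
  # edit made to an earlier sibling is not forgotten.
  def right
    c = context
    nxt, *rest = c.right
    TinyZipper.new(nxt, Context.new(c.left + [value], rest,
                                    c.parent_value, c.parent_context, c.changed))
  end

  # Swap the focused value and mark the context as changed.
  def replace(new_value)
    c = context
    return TinyZipper.new(new_value, nil) if c.nil?
    TinyZipper.new(new_value, Context.new(c.left, c.right,
                                          c.parent_value, c.parent_context, true))
  end

  # Move to the parent. If nothing changed, reuse the old parent node;
  # otherwise rebuild it and mark the parent's context as changed too.
  def up
    c = context
    return TinyZipper.new(c.parent_value, c.parent_context) unless c.changed
    rebuilt = c.left + [value] + c.right
    pc = c.parent_context
    pc = Context.new(pc.left, pc.right, pc.parent_value, pc.parent_context, true) if pc
    TinyZipper.new(rebuilt, pc)
  end

  # Walk back to the top and return the (possibly rebuilt) structure.
  def root
    z = self
    z = z.up while z.context
    z.value
  end
end

tree = [1, [2, 3], 4]
edited = TinyZipper.new(tree)
                   .down.right      # focus on [2, 3]
                   .down            # focus on 2
                   .replace(20)     # edit
                   .right           # move sideways after the edit
                   .up.right        # climb, then move sideways again
                   .root
p edited  # => [1, [20, 3], 4]
```

Both failure modes described above correspond to a place where the flag could be dropped: in up, if the parent's context were passed along untouched, and in right (or a matching left), if the new context were built with the flag reset.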

by Frank Shearar on 27/02/12