technology from back to front

Clojure to Smalltalk translation notes

Clojure has an interesting implementation of a Huet-style zipper. I started translating it to Smalltalk, and in the process discovered a number of things not really related to zippers. Given that the end result ends up looking very similar to something we’ve already seen , let’s talk more about the translation process itself.

1. Abstract binding

Destructuring is a way of dividing up some kind of structure. In Prolog one might say

[H|T] = [1,2,3]

to split up a list into a head

H

(unifying with

1

) and a tail

T

(unifying with

[2,3]

). Destructuring’s at the core of Scala’s pattern matching (through the use of the Extractor pattern and some compiler magic). Clojure has a number of very handy destructuring tools. Why do we care about handy destructuring? Consider the following two snippets of code:

    (let [[c & cnext :as cs] (children loc)] ...)

Which in Smalltalk becomes the more verbose, and less readable:

    | c cnext cs newPath |
    cs := self children: node.
    c := cs first.
    cnext := cs allButFirst.
    ...

When one takes into account the fact that Clojure functions quite happily consume

nil

, and how empty lists behave, a precise duplication of the

let

becomes something as… inelegant as:

    | c cnext cs newPath |
    cs := self children: node.
    cs notNil and: [cs notEmpty]
        ifTrue: [
            c := cs first.
            cnext := cs allButFirst]
        ifFalse: [
            c := nil.
            cnext := nil]
    ...

But perhaps there’s a way to leverage unification to bring the same kind of brevity to Smalltalk…

2. Pluggable behaviours to traverse arbitrary hierarchical structures

zip.clj attaches the zipper data – how one got to a node, what’s changed during the traversal, examining the target data structure – to the nodes of that structure with Clojure’s metadata. We pass in details specific to the structure as lambdas. We only need three things from the structure:
  1. Does this node have children? (or: is this node a branching point?)
  2. What are this node’s children?
  3. How can we construct a new node?
If we revisit the

TreeZipper

, we see the same three questions answered in a different way:

  1. This node has children if

    focus children notEmpty

    is true.

  2. The node’s children are returned by

    focus children

    .

  3. We construct a new node by invoking

    TreeZipper >> #newFocusOn:children:

    .

To put it another way: we accomplish the same thing – abstracting the details of a structure – by passing in lambdas in Clojure, and by subclassing and overriding in Smalltalk.

3. Dictionary: world’s most useful data structure?

A colleague of mine reckons that a dictionary’s the world’s most useful structure. After looking at zip.clj I think I see why: zip.clj’s dictionaries map names to values in the same manner an object does… except you don’t need to define a class before you start playing around, and it’s trivial to add new fields. (There’s a trade-off here: using a dictionary means never having to add yet another property to your data object – rapid prototyping – while using a data object means never having to trawl source code wondering what properties you’re using – self-documenting source.)

One of the really neat features of a Clojure dictionary is the ease with which one accesses values. Want the right context of your zipper’s current node? All you need is

(:r (loc 1))

. If there’s no such key, you get

nil

.

It’s possible, with a caveat, to duplicate a Clojure-like dictionary in Smalltalk. In particular, we want something such that

    CljDictTest >> testArbitraryNamesReturnNil
        self assert: nil equals: CljDict new foo.

    CljDictTest >> testArbitraryNamesTurnIntoMaps
        | dict |
        dict := CljDict new.
        dict foo: 1.
        self assert: 1 equals: dict foo.

And this is all you need:

    Object subclass: #CljDict
        instanceVariableNames: 'dict'

    CljDict >> initialize
        dict := Dictionary new.

    CljDict >> doesNotUnderstand: aMessage
        ^ aMessage numArgs
            caseOf: {
                [0] -> [self get: aMessage selector].
                [1] -> [self set: aMessage selector to: aMessage arguments first].}
            otherwise: [super doesNotUnderstand: aMessage].

    CljDict >> get: aSymbol
        ^ dict at: aSymbol ifAbsent: [nil].

    CljDict >> set: aSymbol to: anObject
        "aSymbol is the name of a setter method."
        ^ dict at: aSymbol allButLast put: anObject.

As you can see, the caveat is this: we subclass

Object

, and

Object

has a fairly extensive vocabulary. Attempting to use a key that matches a selector in

Object

’s vocabulary will cause surprising behaviour, in that you will see the result of

Object

doing something, not simply the access of a value (through

#doesNotUnderstand:

). We could subclass

ProtoObject

, but usually we only do that when we absolutely have to.

As it turns out, the dictionary in zip.clj maps isomorphically to a

TreeZipperNode

:

Datum zip.clj dictionary TreeZipperNode
Left context

:l

#leftNodes

Right context

:r

#rightNodes

History of visited nodes

:pnodes

#value

[1]

Previous context

:ppath

#path

[1] Really, you need to collect the nodes yourself by recursing over the chained

TreeZipperNode

s.

Did I say isomorphically? zip.clj does use another key,

:changed?

. As it turns out, this reveals a bug in

TreeZipper

! The assertion

zipper root == zipper down up root

will fail, because

TreeZipper

will blindly create a new node, even for a node that hasn’t changed! And it’s by storing the previously visited nodes in

:pnodes

that zip.clj can traverse a structure and re-root, returning the identical structure.

by
Frank Shearar
on
18/10/11
 
 
2000-13 LShift Ltd, 1st Floor Office, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK +44 (0)20 7729 7060