technology from back to front

Implementing maps and folds using Zippers

Zippers continue to fascinate me. Let’s recap a bit: a zipper is a data structure representing the navigation and mutation of an otherwise immutable structure. We’ve looked at several different implementations here on the LShift blog, in several different languages.

Today, we’re going to see what else we can do with zippers.

When we implement a Huet-style zipper (in other words, we use a recursive structure based on one-hole contexts instead of directly using partial continuations), we implement a bunch of primitive behaviours – left, right, replace, insert child, and so on. Let’s assume we have all that.

The first obvious thing to do is to define a traversal. We need two essential elements for a traversal – a means of knowing when we’re done, and a means of moving to the next element. Now there’s a teeny hitch. Given a zipper over some hierarchical structure, it seems most natural to describe our traversal in terms of recursion – after all, we’re working with a recursive structure. Ruby doesn’t require proper tail call elimination (even though, apparently, some implementations support it). We don’t want stack overflows from too-deep recursion, so we’ll reach into our toolbox and haul out a trampoline:

    def trampoline(initial_value, &unary_block)
      result = unary_block.call(initial_value)
      while result.kind_of?(Proc) do
        result = unary_block.call(result.call)
      end
      result
    end

A trampoline will take some block, and invoke it with the given initial value. If the block yields a Proc (what the Lispers would call a thunk – a parameterless Proc that delays the evaluation of some chunk of code), it will invoke the block with this Proc, and repeat that process until the block yields something that’s not a Proc. Finally, it will return that non-Proc value. With tool in hand, we can now write a recursive pre-order traversal, confident that we won’t blow our stack even on large structures. Note that this particular zipper uses “safe” navigation, wrapping up results in the Either monad.

    # Perform a pre-order traversal, that is "this node, then a pre-order
    # traversal of my children, left to right".
    def next
      if not has_next? then
        return @zipper
      end

      if @zipper.branch?(@zipper.value) then
        return @zipper.down
      end

      right_sibling = @zipper.safe_right
      if right_sibling.right? then
        return right_sibling.value
      end

      # Return a Zipper if there's a next, with a distinguishable context
      # for the last element of the traversal.
      # This algorithm returns a thunk when it wishes to recurse.
      # The trampoline converts this CPS-like algorithm into one
      # that runs in constant space.
      trampoline(@zipper) { |z|
        parent = z.safe_up
        parent.either(->parent_z{
                        uncle = parent_z.safe_right
                        uncle.either(->z{ next z},
                                     ->unused_error{ next ->{parent_z}}) # Recur
                      },
                      ->unused_error{
                        # We've popped up the structure all the way to the root node.
                        z.new_zipper(z.value, EndOfTraversalContext.new(z.context))
                      })
      }
    end

It’s easy to wrap up next inside some object, and simply invoke the entire traversal with each:

    it "should process all nodes in a pre-order fashion" do
      tree = Tree.new(1, [Tree.new(2, [Tree.new(3, [])]),
                          Tree.new(4, [Tree.new(5, []),
                                       Tree.new(6, [])])])
      t = PreOrderTraversal.new(tree.zipper)
      answers = []
      t.each { |value|
        answers << value.value
      }
      answers.should == [1, 2, 3, 4, 5, 6]
    end

where each is defined thusly:

    def each(&block)
      map { |node|
        block.call(node)
        node
      }
    end

    # Return a same-shaped structure with the relevant mapping performed on
    # each node.
    def map(&unary_block)
      # It's ridiculous to store the previous zipper to avoid a one-past-the-end
      # error. It works, and it's simple, but it's _ugly_.
      prev = @zipper
      while has_next? do
        @zipper = @zipper.replace(unary_block.call(@zipper.value))
        prev = @zipper
        @zipper = self.next
      end
      prev
    end

We could just mix in Enumerable and get map and a whole lot more For Free… but Enumerable#map returns an Array, and I want map to return something of whatever type it was fed.

Of course there’s nothing stopping us from implementing other traversals – post-order, in-order (for binary trees), or even breadth-first. What’s nice is that none of these traversal strategies have any knowledge about the structure over which they move! They just know about the zipper… and the zipper knows nothing about the structure, except

  • Can this node have children?
  • What are this node’s children?
  • Given a parent node and children nodes, how do I make a new node?

With a traversal in hand, we’re free to jump into the land of higher order functions – map, fold, and the like. Now of course, as we just saw, these traversals know nothing about the underlying structure. That means that while we can implement a fold -

    # Collapse some structure into some kind of value using an initial value,
    # and a binary block taking the thus-far-computed value (accumulator) and
    # the current node.
    def fold(initial_value, &amp;binary_block)
      accumulator = initial_value
      each { |node|
        accumulator = binary_block.call(accumulator, @zipper.value)
      }
      accumulator
    end

- we will obviously have to tell the fold what to do. Actually, this is no different from your usual fold in Common Lisp or Smalltalk or Haskell. So given some kind of tagged tree with Node and Leaf elements, we could have

    it "should permit the folding of a structure according to a given block" do
      t = Node.new(:root, [Node.new(:left_subchild, [Leaf.new(1)]), Leaf.new(2)])
      PreOrderTraversal.new(t.zipper).fold(0) { |sum, node|
        sum + case node
                when Node then 0
                when Leaf then node.value
              end
      }.should == 3
    end

So here we see how to apply all manner of transformations – mapping a tree to another tree of the same shape, folding a tree, insert our fondest desire – with the core mechanisms remaining firmly decoupled from our types.

by
Frank Shearar
on
23/12/11
 
 


1 + nine =

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us