Implementing maps and folds using Zippers
Zippers continue to fascinate me. Let’s recap a bit: a zipper is a data structure representing the navigation and mutation of an otherwise immutable structure. We’ve looked at several different implementations here on the LShift blog, in several different languages.
Today, we’re going to see what else we can do with zippers.
When we implement a Huet-style zipper (in other words, we use a recursive structure based on one-hole contexts instead of directly using partial continuations), we implement a bunch of primitive behaviours – left, right, replace, insert child, and so on. Let’s assume we have all that.
The first obvious thing to do is to define a traversal. We need two essential elements for a traversal – a means of knowing when we’re done, and a means of moving to the next element. Now there’s a teeny hitch. Given a zipper over some hierarchical structure, it seems most natural to describe our traversal in terms of recursion – after all, we’re working with a recursive structure. Ruby doesn’t require proper tail call elimination (even though, apparently, some implementations support it). We don’t want stack overflows from too-deep recursion, so we’ll reach into our toolbox and haul out a trampoline:
def trampoline(initial_value, &unary_block)
result = unary_block.call(initial_value)
while result.kind_of?(Proc) do
result = unary_block.call(result.call)
end
result
end
A trampoline will take some block, and invoke it with the given initial value. If the block yields a
Proc
(what the Lispers would call a thunk – a parameterless
Proc
that delays the evaluation of some chunk of code), it will invoke the block with this
Proc
, and repeat that process until the block yields something that’s not a
Proc
. Finally, it will return that non-
Proc
value. With tool in hand, we can now write a recursive pre-order traversal, confident that we won’t blow our stack even on large structures. Note that this particular zipper uses “safe” navigation, wrapping up results in the Either monad.
# Perform a pre-order traversal, that is "this node, then a pre-order
# traversal of my children, left to right".
def next
if not has_next? then
return @zipper
end
if @zipper.branch?(@zipper.value) then
return @zipper.down
end
right_sibling = @zipper.safe_right
if right_sibling.right? then
return right_sibling.value
end
# Return a Zipper if there's a next, with a distinguishable context
# for the last element of the traversal.
# This algorithm returns a thunk when it wishes to recurse.
# The trampoline converts this CPS-like algorithm into one
# that runs in constant space.
trampoline(@zipper) { |z|
parent = z.safe_up
parent.either(->parent_z{
uncle = parent_z.safe_right
uncle.either(->z{ next z},
->unused_error{ next ->{parent_z}}) # Recur
},
->unused_error{
# We've popped up the structure all the way to the root node.
z.new_zipper(z.value, EndOfTraversalContext.new(z.context))
})
}
end
It’s easy to wrap up
next
inside some object, and simply invoke the entire traversal with
each
:
it "should process all nodes in a pre-order fashion" do
tree = Tree.new(1, [Tree.new(2, [Tree.new(3, [])]),
Tree.new(4, [Tree.new(5, []),
Tree.new(6, [])])])
t = PreOrderTraversal.new(tree.zipper)
answers = []
t.each { |value|
answers << value.value
}
answers.should == [1, 2, 3, 4, 5, 6]
end
where
each
is defined thusly:
def each(&block)
map { |node|
block.call(node)
node
}
end
# Return a same-shaped structure with the relevant mapping performed on
# each node.
def map(&unary_block)
# It's ridiculous to store the previous zipper to avoid a one-past-the-end
# error. It works, and it's simple, but it's _ugly_.
prev = @zipper
while has_next? do
@zipper = @zipper.replace(unary_block.call(@zipper.value))
prev = @zipper
@zipper = self.next
end
prev
end
We could just mix in
Enumerable
and get
map
and a whole lot more For Free… but
Enumerable#map
returns an
Array
, and I want
map
to return something of whatever type it was fed.
Of course there’s nothing stopping us from implementing other traversals – post-order, in-order (for binary trees), or even breadth-first. What’s nice is that none of these traversal strategies have any knowledge about the structure over which they move! They just know about the zipper… and the zipper knows nothing about the structure, except
- Can this node have children?
- What are this node’s children?
- Given a parent node and children nodes, how do I make a new node?
With a traversal in hand, we’re free to jump into the land of higher order functions – map, fold, and the like. Now of course, as we just saw, these traversals know nothing about the underlying structure. That means that while we can implement a fold -
# Collapse some structure into some kind of value using an initial value,
# and a binary block taking the thus-far-computed value (accumulator) and
# the current node.
def fold(initial_value, &binary_block)
accumulator = initial_value
each { |node|
accumulator = binary_block.call(accumulator, @zipper.value)
}
accumulator
end
- we will obviously have to tell the fold what to do. Actually, this is no different from your usual fold in Common Lisp or Smalltalk or Haskell. So given some kind of tagged tree with
Node
and
Leaf
elements, we could have
it "should permit the folding of a structure according to a given block" do
t = Node.new(:root, [Node.new(:left_subchild, [Leaf.new(1)]), Leaf.new(2)])
PreOrderTraversal.new(t.zipper).fold(0) { |sum, node|
sum + case node
when Node then 0
when Leaf then node.value
end
}.should == 3
end
So here we see how to apply all manner of transformations – mapping a tree to another tree of the same shape, folding a tree, insert our fondest desire – with the core mechanisms remaining firmly decoupled from our types.
