The algebraic origins of fold and unfold
June 5th, 2006 francis
A while back, an email from a collaborator proposed a coalgebraic view of the grammar for a new process calculus. This prompted me to do some reading up on algebras versus coalgebras - something I had been meaning to do for a long while.
Luckily, there is a very good tutorial by Bart Jacobs and Jan Rutten. At 38 pages, it’s a bit intimidating but it’s very accessible for the practicing computer scientist (essentially, exactly what it says in its introduction). The examples are helpful but it still deserves several read-throughs to really get the most out of it – each time I’ve scanned through it, I’ve found something new!
So what’s it about? For a few decades now, computer scientists have been familiar with tools such as BNF for defining language grammars, and
- defining functions over such grammars by cases, and
- proving their properties by structural induction.
Viewed algebraicly, the BNF grammar induces a functor T, for which
there exists an initial algebra a: T X --> X. Initiality
of a gives us the fold function, which allows us to define
functions by cases, as well as the proof-by-induction proof
principle. These algebraic concepts are, in general, better known
than the coalgebraic ones.
In this article, I’d like to share with you, a concrete illustration
of the first of the two points above, namely fold, and its dual
unfold.
Let’s turn to a concrete example: lists over int. In this case, we
have the grammar
list ::= nil | cons(n, list)
This induces a functor T defined by T X = 1 + int * X, where 1
denotes the singleton set, + denotes disjoint union and * denotes
cartesian product.
An algebra for this T (also called a T-algebra) is a carrier set
X with function a: 1 + int * X --> X. For the remainder of this
example, for convenience, we’ll work with the equivalent pair of
functions of types 1 --> X and int * X --> X.
The functional programmers amongst you will instantly recognise these
type signatures and think of the algebra X = list, the type of
integer lists, with functions nil: 1 --> X and cons: int * X -->
X. In fact, list, nil and cons make up the initial algebra for
this T.
Let’s consider another algebra int, knil, kons:
knil: 1 --> int
kons: int * int --> int
Since nil, cons is an initial algebra, and knil, kons is also
an algebra, by initiality, there is a unique function f of type
list --> int which makes the following diagram commute.
T f
T list = 1 + int * list --------------> 1 + int * int = T int
| |
| |
| nil,cons | knil,kons
| |
| |
V V
list ---------------------> int
f
This f is usually called fold knil kons. (This fold is actually
fold-right rather than fold-left.)
Now if it so happens that
knil = 0
kons x y = x + y
then fold knil kons is precisely what you’d expect, namely the list
sum function of type list --> int. Initiality allows us to define
list sum by cases, aka structural recursion: we give the nil case
and the cons case.
A coalgebra is a carrier set X with a function c of type X --> 1
+ int * X, i.e. X --> T X (note that the T has moved to the other
side of the arrow). Function c takes an element of X and returns
either something in the left of the disjoint union (i.e. the element
in 1) or something in the right (i.e. a pair in int * X). One
intuition is to think of c as a destructor — it takes apart
elements from X.
The function c can equivalently be considered as three functions p:
X --> bool, f: X --> int and g: X --> X related to c as
follows: if p x is true then c maps x to the element in the left
of the disjoint union; if p x is false then c maps x to the pair
(f x, g x) in the right. We assume without loss of generality that
p,f,g is equivalent to c.
The final coalgebra (which is the dual of initial algebra) for our
example is the type of possibly infinite lists, stream, and the
triple isnil?,head,tail:
isnil?: stream --> bool
head: stream --> int
tail: stream --> stream
Now consider nat, the type of natural numbers, as a coalgebra with
functions iszero?, odd, rshift:
iszero?: nat --> bool
odd: nat --> int
rshift: nat --> nat
Finality of isnil?, head and tail means there is a unique
function f of type nat --> stream which makes the following
diagram commute.
f
nat ---------------------> stream
| |
| |
| iszero?,odd,rshift | isnil?,head,tail
| |
| |
V V
T nat = 1 + int * nat ------------> 1 + int * stream = T stream
T f
The function denoted by the top horizontal arrow in the commuting
square above is usually called unfold iszero? odd rshift.
Now if it so happens that odd returns 1 on odd inputs and 0
otherwise, and rshift returns the binary right shift of its input,
then unfold iszero? odd rshift is the function which takes a natural
number and returns its binary digits in reverse (but note that 0
gives the empty list). (Compare this example to Tony’s base
conversion code.)
So wrapping up, we’ve illustrated the origins of fold and unfold
for integer lists from the algebraic perspective. It is possible to
generalise this example and thus derive fold and unfold for other
recursive data types such as binary trees.
Entry Filed under: Technology
Leave a Comment
Some HTML allowed:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>
Trackback this post | Subscribe to the comments via RSS Feed