technology from back to front

Hylomorphisms through lazy streams

A fold is a higher-order function familiar to many programmers. It’s a function that takes some structure as input, and some function that converts part of that structure into a value. Evaluating the fold then recursively applies the given function to the structure, resulting in some return value. Variants exist which supply an initial “seed” value, or just use the first element of the structure as seed. For instance, given a list of numbers #(1 2 3 4 5), and a folding function #+, #(1 2 3 4 5) inject: 0 into: #+ will calculate the sum of the numbers.

Perhaps less well known is the unfold, at which we are going to look today.

If you read the previous link you’d see the term “anamorphism”, described as the dual or complement or inverse of a “catamorphism”. Some folks like to talk fancy, I suppose. Unlike fold, which has a multitude of different non-Greek names (reduce, inject, and so on), I have yet to come across synonyms for unfold.

So. The name gives a clue: an unfold takes some value, and a function that converts a value of that type into some structure, and then evaluating the unfold recursively, well, unfolds the value into a structure. Ah, and there’s an important piece missing. With a fold, we recursively apply some function to a structure, and progressively fold it into a simpler/smaller structure, until it disappears and we have some value. There’s no such intrinsic “end here” for an unfold, so if we wish to end up with a finite structure, we need to supply one additional parameter: a predicate that says “OK, finished.” In summary, the parts we need to give an unfold are:

  1. A seed value,
  2. a function expanding a value into some structure,
  3. a function transforming the seed in some manner,
  4. a function telling us when we’re done.

Since I’ve been playing around with Michael Lucas-Smith and Martin Kobetic’s XTreams library on another project (about which I hope to blog in the near future), let’s use that as a starting point.

Verbs in the Smalltalk collection API follow an -ect pattern – collect, select, reject, inject, detect, etc. – so let’s follow in that grand old tradition. Given that we have to supply a next-seed and we’re-done predicate, #eject:nextSeed:until: Seems like a good name. Oh, and XTreams uses present participles rather than verbs to emphasise that we’re working with streams of data, so let’s use #ejecting:nextSeed:until:.

Name in hand, let’s go. First, we need to describe the behaviour we want to see: what should this new-fangled thing actually do?

"Turn a number into a (descending) list of numbers."
self assert: #(3 2 1 0)
    equals: (4
        ejecting: [:n | n - 1] "At each step, return the predecessor of the current seed."
         nextSeed: [:n | n - 1] "Count down"
         until: [:result | result = 0]) rest.

What’s that rest? If you recall, the last time we looked at Xtreams we saw that #injecting:into: returns a stream of values, showing us the interim results of the fold. If we only care about the final fold, we invoke #rest, which will run the computation to completion, and give us the final result in the last value of the stream.

Object >> ejecting: expansionBlock nextSeed: seedBlock until: predicateBlock
    "Return an XTReadStream that progressively turns self into some
     structure. expansionBlock tells us how to unfold one step, seedBlock
     tells us how to get the next seed value, and predicateBlock returns
     true when the unfolding has finished."

    ^ [:out | | seed |
        seed := self.
        [predicateBlock value: seed] whileFalse:
            [out put: (expansionBlock value: seed).
            seed := seedBlock value: seed]] reading.

Note that the above looks just like a normal do-while loop, yet the result is a stream. What gives? XTreams sees that the block takes an argument, and so instantiates an XTBlockClosureGenerateStream. This stream passes itself into the block – into the out parameter – and every time you write to it – out put: (Expansionblock value: seed) – it blocks execution of the loop and returns that value.

Unfold in hand, we can now do amazingly useful things like combining an unfold and a fold to transform an integer into its binary representation:

    ejecting: [:n | n // 2]
    nextSeed: [:n | n // 2]
    until: [:n | n < 1])
        collecting: [:n | (n \\ 2) printString])
            injecting: '' into: [:l :r | r , l]) rest. "Note that we PREPEND the bit."
"=> #('1' '01' '101' '0101' '10101' '010101')"

And that, if one wishes to amaze and confound your peers, is what we call a hylomorphism.

Frank Shearar

× seven = 7

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us