technology from back to front

“Duck-finding” for testing your Theories

A while ago I wrote a semi-port of Haskell’s QuickCheck. Easy enough – a property is like a test method but with arity 1, into which you inject data – potential counterexamples to your theory. In Haskell, the type system can, through unification, figure out the type of the generator required for that property. What to do in a dynamic language?

There are a number of type inference techniques for dynamic languages – k-CFA, demand-driven type inferencing with subgoal pruning, RoelTyper. I’m going to use a very simple technique.

First, some terminology. In Smalltalk, “protocol” usually means one of two things: either “what messages does this object understand?” or “does this object understand the Foo protocol?”, where Foo might be “Stream”, or “Collection”. We’re going to use the latter meaning. In particular, given a JUnit-like theory, we want an answer to the question “What objects – what instances of what classes – satisfy the protocol sent to the datum injected into this theory?”

With a decompiler to hand, it’s easy enough to generate an AST of the theory over which we can walk: walk the MessageNodes and look for things with a receiver 't1' which will be the name given to the first temporary variable, i.e., the argument of the unary method. This won’t work for anything hidden through a #perform: – think of this as the eval of Smalltalk code.

ParseNodeVisitor subclass: #SenderToArgCollector
instanceVariableNames: 'selectors classSelectors'
classVariableNames: ''
poolDictionaries: ''
category: 'SqueakCheck-SUnit'

visitMethodNode: aMethodNode
    classSelectors := Set new.
    selectors := Set new.
    ^ super visitMethodNode: aMethodNode.

visitMessageNode: aMessageNode
    (aMessageNode receiver name = 't1')
        ifTrue: [selectors add: aMessageNode selector key].
    (aMessageNode receiver isMessageNode
        and: [aMessageNode receiver selector key = #class]
        and: [aMessageNode receiver receiver name = 't1'])
        ifTrue: [classSelectors add: aMessageNode selector key].
    ^ super visitMessageNode: aMessageNode.

    ^ selectors.

    ^ classSelectors.

and invoked by our TheoryTyper (which could be called a DuckFinder…)

messagesSentToDatum: aUnaryCompiledMethod
    "Answer a pair of Sets of all the message selectors sent by this method to its
    argument. The first Set contains messages sent to the argument, and the
    second contains messages sent to the argument's class."

    | collector |
    collector := SenderToArgCollector new
        visitMethodNode: (Decompiler new
            decompile: aUnaryCompiledMethod selector
            in: aUnaryCompiledMethod methodClass).

    ^ {collector selectors. collector classSelectors}.

Let’s try out our new toy. To recap, we wish to write a theory and have the system automatically find the right types of things to test the theory. So let’s try the “monadic laws”:

Object subclass: #TheoryTyper
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'SqueakCheck-SUnit'

monadsObeyLeftIdentity: m
        assert: (m class return: m value) >>= [:t | m class return: t]
        equals: ([:t | m class return: t] value: m value)

monadsObeyRightIdentity: m
    self assert: m equals: (m >>= [:a | m class return: a])

And lo! our duck-finder says, with the Maybe and Either monads loaded:

TheoryTyper new typeOfDatum: (MonadTheories >> #monadsObeyLeftIdentity:)
"=> a Set(Maybe Either)"

The keen-eyed will notice a law missing from the above – the associativity law for monads. Given monadic blocks f and g – that is, unary blocks that take some value and return a value wrapped up in whatever monad you’re using – we can express this law as

monadsObeyAssociativity: m
| f g |

    assert: ((m >>= f) >>= g)
    equals: (m >>= [:x | (f value: x) >>= g])

but from where do we get f and g? Our naive duck-finding fails: we would need to extend our “type inference”. One possible (and fairly ugly) solution is to make m‘s class responsible through helpers: add sampleBlockF and sampleBlockG messages to the protocol and we could write:

monadsObeyAssociativity: m
| f g |

f := m class sampleBlockF.
g := m class sampleBlockG.
    assert: ((m >>= f) >>= g)
    equals: (m >>= [:x | (f value: x) >>= g])

Also, typing m is a bit harder than in the previous examples:

TheoryTyper new typeOfDatum:
    (MonadTheories >> #monadsObeyAssociativity:)
"=> a Set(Either Nothing Maybe Left Right Just)"

because all we have to work with is the send of >>=. Arguably it would be clearer to return a Set(Either Maybe) because the other classes are subclasses of these two.

However, despite the limitations of this technique, one can express theories in a nicely modular way. The monadic laws will simply run against any monadic classes (that is, any classes that understand #>>= on the instance side and return: on the class side) present in the image.

Frank Shearar

8 + = thirteen

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us