When writing tests for an application which involves a database, one of the first things you need to do is ensure sure that all of database integration tests start from a known state. Now, understandably, there are already libraries that can do this for us, such as theÂ Database CleanerÂ gem. However, when we started the project, I didn’t really see the value in importing a ~1000 line library when about 10 or so lines of Ruby would do the trick, and result in a helper which was quickly comprehensible without having to hunt down the defining Gem.
Preferences class provides a common place for all parts of a Squeak Smalltalk image to register their switches: Which update stream do we want to follow? What colour do we want our Browsers? Do we allow assignments to block parameters? Do we allow underscores in selector names? Preferences range from low level things that affect the language’s grammar, all the way to trivial things.
In the old days we would create a preference by adding a getter/setter pair that would expose a key-value pair in a dictionary. That sounds great. It’s certainly simple. It’s only when a system evolves organically over decades that we realise the trap: such a dictionary is a giant chain, coupling together otherwise independent packages that simply wish to store something configurable. Even though we can store the preference accessors within each package, the packages still share the backing dictionary. It’s a recipe for trouble. Let’s see how we can untangle the chain.
When playing with a new bit of language, it can be helpful to restrict the problem space to an old, well understood algorithm. For me at least, learning one thing at a time is easier! For this post, It’ll be prime sieves, and I’ll be exploring clojure reducers.
A quick recap, the sieve of eratosthenes is a not-maximally-non-optimal way of finding primes. It’s usually expressed as follows:
To find primes below n:
generate a list of n integers greater than 1
while the list is not empty:
take the head of the list and:
add it to the output
remove all numbers evenly divisible by it from the list
In clojure, something like:
([n] (sieve  (range 2 n)))
(if-let [prime (first xs)]
(recur (conj primes prime) (remove #(zero? (mod % prime)) xs))
;= [2 3 5 7]
Which is fine, but I’d like it lazy so I only pay for what I use, and I can use as much as I’m willing to pay for. Let’s look at lazy sequences. Luckily for us, there is an example of exactly this on the lazy-seq documentation, which we slightly modify like so:
(defn lazy-sieve [s]
(cons (first s)
(lazy-seq (lazy-sieve (remove #(zero? (mod % (first s))) (rest s))))))
(defn primes 
(lazy-seq (lazy-sieve (iterate inc 2))))
(take 5 (primes))
;= (2 3 5 7)
So now we have a nice generic source of primes that grows only as we take more. But is there another way?
A few months ago Rich Hickey introduced reducers. By turning the concept of ‘reducing’ inside out the new framework allows a parallel reduce (fold) in some circumstances. Which doesn’t apply here. But let’s see if we can build a different form of sieve using the new framework. First a quick overview (cribbing from the original blog post):
Collections are now ‘reducible’, in that they implement a reduce protocol. Filter, map, etc are implemented as functions that can be applied by a reducible to itself to return another reducible, but lazily, and possibly in parallel. So in the example below we have a reducible (a vector), that maps inc to itself to return a reducible that is then wrapped with a filter on even? which returns a further reducible, that reduce then collects with +.
(require '[clojure.core.reducers :as r])
We’ll be referring to r here and there – just remember it’s the clojure.core.reducers namespace
(reduce + (r/filter even? (r/map inc [1 1 1 2])))
These are composable, so we can build ‘recipes’.
;;red is a reducer awaiting a collection
(def red (comp (r/filter even?) (r/map inc)))
(reduce + (red [1 1 1 2]))
into uses reduce internally, so we can use it to build collections instead of reducing:
(into  (r/filter even? (r/map inc [1 1 1 2])))
;= [2 2 2]
So here’s the core of ‘reducer’, which “Given a reducible collection, and a transformation function xf, returns a reducible collection, where any supplied reducing fn will be transformed by xf. xf is a function of reducing fn to reducing fn.”
(coll-reduce [_ f1 init]
(clojure.core.protocols/coll-reduce coll (xf f1) init)))))
And we can then use that to implement mapping as so:
(defn mapping [f]
(fn [result input]
(f1 result (f input)))))
(defn rmap [f coll]
(reducer coll (mapping f)))
(reduce + 0 (rmap inc [1 2 3 4]))
Fine. So what about sieves? One thought is we could build up a list of composed filters, built as new primes are found (see the lazy-seq example above). But there’s no obvious place to do the building, as applying the reducing functions is left to the reducible implementation. Another possibility is to introduce a new type of reducing function, the ‘progressive-filter’, which keeps track of past finds and can filter against them.
(defn prog-filter [f]
(let [flt (atom )]
(fn [result input]
(if (not-any? #(f input %) @flt)
(swap! flt conj input)
(f1 result input))
(defn progressive-filter [f coll]
(reducer coll (prog-filter f)))
And we then reduce with a filtering function that is a function of the current candidate and one of the list of found primes (see the #(f input %) bit above)
(into  (progressive-filter #(zero? (mod %1 %2)) (range 2 10))
;= [2 3 5 7]
It’s nicely lazy, so we can use iterate to generate integers, and take only a few (r/take, as it’s operating on a reducer):
(into  (r/take 5 (progressive-filter #(zero? (mod %1 %2)) (iterate inc 2))))
;= [2 3 5 7 11]
(def primes (progressive-filter #(zero? (mod %1 %2)) (iterate inc 2)))
(into  (r/take 5 primes))
;= [2 3 5 7 11]
You get the idea.
Package dependencies appear in many ways, some more surprising than others. Let’s see what we’ve recently dug up in the trenches.
Recently I’ve started playing with Haskell a little more seriously, and I’ve been toying with the idea of using it to calculate approximated percentiles over streams of numerical data, as found in the histograms from Coda Hale’s well known metrics library. The first step in this process is picking out a representative sample of the input data, as described in the paper Random Sampling with a Reservoir. But in the course of writing the code, I learnt a couple of things that seemed to be worth sharing.
We all know GitHub is a fantastic way to collaborate on software. It has a fairly basic issue tracker, but it serves most purposes well enough. But if your code base is spread across several repositories, it can be difficult to know how you’re doing, from a management perspective. Octoherder can help with that.
NuGet is a package manager for the .NET world. At its core is an API that will take a set of assemblies that you are interested in and download any that are new to a specified folder. There is then a Visual Studio plug-in that uses a convention-based approach to adding new packages to your project. It creates a folder called “packages” in your solution, and a packages.config file in each project that makes use of NuGet. When you use the GUI to add a new package to a project, it updates the config file, and then pulls down the binaries and adds them to the project.
And of course, if the binaries are missing at build time it looks at that config file and pulls them down again. Wait, it doesn’t do that? Not only does it not do that, it’s not at all clear (to me at least) how to even trigger a refresh of the binaries manually.
This is because using NuGet and then committing the binaries to your source control is the developers’ recommended way of using NuGet. Which is Not Good.
Anyway, it’s an easy fix. Add this line to the “Pre-build event command line” for each project that makes use of NuGet. (Oh, and make sure that NuGet is on your PATH)
And you’re away!
The “Pre-build event command line” option can be found in project properties in the “Build Events” tab.
â€¦ because “clicking a button” should not exist in your domain language.
When we write tests, we want to be able to write them in a way which is relevant to the problem the application (or sub-component) solves, rather than how it is implemented. So, if you get an urge to reimplement natural numbers, then you want to avoid testing them in terms of the objects under the hood for several reasons. A big one being that the tests will be coupled to the implementation, meaning that they will likely break if you change how the code works.