Clojure 1.2 has been released and Leiningen has moved up to 1.3.0 (or was it briefly 1.2.0?), which supports the latest version, so I decided to spend a spare afternoon playing with one of Clojure's new features.
One of the interesting new features is datatypes, which replace the structmap feature from earlier versions of Clojure. Datatypes created as records, like structmaps, can be used wherever a map is expected. Clojure provides an implementation of relational algebra in the clojure.set namespace that works on sets of maps. I expected these functions to work with datatypes, but I decided to check for myself.
I specifically wanted to check how closely you can approach an ideal relational system as defined by C. J. Date in his books Database in Depth and Databases, Types, and the Relational Model: The Third Manifesto. Date strongly believes that databases should be built on relational algebra and that none of the current relational databases achieve that goal. Database in Depth is a very readable and concise introduction to relational algebra and is well worth reading.
clojure.set contains functions that correspond to six fundamental operations in relational algebra: union, difference, rename, select, project and join. The first two, union and difference, are simple set operations that will definitely work on sets of datatypes. Let us try the other four operators on some datatypes. I’ll base my examples on a cut-down set of the relations Date uses to describe suppliers, parts and shipments.
A Clojure datatype is created using the defrecord macro. We will define a datatype for a supplier, a part and a shipment:
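A minimal sketch of what these definitions could look like. The field names are assumptions loosely modelled on Date's suppliers-and-parts example; only number and city on the part are mentioned in the text.

```clojure
;; Field names are illustrative assumptions, not the original listing.
(defrecord Supplier [number name status city])
(defrecord Part     [number name colour weight city])
(defrecord Shipment [supplier part quantity])
```

Each defrecord generates a class with a positional constructor and map-like access to its fields.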
We can create a supplier like this:
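As a sketch, assuming a Supplier record with the hypothetical fields number, name, status and city, the positional constructor takes the values in declared order:

```clojure
;; Assumed field names; the sample values are illustrative.
(defrecord Supplier [number name status city])

;; Arguments follow the order of the fields in the defrecord form.
(def smith (Supplier. "S1" "Smith" 20 "London"))

;; Fields are read with keywords, exactly as with a map.
(:name smith) ; => "Smith"
```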
We can then define our three relations like this:
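A relation can be modelled as a plain Clojure set of records. The sketch below assumes the hypothetical field names used for illustration here, and the sample tuples are loosely based on Date's data rather than taken from the original listing:

```clojure
;; Record fields and sample tuples are illustrative assumptions.
(defrecord Supplier [number name status city])
(defrecord Part     [number name colour weight city])
(defrecord Shipment [supplier part quantity])

(def suppliers
  #{(Supplier. "S1" "Smith" 20 "London")
    (Supplier. "S2" "Jones" 10 "Paris")
    (Supplier. "S3" "Blake" 30 "Paris")})

(def parts
  #{(Part. "P1" "Nut"  "Red"   12 "London")
    (Part. "P2" "Bolt" "Green" 17 "Paris")})

(def shipments
  #{(Shipment. "S1" "P1" 300)
    (Shipment. "S1" "P2" 200)
    (Shipment. "S2" "P1" 300)})
```

Because these are sets, duplicate tuples cannot occur, which matches the relational model.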
Now let us try the first relational operation, rename:
This results in a new relation, based on the parts relation, with the number and city fields renamed to id and location respectively. Its tuples are plain maps rather than instances of the datatype.
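A sketch of the call, assuming a Part record with hypothetical fields and two sample tuples. clojure.set/rename takes the relation and a map from old key to new key:

```clojure
(require '[clojure.set :as set])

;; Assumed record definition and sample data.
(defrecord Part [number name colour weight city])

(def parts
  #{(Part. "P1" "Nut"  "Red"   12 "London")
    (Part. "P2" "Bolt" "Green" 17 "Paris")})

;; Rename :number to :id and :city to :location in every tuple.
(def renamed-parts
  (set/rename parts {:number :id, :city :location}))

;; Each tuple is now a plain map, for example
;; {:id "P1", :name "Nut", :colour "Red", :weight 12, :location "London"}
```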
Now let us try a select operation:
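A sketch of a selection, assuming the same hypothetical Part record. The predicate used here (parts stored in London) is only an illustration; clojure.set/select takes the predicate first and the relation second:

```clojure
(require '[clojure.set :as set])

;; Assumed record definition and sample data.
(defrecord Part [number name colour weight city])

(def parts
  #{(Part. "P1" "Nut"  "Red"   12 "London")
    (Part. "P2" "Bolt" "Green" 17 "Paris")})

;; Keep only the tuples for which the predicate is true.
(def london-parts
  (set/select #(= (:city %) "London") parts))
```

select only removes tuples from the set, so the survivors are the original records, untouched.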
Note that in this case the set that has been returned contains instances of the datatype defined earlier, not simple maps.
Now let us try the project operation:
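A sketch of a projection, again assuming the hypothetical Part record. clojure.set/project takes the relation and a collection of keys to keep:

```clojure
(require '[clojure.set :as set])

;; Assumed record definition and sample data.
(defrecord Part [number name colour weight city])

(def parts
  #{(Part. "P1" "Nut"  "Red"   12 "London")
    (Part. "P2" "Bolt" "Green" 17 "Paris")})

;; Keep only the :name and :colour attributes of each tuple.
(def part-colours
  (set/project parts [:name :colour]))
```

Since each tuple is rebuilt from the chosen keys, the result holds plain maps rather than Part records, and any tuples that become identical collapse into one.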
Finally let us join the parts and shipments relations:
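A sketch of the join, assuming hypothetical Part and Shipment records whose key names differ, so the three-argument form of clojure.set/join is used with an explicit key mapping. If both records shared a field name, the two-argument natural join would do the same job.

```clojure
(require '[clojure.set :as set])

;; Assumed record definitions and sample data.
(defrecord Part     [number name colour weight city])
(defrecord Shipment [supplier part quantity])

(def parts
  #{(Part. "P1" "Nut"  "Red"   12 "London")
    (Part. "P2" "Bolt" "Green" 17 "Paris")})

(def shipments
  #{(Shipment. "S1" "P1" 300)
    (Shipment. "S1" "P2" 200)
    (Shipment. "S2" "P1" 300)})

;; Match :number in parts against :part in shipments.
;; parts has fewer tuples than shipments here; join appears to merge
;; onto the tuples of the smaller relation, so the record type of the
;; result may differ if the sizes are reversed.
(def shipped-parts
  (set/join parts shipments {:number :part}))
```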
Note that the result here is a set of the datatype Part, with the joined fields held as additional fields on Part.
So this seems to work perfectly. The operations also compose, since each one returns a relation that can be fed to the next, for example:
This will produce the names of the suppliers that have shipped parts and are located in Paris.
So all is good with relational algebra with datatypes in Clojure! Not quite. C. J. Date is a bit more picky than that: he really doesn’t like null values, and his ideal system has no place for them. Clojure has a null value, nil, and any value can be nil. So I can create a record in which all values are nil, for example:
Oh no, there are unknown values! We could create a factory function that ensures that you can’t pass in any nils, but you can still do this:
So nil can still worm its way into the datatype, and we’ve introduced unknown values into our relational algebra, which we probably don’t want to do. So more work is needed to satisfy C. J. Date. However, the core operations from clojure.set have done remarkably well, and they could easily sit at the core of a relational system with some wrapping to keep the evil nil at bay.
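A sketch of such a composed query, assuming hypothetical Supplier and Shipment records and sample data: join suppliers to shipments, select the tuples located in Paris, then project the name.

```clojure
(require '[clojure.set :as set])

;; Assumed record definitions and sample data.
(defrecord Supplier [number name status city])
(defrecord Shipment [supplier part quantity])

(def suppliers
  #{(Supplier. "S1" "Smith" 20 "London")
    (Supplier. "S2" "Jones" 10 "Paris")
    (Supplier. "S3" "Blake" 30 "Paris")})

(def shipments
  #{(Shipment. "S1" "P1" 300)
    (Shipment. "S1" "P2" 200)
    (Shipment. "S2" "P1" 300)})

;; Read from the inside out: join, then select, then project.
(def paris-shippers
  (set/project
    (set/select #(= (:city %) "Paris")
                (set/join suppliers shipments {:number :supplier}))
    [:name]))
;; => #{{:name "Jones"}}
```

Blake is in Paris but has no shipments in this sample data, so the join drops that tuple before the selection runs.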
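A sketch, assuming a Supplier record with four hypothetical fields. Nothing in defrecord validates the constructor arguments:

```clojure
;; Assumed record definition.
(defrecord Supplier [number name status city])

;; Every field is nil, yet this is a perfectly valid record.
(def nobody (Supplier. nil nil nil nil))
```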
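A sketch of the idea, assuming a hypothetical factory named make-supplier that uses a precondition to reject nil arguments. Two ways around it are shown; which one the original listing used is not known.

```clojure
;; Assumed record definition.
(defrecord Supplier [number name status city])

;; Hypothetical factory: the :pre condition throws an AssertionError
;; if any argument is nil (while assertions are enabled).
(defn make-supplier [number name status city]
  {:pre [(not-any? nil? [number name status city])]}
  (Supplier. number name status city))

(def smith (make-supplier "S1" "Smith" 20 "London"))

;; Ordinary map operations bypass the factory...
(def smith-nowhere (assoc smith :city nil))

;; ...and so does calling the constructor directly.
(def nobody (Supplier. nil nil nil nil))
```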