technology from back to front

Archive for the ‘Version control’ Category

Automating pre-deployment sanity checks with Grunt

Grunt is a great tool for building, running and deploying ‘Single Page Apps’. I have a single grunt command to build and deploy to S3 for production, but recently I added some extra functionality to make deployment safer and even easier:

  • Abort if you are not on master branch
  • Abort if there are any uncommitted local changes
  • Abort if not up to date with the origin repo
  • Create a file revision.txt containing the deployed git revision hash, so we can GET it from the server and be sure of which revision is live
  • Automatically create a tag with the date and time.

I found a few existing pieces to implement some of these, but not all of them, and I ended up with a set of custom Grunt tasks, which I present here in the hope that they are useful to others. They could perhaps be packaged up into a Grunt plugin.

With no further ado, here is the stripped down Gruntfile, just showing the parts relevant to this post, though the deploy-prod task definition leaves in the other task names for context in the overall flow.

module.exports = function(grunt) {

  // Load all grunt tasks matching the `grunt-*` pattern
  require('load-grunt-tasks')(grunt);

  grunt.initConfig({

    // Lots of other Grunty things
    // ...

    // Executing the 'gitinfo' command populates grunt.config.gitinfo with useful git information
    // (see the grunt-gitinfo documentation for details) plus results of our custom git commands.
    gitinfo: {
      commands: {
        'status': ['status', '--porcelain'],
        'origin-SHA': ['rev-parse', '--verify', 'origin']
      }
    },

    gittag: {
      prod: {
        options: {
          tag: 'prod-<%= grunt.template.today("ddmmyy-HHMM") %>'
        }
      }
    },

    shell: {
      gitfetch: {
        command: 'git fetch'
      },
      saverevision: {
        // Save the current git revision to a file that we can GET from the server, so we can
        // be sure exactly which version is live.
        command: 'echo <%= gitinfo.local.branch.current.SHA %> > revision.txt',
        options: {
          execOptions: {
            cwd: 'dist'
          }
        }
      }
    }
  });

  grunt.registerTask('check-branch', 'Check we are on required git branch', function(requiredBranch) {

    if (arguments.length === 0) {
      requiredBranch = 'master';
    }

    var currentBranch = grunt.config('gitinfo.local.branch.current.name');

    if (currentBranch !== requiredBranch) {
      grunt.log.error('Current branch is ' + currentBranch + ' - need to be on ' + requiredBranch);
      return false;
    }
  });

  grunt.registerTask('check-no-local-changes', 'Check there are no uncommitted changes', function() {

    var status = grunt.config('gitinfo.status');

    if (status !== '') {
      grunt.log.error('There are uncommitted local modifications.');
      return false;
    }
  });

  grunt.registerTask('check-up-to-date', 'Check code is up to date with remote repo', function() {

    var localSha = grunt.config('gitinfo.local.branch.current.SHA');
    var originSha = grunt.config('gitinfo.origin-SHA');

    if (localSha !== originSha) {
      grunt.log.error('There are changes in the origin repo that you don\'t have.');
      return false;
    }
  });

  // Some of these tasks are of course omitted above, to keep the code sample focussed.
  grunt.registerTask('deploy-prod', ['build','prod-deploy-checks','gittag:prod','aws_s3:prod']);

  // Fetch before running gitinfo, so that origin-SHA reflects the latest state of the remote.
  grunt.registerTask('prod-deploy-checks', ['shell:gitfetch','gitinfo','check-branch:master','check-no-local-changes','check-up-to-date']);
};
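
The three checks boil down to simple comparisons on the data that grunt-gitinfo collects. As a rough sketch only, the same logic can be written as a pure function that is easy to unit test outside Grunt. The name `findDeployProblems` and the shape of its argument are assumptions made for illustration; they are not part of grunt-gitinfo or of the Gruntfile above.

```javascript
// Hypothetical helper: mirrors the three custom Grunt checks as a pure function.
// `info` is assumed to look like the relevant slice of grunt.config('gitinfo').
function findDeployProblems(info, requiredBranch = 'master') {
  const problems = [];
  const branch = info.local.branch.current;

  if (branch.name !== requiredBranch) {
    problems.push('Current branch is ' + branch.name + ' - need to be on ' + requiredBranch);
  }
  // `git status --porcelain` prints nothing when the working tree is clean.
  if (info.status !== '') {
    problems.push('There are uncommitted local modifications.');
  }
  if (branch.SHA !== info['origin-SHA']) {
    problems.push('Local revision differs from origin.');
  }
  return problems;
}

// Made-up sample data for a checkout that is safe to deploy.
const clean = {
  local: { branch: { current: { name: 'master', SHA: 'abc123' } } },
  status: '',
  'origin-SHA': 'abc123'
};
console.log(findDeployProblems(clean)); // an empty array means no problems were found
```

Returning a list of problems rather than stopping at the first one is a design choice of this sketch; the Grunt tasks abort at the first failed check.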

We rely on a few node modules:

  • grunt-git which provides canned tasks for performing a few common git activities. We use it for tagging here.
  • grunt-gitinfo which sets up a config hash with handy data from git, and allows adding custom items easily. This helps us to query the current state of things.
  • grunt-shell which lets us run arbitrary command line tasks. We use it to git fetch (not supported by grunt-git, though we could probably have abused gitinfo to do it) and to save the current revision to file. I hope that the command I use for that is cross-platform, even to Windows, but it’s only tested on Mac so far.

Hence I ended up with the following added to package.json:

    "grunt-git": "~0.2.14",
    "grunt-gitinfo": "~0.1.6",
    "grunt-shell": "~0.7.0"
Sam Carr

Enhancing peer review through GitHub

You love GitHub. Of course you do. You love peer review. You especially love sending a pull request back asking for nits to be picked. So when your submitter claims to have addressed your concerns, how do you check? You could walk the commits. You could diff the entire pull request against master. If only you could diff the HEAD of the pull request against the original state of the pull request, letting you check just the new set of commits…

With github-differ you can!

Simply add this tiny extension[1] to your Chrome, and it will decorate each commit in GitHub’s Commits tab. Pick any two commits, and the extension will redirect you to a page showing the comparison of those two commits! Job done!

[1] The JavaScript involved is so small that it should be trivial to port this to Firefox’s Greasemonkey framework.
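
The extension's source is not reproduced here, so the following is only a guess at the kind of redirect involved. GitHub's compare view takes the two revisions in the URL, so a link between two picked commits could plausibly be built as below. The function name and the owner, repository and SHA values are made-up placeholders.

```javascript
// Hypothetical helper: builds a GitHub compare URL for two commits.
// This is an assumption about how such a redirect could work, not the
// extension's actual code.
function compareUrl(owner, repo, baseSha, headSha) {
  return 'https://github.com/' + encodeURIComponent(owner) + '/' +
    encodeURIComponent(repo) + '/compare/' + baseSha + '...' + headSha;
}

console.log(compareUrl('example-org', 'example-repo', '1a2b3c4', '5d6e7f8'));
// prints https://github.com/example-org/example-repo/compare/1a2b3c4...5d6e7f8
```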

Frank Shearar

Continuous Integration for GitHub Pull Requests with TeamCity

Most developers with an interest in open source software these days have seen the GitHub interface for handling pull requests and, relatedly, Travis CI’s support for pull requests. So we thought it’d be useful to have something similar for our internal CI system.

Read more…

Ceri Storey

mercurial-server 0.8 released

mercurial-server home page

mercurial-server gives your developers remote read/write access to centralized Mercurial repositories using SSH public key authentication; it provides convenient and fine-grained key management and access control.

Read more…

Paul Crowley

Mix and match version control

LShift’s standard version control platform these days is Mercurial, but just before we adopted it, I started a project using Trac and Subversion, mostly because that’s what Trac does out of the box.

Later, we branched the project to add a large new project, and during that branch we converted from using ant to Maven and modularised the project, resulting in a lot of moved files. This made what we were doing on the branch a lot easier, but left us with a merge that Subversion wasn’t capable of, even though we had used svn mv to move all the files.

What was capable of the merge was Mercurial. I imported the whole Subversion repository using hg convert. See the convert extension documentation. It works exactly as described, but make sure you have 1.0.1 or later – I had problems with earlier versions.

The merge went reasonably well, so I was left with a merged version in a Mercurial repository. I was going to switch to using Mercurial, and its Trac integration, when I discovered that the integration couldn’t cope with multiple repositories. The Trac instance was managing several different source projects, which would have to go into several Mercurial repositories, which I couldn’t merge together in any satisfactory way.

There are several projects around to address this (I’ll probably cover them in another post), none of which are ready for production yet. I decided the most expedient thing would be to try and generate a patch for my merge, and apply it to the Subversion repository.

A conventional patch would lose the version history of all the moved files, so I decided a git-format diff, which records renames, would do the job. You can certainly, with some patience, get git-svn to do this and understand what it is doing. Lacking that patience, I wrote a script instead. It parses the git diff and makes the calls to svn mv, svn add and svn rm that the diff requires, creating any directories needed along the way. It actually turned out to be a bit more work than I was expecting, so I’ve published it here.
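
The published script is not reproduced here. As a minimal sketch of the idea, and not the script itself, the file-level headers of a git-format diff can be turned into svn bookkeeping commands as below. The function name, the split into `before` and `after` lists, and the sample paths are all assumptions made for illustration. The sketch also assumes that paths contain no spaces and leaves the content hunks to patch(1).

```javascript
// Hypothetical sketch: map git-format diff headers to svn commands.
// `before` holds commands to run before the content hunks are applied,
// `after` holds commands to run once patch(1) has created or emptied files.
// This is one plausible ordering; a real script would also need to check
// whether a directory is already versioned before creating it.
function svnCommandsFromGitDiff(diffText) {
  const before = [];
  const after = [];
  const madeDirs = new Set();
  let currentPath = null; // "b/" path from the most recent "diff --git" line
  let renameFrom = null;

  function ensureDir(path) {
    const slash = path.lastIndexOf('/');
    if (slash < 0) return; // file lives in the top-level directory
    const dir = path.slice(0, slash);
    if (!madeDirs.has(dir)) {
      madeDirs.add(dir);
      before.push('svn mkdir --parents ' + dir);
    }
  }

  for (const line of diffText.split('\n')) {
    let m;
    if ((m = /^diff --git a\/(\S+) b\/(\S+)$/.exec(line))) {
      currentPath = m[2];
    } else if ((m = /^rename from (\S+)$/.exec(line))) {
      renameFrom = m[1];
    } else if ((m = /^rename to (\S+)$/.exec(line)) && renameFrom !== null) {
      ensureDir(m[1]);
      before.push('svn mv ' + renameFrom + ' ' + m[1]);
      renameFrom = null;
    } else if (/^new file mode /.test(line) && currentPath !== null) {
      ensureDir(currentPath);
      after.push('svn add ' + currentPath);
    } else if (/^deleted file mode /.test(line) && currentPath !== null) {
      after.push('svn rm ' + currentPath);
    }
  }
  return { before, after };
}

// Made-up diff covering a rename, a new file and a deletion.
const sample = [
  'diff --git a/src/Foo.java b/core/src/main/java/Foo.java',
  'rename from src/Foo.java',
  'rename to core/src/main/java/Foo.java',
  'diff --git a/pom.xml b/pom.xml',
  'new file mode 100644',
  '--- /dev/null',
  '+++ b/pom.xml',
  '@@ -0,0 +1 @@',
  '+<project/>',
  'diff --git a/build.xml b/build.xml',
  'deleted file mode 100644'
].join('\n');

console.log(svnCommandsFromGitDiff(sample));
```

Hunk lines always start with a space, `+` or `-`, so file contents that happen to contain the text "rename from" are not mistaken for headers.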


Smalltalk vs. Javascript; Diff and Diff3 for Squeak Smalltalk

Many of my recent posts here have discussed the diff and diff3 code I wrote in Javascript. A couple of weekends ago I sat down and translated the code into Squeak Smalltalk. The experience of writing the “same code” for the two different environments let me compare them fairly directly.

To sum up, Smalltalk was much more pleasant than working with Javascript, and produced higher-quality code (in my opinion) in less time. It was nice to be reminded that there are some programming languages and environments that are actually pleasant to use.

The biggest win was Smalltalk’s collection objects. Where stock Javascript limits you to the non-polymorphic

for (var index = 0; index < someArray.length; index++) {
  var item = someArray[index];
  /* do something with item, and/or index */
}

Smalltalk permits

someCollection do: [:item | "do something with item"].

or, alternatively

someCollection withIndexDo:
    [:item :index | "do something with item and index"].

Smalltalk collections are properly object-oriented, meaning that the code above is fully polymorphic. The Javascript equivalent only works with the built-in, not-even-proper-object Arrays.

Of course, I could use one of the many, many, many, many Javascript support libraries that are out there; the nice thing about Smalltalk is that I don’t have to find and configure an ill-fitting third-party bolt-on collections library, and that because the standard library is simple yet rich, I don’t have to worry about potential incompatibilities between third-party libraries, such as can occur in Javascript if you’re mixing and matching code from several sources.

Other points that occurred to me as I was working:

  • Smalltalk has simple, sane syntax; Javascript… doesn’t. (The number of times I get caught out by the semantics of this alone…!)
  • Smalltalk has simple, sane scoping rules; Javascript doesn’t. (O, for lexical scope!)
  • Smalltalk’s uniform, integrated development tools (including automated refactorings and an excellent object explorer) helped keep the code clean and object-oriented.
  • The built-in SUnit test runner let me develop unit tests alongside the code.

The end result of a couple of hours’ hacking is an implementation of Hunt-McIlroy text diff (that works over arbitrary SequenceableCollections, and has room for alternative diff implementations) and a diff3 merge engine, with a few unit tests. You can read a fileout of the code, or use Monticello to load the DiffMerge module from my public Monticello repository. [Update: Use the DiffMerge Monticello repository on SqueakSource.]

If Monticello didn’t already exist, it’d be a very straightforward matter indeed to build a DVCS for Smalltalk from here. I wonder if Spoon could use something along these lines?

It also occurred to me it’d be a great thing to use OMeta/JS to support the use of

<script type="text/smalltalk">"<![CDATA["
  (document getElementById: 'someId') innerHTML: '<p>Hello, world!</p>'
"]]>"</script>

by compiling it to Javascript at load-time (or off-line). Smalltalk would make a much better language for AJAX client-side programming.


Adding distributed version control to TiddlyWiki

After my talk on Javascript DVCS at the Osmosoft Open Source Show’n’tell, I went to visit Osmosoft, the developers of TiddlyWiki, to talk about giving TiddlyWiki some DVCS-like abilities. Martin Budden and I sat down and built a couple of prototypes: one where each tiddler is versioned every time it is edited, and one where versions are snapshots of the entire wiki, and are created each time the whole wiki is saved to disk.

Regular DVCS                SynchroTiddly
--------------------------  ---------------------------------
Repository                  The html file contains everything
File within repository      Tiddler within wiki
Commit a revision           Save the wiki to disk
Save a text file            Edit a tiddler
Push/pull synchronisation   Import from other file

If you have Firefox (it doesn’t work with other browsers yet!) you can experiment with an alpha-quality DVCS-enabled TiddlyWiki here. Take a look at the “Versions” tab, in the control panel at the right-hand-side of the page. You’ll have to download it to your local hard disk if you want to save any changes.

It’s still a prototype, a work-in-progress: the user interface for version management is clunky, it’s not cross-browser, there are issues with shadow tiddlers, and I’d like to experiment with a slightly different factoring of the repository format, but it’s good enough to get a feel for the kinds of things you might try with a DVCS-enabled TiddlyWiki.

Despite its prototypical status, it can synchronize content between different instances of itself. For example, you can have a copy of a SynchroTiddly on your laptop, email it to someone else or share it via HTTP, and import and merge their changes when they make their modified copy visible via an HTTP server or email it back to you.

I’ve been documenting it in the wiki itself — if anyone tries it out, please feel free to contribute more documentation; you could even make your altered wiki instance available via public HTTP so I can import and merge your changes back in.


Mercurial merge technique

We’re using Mercurial here at LShift for much of our development work, now, and we’re finding it a great tool. We make heavy use of branches (“branch per bug”) for many projects, and this is also a pretty smooth experience. One issue that has come up is policy regarding merging the trunk (“default”) into any long-lived feature/bug branches: should you do it, or should you not?

My vote is that you should merge default into long-lived branches fairly regularly; otherwise, you have a big-bang, all-at-once nightmare of a merge looming ahead of you. If you do merge frequently, though, there’s one subtlety to be aware of: hg diff is not history aware, so in order to get an accurate, focussed picture of all the changes that have been made on your long-lived branch, you need to do one of two things:

* either, merge default into your long-lived branch right before you merge the long-lived branch back into default, and run hg diff after that’s complete; or
* (recommended) do a throw-away test-merge of the long-lived branch into default directly.

Imagine a history like this:

 (2) (3)
  |   |
   \ /
   (1)

… where (1) is an ancestral revision, (2) is the default branch, and (3) is the long-lived branch – let’s call it “foo”.

Given this history, running hg update -C default (to make the working copy be the default branch, i.e. revision (2)) followed by hg diff foo will give you a misleading diff – one that undoes the changes from (1) to (2) before applying the changes from (1) to (3). This is almost certainly not what you want!

Instead, run a test merge, by hg update -C default followed by hg merge foo and then plain old hg diff. Note that this modifies your working copy! You will need to revert (by hg update -C default) if you decide the merge isn’t ready to be committed.

The output of hg diff after the hg merge shows a history-aware summary of the changes that the merge would introduce to your checked-out branch. It’s this history-awareness (“three-way merge”) that makes it so much superior to the history-unaware simple diff (“two-way merge”).
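
To make the difference concrete, here is a toy model in JavaScript. It is not how Mercurial works internally: a revision is reduced to a map from file name to contents, and the merge works on whole files only. The revisions, file names and function names are all invented for the illustration.

```javascript
// Toy model: a revision is a map from file name to contents.
const ancestor   = { 'a.txt': 'one', 'b.txt': 'two' };               // (1)
const defaultTip = { 'a.txt': 'one', 'b.txt': 'two, fixed' };        // (2)
const fooTip     = { 'a.txt': 'one, plus feature', 'b.txt': 'two' }; // (3)

// Two-way diff: every file whose contents differ between two revisions.
function diff(from, to) {
  const changes = {};
  for (const name of new Set([...Object.keys(from), ...Object.keys(to)])) {
    if (from[name] !== to[name]) {
      changes[name] = { from: from[name], to: to[name] };
    }
  }
  return changes;
}

// Three-way merge at whole-file granularity: keep whichever side changed a
// file relative to the ancestor; files changed differently on both sides
// are reported as conflicts (and provisionally keep our version).
function merge(base, ours, theirs) {
  const merged = {};
  const conflicts = [];
  const names = new Set([...Object.keys(base), ...Object.keys(ours), ...Object.keys(theirs)]);
  for (const name of names) {
    let value;
    if (ours[name] === theirs[name]) value = ours[name];
    else if (ours[name] === base[name]) value = theirs[name];
    else if (theirs[name] === base[name]) value = ours[name];
    else { conflicts.push(name); value = ours[name]; }
    if (value !== undefined) merged[name] = value;
  }
  return { merged, conflicts };
}

// Like "hg diff foo" from default: also reverts b.txt, undoing default's fix.
console.log(diff(defaultTip, fooTip));

// Like "hg merge foo" then "hg diff": only a.txt, the change foo really adds.
console.log(diff(defaultTip, merge(ancestor, defaultTip, fooTip).merged));
```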


diff3, merging, and distributed version control

Yesterday I presented my work on Javascript diff, diff3, merging and version control at the Osmosoft Open Source Show ‘n Tell. (Previous posts about this stuff: here and here.)
The slides for the talk are here. They’re a work-in-progress – as I think of things, I’ll continue to update them.

To summarise: I’ve used the diff3 I built in May to make a simple Javascript distributed version-control system that manages a collection of JSON structures. It supports named branches, merging, and import/export of revisions. So far, there’s no network synchronisation protocol, although it’d be easy to build a simple one using the rev import/export feature and XMLHttpRequest, and the storage format and repository representation is brutally naive (and because it doesn’t yet delta-compress historical versions of files, it is a bit wasteful of memory).

You can try out a few browser-based demos of the features of the diff and DVCS libraries:

* a demo of diff, comm, and patch functionality.
* a demo of three-way merge and conflict-handling functionality.
* a demo of a Javascript DVCS, a bit like Mercurial, that manages a collection of JSON objects (presenting them in a file-like way, for the purposes of the demo).

The code is available using Mercurial by hg clone (or by simply browsing to that URL and exploring from there). It’s quite small and (I hope) easily understood – at the time of writing,

* the diff/diff3 code and support utilities are ~310 lines; and
* the DVCS code is ~370 lines.

The core interfaces, algorithms and internal structures of the DVCS code seem quite usable to me. In order to get to an efficient DVCS from here, the issues of storage and network formats will have to be addressed. Fortunately, storage and network formats are only about efficiency, not about features or correctness, and so they can be addressed separately from the core system. It will also eventually be necessary to revisit the naive LCA-computation code I’ve written, which is used to select an ancestor for use in a merge.
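
For readers unfamiliar with the term, a naive least-common-ancestor computation over a revision graph might look like the sketch below. It is not the code from the repository: the `parents` map and the function names are assumptions made for illustration, and the approach is deliberately simple rather than efficient.

```javascript
// Hypothetical repository shape: each revision id maps to the ids of its parents.
const parents = {
  r1: [],
  r2: ['r1'],
  r3: ['r1'],
  r4: ['r2'],
  r5: ['r3', 'r2']
};

// All ancestors of a revision, including the revision itself.
function ancestorsOf(parents, rev) {
  const seen = new Set();
  const stack = [rev];
  while (stack.length > 0) {
    const current = stack.pop();
    if (seen.has(current)) continue;
    seen.add(current);
    stack.push(...parents[current]);
  }
  return seen;
}

// Least common ancestors: common ancestors that are not themselves an
// ancestor of another common ancestor. More than one result means the
// choice of merge base is ambiguous.
function leastCommonAncestors(parents, a, b) {
  const ancestorsOfB = ancestorsOf(parents, b);
  const common = [...ancestorsOf(parents, a)].filter(rev => ancestorsOfB.has(rev));
  return common.filter(candidate =>
    !common.some(other => other !== candidate && ancestorsOf(parents, other).has(candidate)));
}

console.log(leastCommonAncestors(parents, 'r4', 'r5')); // [ 'r2' ]
```

A criss-cross history, where each branch has merged the other, yields two candidates; handling that case is one reason the ancestor selection may need revisiting.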

The code is split into a few different files:

* The sources for the diff and diff3 demos and the DVCS demo. In the latter, check out the definition of presets.preset1 for an example of how to use the DVCS, and presets.ambiguousLCA for an example of the repository format and the use of the revision import feature.
* The diff and diff3 code itself.
* Graph utilities (for computing LCA etc).
* The DVCS and pseudo-file-system code.
* The repository history-graph-drawing code and a python script for drawing the little tile images used in rendering a repository history graph.


Diff for Javascript, revisited

Last weekend I finally revisited the diff-in-javascript code I’d written a couple of years back, adding (very simple) patch-like and diff3-like functionality.

On the way, not only did I discover Khanna, Kunal and Pierce’s excellent paper “A Formal Investigation of Diff3”, but I also found the revision-control wiki, which I’m just starting to get my teeth into. I’m looking forward to learning more about merge algorithms.

The code I wrote last weekend is available: just download diff.js. The tools included:

* Diff.diff_comm – works like a simple Unix comm(1)
* Diff.diff_patch – works like a simple Unix diff(1)
* Diff.patch – works like a (very) simple Unix patch(1) (it’s not a patch on Wall’s patch)
* Diff.diff3_merge – works like a couple of the variations on GNU’s diff3(1)

Let’s try out a few examples. First, we need to set up a few “files” that we’ll run through the tools. The tools work with files represented as arrays of strings – each string representing one line in the file – so we’ll define three such arrays:

var base = "the quick brown fox jumped over a dog".split(/\s+/);
var derived1 = "the quick fox jumps over some lazy dog".split(/\s+/);
var derived2 =
  "the quick brown fox jumps over some record dog".split(/\s+/);

Examining base shows us that we have the right format:

js> uneval(base);
["the", "quick", "brown", "fox", "jumped", "over", "a", "dog"]

First, let’s run the subroutine I originally wrote back in 2006:

js> uneval(Diff.diff_comm(base, derived1));
[{common:["the", "quick"]},
 {file1:["brown"], file2:[]},
 {common:["fox"]},
 {file1:["jumped"], file2:["jumps"]},
 {common:["over"]},
 {file1:["a"], file2:["some", "lazy"]},
 {common:["dog"]}]

The result is an analysis of the chunks in the two files that are the same, and those that are different. The same results can be presented in a terser differential format, similar to Unix diff:

js> uneval(Diff.diff_patch(base, derived1));
[{file1:{offset:2, length:1, chunk:["brown"]},
  file2:{offset:2, length:0, chunk:[]}},

 {file1:{offset:4, length:1, chunk:["jumped"]},
  file2:{offset:3, length:1, chunk:["jumps"]}},

 {file1:{offset:6, length:1, chunk:["a"]},
  file2:{offset:5, length:2, chunk:["some", "lazy"]}}]

Note the similarity of the results to the line-numbers and line-counts that diff(1) outputs by default.
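
For the curious, chunk lists of the comm-like kind can be computed from a longest common subsequence. The sketch below is a textbook dynamic-programming version written for this illustration; it is not the code in diff.js, and the name `commChunks` is made up. It emits chunks in the same shape as the diff_comm output.

```javascript
// Hypothetical sketch: LCS-based diff producing comm-style chunks.
function commChunks(file1, file2) {
  const n = file1.length, m = file2.length;
  // lcs[i][j] = length of the longest common subsequence of file1[i..] and file2[j..]
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = file1[i] === file2[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  const chunks = [];
  let common = [], only1 = [], only2 = [];
  const flushCommon = () => {
    if (common.length) { chunks.push({ common }); common = []; }
  };
  const flushDiff = () => {
    if (only1.length || only2.length) {
      chunks.push({ file1: only1, file2: only2 });
      only1 = []; only2 = [];
    }
  };

  // Walk both files, following the table towards the longest common subsequence.
  let i = 0, j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && file1[i] === file2[j]) {
      flushDiff();
      common.push(file1[i]); i++; j++;
    } else if (j >= m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
      flushCommon();
      only1.push(file1[i]); i++;
    } else {
      flushCommon();
      only2.push(file2[j]); j++;
    }
  }
  flushCommon();
  flushDiff();
  return chunks;
}

const base = "the quick brown fox jumped over a dog".split(/\s+/);
const derived1 = "the quick fox jumps over some lazy dog".split(/\s+/);
console.log(JSON.stringify(commChunks(base, derived1)));
```

The table costs time and memory proportional to the product of the two file lengths, which is fine for a demonstration but is the reason practical diff tools use smarter algorithms.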

The next example takes a diff-like patch, and uses it to reconstruct a derived file from the base file:

js> uneval(Diff.patch(base, Diff.diff_patch(base, derived1)));
["the", "quick", "fox", "jumps", "over", "some", "lazy", "dog"]
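
Applying a patch in this format is mostly bookkeeping. The following is a sketch of the idea rather than the implementation in diff.js; `applyPatch` is a made-up name, and the sketch assumes the hunks are sorted by their offset in the original file, as they are in the output above.

```javascript
// Hypothetical sketch: apply a patch in the diff_patch format.
// Each hunk means "replace file1.length lines at file1.offset with file2.chunk".
function applyPatch(file, patch) {
  const result = [];
  let position = 0; // next unread index in the original file
  for (const hunk of patch) {
    // Copy the untouched lines that precede this hunk.
    result.push(...file.slice(position, hunk.file1.offset));
    // Emit the replacement lines and skip the lines being replaced.
    result.push(...hunk.file2.chunk);
    position = hunk.file1.offset + hunk.file1.length;
  }
  // Copy whatever follows the last hunk.
  result.push(...file.slice(position));
  return result;
}

const base = "the quick brown fox jumped over a dog".split(/\s+/);
const patch = [
  { file1: { offset: 2, length: 1, chunk: ["brown"] },
    file2: { offset: 2, length: 0, chunk: [] } },
  { file1: { offset: 4, length: 1, chunk: ["jumped"] },
    file2: { offset: 3, length: 1, chunk: ["jumps"] } },
  { file1: { offset: 6, length: 1, chunk: ["a"] },
    file2: { offset: 5, length: 2, chunk: ["some", "lazy"] } }
];
console.log(applyPatch(base, patch).join(' '));
// prints: the quick fox jumps over some lazy dog
```

Unlike patch(1), this sketch never checks that the lines being replaced match `file1.chunk`, so it will happily apply a patch to the wrong file.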

Finally, we attempt a diff3 style merge of the changes made in derived1 and derived2, hopefully ending up with a file without conflicts:

js> uneval(Diff.diff3_merge(derived1, base, derived2, true));
[{ok:["the", "quick", "fox", "jumps", "over"]},
 {conflict:{a:["some", "lazy"], aIndex:5,
            o:["a"], oIndex:6,
            b:["some", "record"], bIndex:6}},
 {ok:["dog"]}]

We see that the result isn’t what we’d hoped for: while two regions are unproblematic, and merge cleanly, the algorithm couldn’t decide how to merge one part of the inputs, and has left the conflict details in the output for the user to resolve.
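
One way to hand such a result to the user is to flatten it into lines, wrapping each conflict in the familiar markers. The helper below is only a sketch and is not part of diff.js: `renderMerge` and its label parameters are made up for illustration. The sample data repeats the merge result, including the trailing clean region that the prose describes.

```javascript
// Hypothetical helper: flatten a diff3_merge-style result into lines,
// marking each conflict in the style used by diff3(1) and most VCSs.
function renderMerge(regions, labelA = 'a', labelB = 'b') {
  const lines = [];
  for (const region of regions) {
    if (region.ok) {
      lines.push(...region.ok);
    } else {
      lines.push('<<<<<<< ' + labelA, ...region.conflict.a,
                 '=======', ...region.conflict.b,
                 '>>>>>>> ' + labelB);
    }
  }
  return lines;
}

const merged = [
  { ok: ["the", "quick", "fox", "jumps", "over"] },
  { conflict: { a: ["some", "lazy"], aIndex: 5,
                o: ["a"], oIndex: 6,
                b: ["some", "record"], bIndex: 6 } },
  { ok: ["dog"] } // the second cleanly merged region
];
console.log(renderMerge(merged, 'derived1', 'derived2').join('\n'));
```

The original text (`o`) is dropped from the rendering here; showing it between the two sides, as some tools do, would be a small extension.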

There are algorithms that give different (and usually better) results than diff3 – the paper I mentioned above explains some of the problems with diff3, and I’m looking forward to reading about what alternatives others have come up with at the revision-control wiki.







2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us