technology from back to front

diff3, merging, and distributed version control

Yesterday I presented my work on Javascript diff, diff3, merging and version control at the Osmosoft Open Source Show ‘n Tell. (Previous posts about this stuff: here and here.)
The slides for the talk are here. They’re a work-in-progress – as I think of things, I’ll continue to update them.

To summarise: I’ve used the diff3 I built in May to make a simple Javascript distributed version-control system that manages a collection of JSON structures. It supports named branches, merging, and import/export of revisions. So far there’s no network synchronisation protocol, although it’d be easy to build a simple one using the rev import/export feature and XMLHttpRequest. The storage format and repository representation are brutally naive, and because the system doesn’t yet delta-compress historical versions of files, it is a bit wasteful of memory.
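To give a feel for what merging a collection of JSON structures involves, here is a minimal sketch of a three-way merge done key by key. It is an illustration only, not the code from the DVCS library: the function name `mergeCollections` and the shape of its result are made up for this example, and it compares whole values rather than running a textual diff3 inside each entry.

```javascript
// Hypothetical helper, NOT part of the libraries described in this post.
// Merges two edited copies of a JSON collection against their common ancestor.
function mergeCollections(base, ours, theirs) {
  const merged = {};
  const conflicts = [];
  const keys = new Set([
    ...Object.keys(base),
    ...Object.keys(ours),
    ...Object.keys(theirs),
  ]);
  // Compare by serialised form. Good enough for a sketch, but note that it is
  // sensitive to property order inside nested objects.
  const same = (x, y) => JSON.stringify(x) === JSON.stringify(y);

  for (const key of keys) {
    const b = base[key], o = ours[key], t = theirs[key];
    if (same(o, t)) {
      // Both sides agree (this includes both sides deleting the entry).
      if (o !== undefined) merged[key] = o;
    } else if (same(b, o)) {
      // Only "theirs" changed the entry (or deleted it).
      if (t !== undefined) merged[key] = t;
    } else if (same(b, t)) {
      // Only "ours" changed the entry (or deleted it).
      if (o !== undefined) merged[key] = o;
    } else {
      // Both sides changed the entry in different ways.
      conflicts.push(key);
    }
  }
  return { merged, conflicts };
}

const result = mergeCollections(
  { title: "Notes", body: "v1" },
  { title: "Notes", body: "v2" },
  { title: "My notes", body: "v1" }
);
console.log(result.merged, result.conflicts);
```

The important design point is that the merge needs the common ancestor: without `base` there is no way to tell "ours changed it" from "theirs changed it back".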

You can try out a few browser-based demos of the features of the diff and DVCS libraries:

* a demo of diff, comm, and patch functionality.
* a demo of three-way merge and conflict-handling functionality.
* a demo of a Javascript DVCS, a bit like Mercurial, that manages a collection of JSON objects (presenting them in a file-like way, for the purposes of the demo).
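For readers who haven’t seen how a line-based diff works, the following is a small sketch of one common approach, based on a dynamic-programming longest common subsequence (LCS). The names `diffLines`, `keep`, `remove` and `add` are invented for this example, and the library’s own algorithm and output format may well differ.

```javascript
// Illustrative sketch only; names and output shape are assumptions.
// Returns a list of operations that turns the lines of `a` into the lines of `b`.
function diffLines(a, b) {
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting keep/remove/add operations.
  const ops = [];
  let i = 0, j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ op: "keep", line: a[i] });
      i++; j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ op: "remove", line: a[i] });
      i++;
    } else {
      ops.push({ op: "add", line: b[j] });
      j++;
    }
  }
  while (i < a.length) ops.push({ op: "remove", line: a[i++] });
  while (j < b.length) ops.push({ op: "add", line: b[j++] });
  return ops;
}

// "Patching" is then just replaying the operations: the new file is every
// kept or added line, in order.
function applyDiff(ops) {
  return ops.filter((o) => o.op !== "remove").map((o) => o.line);
}
```

A three-way merge can be built on the same idea by diffing the ancestor against each of the two derived versions and comparing where the changes fall.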

The code is available using Mercurial by hg clone (or by simply browsing to that URL and exploring from there). It’s quite small and (I hope) easily understood – at the time of writing,

* the diff/diff3 code and support utilities are ~310 lines; and
* the DVCS code is ~370 lines.

The core interfaces, algorithms and internal structures of the DVCS code seem quite usable to me. In order to get to an efficient DVCS from here, the issues of storage and network formats will have to be addressed. Fortunately, storage and network formats are only about efficiency, not about features or correctness, and so they can be addressed separately from the core system. It will also eventually be necessary to revisit the naive LCA-computation code I’ve written, which is used to select an ancestor for use in a merge.
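As a rough picture of what a naive LCA computation can look like, here is a sketch that works on a revision graph stored as a map from revision ID to a list of parent IDs. That representation, and the names `ancestors` and `lowestCommonAncestors`, are assumptions made for the example and are not taken from the repository code.

```javascript
// Illustrative sketch; the graph representation is an assumption.
// `parents` maps a revision ID to an array of its parent revision IDs.

// Collects `rev` and everything reachable from it by following parent links.
function ancestors(parents, rev) {
  const seen = new Set();
  const stack = [rev];
  while (stack.length > 0) {
    const r = stack.pop();
    if (seen.has(r)) continue;
    seen.add(r);
    for (const p of parents[r] || []) stack.push(p);
  }
  return seen;
}

// Returns every common ancestor of `a` and `b` that is not itself a proper
// ancestor of another common ancestor. There can be more than one.
function lowestCommonAncestors(parents, a, b) {
  const ancA = ancestors(parents, a);
  const common = [...ancestors(parents, b)].filter((r) => ancA.has(r));
  return common.filter(
    (c) => !common.some((d) => d !== c && ancestors(parents, d).has(c))
  );
}

// A "criss-cross" history: each branch has merged the other once.
const history = {
  root: [],
  a1: ["root"],
  b1: ["root"],
  a2: ["a1", "b1"],
  b2: ["b1", "a1"],
};
console.log(lowestCommonAncestors(history, "a2", "b2"));
```

This recomputes ancestor sets repeatedly, so it is only suitable for small histories. It does show why selecting an ancestor needs care: in the criss-cross history above there are two equally good candidates, and a merge has to pick one somehow.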

The code is split into a few different files:

* The sources for the diff and diff3 demos and the DVCS demo. In the latter, check out the definition of presets.preset1 for an example of how to use the DVCS, and presets.ambiguousLCA for an example of the repository format and the use of the revision import feature.
* The diff and diff3 code itself.
* Graph utilities (for computing LCA etc).
* The DVCS and pseudo-file-system code.
* The repository history-graph-drawing code and a Python script for drawing the little tile images used in rendering a repository history graph.

  1. ((I accidentally deleted a bunch of comments I didn’t mean to delete today, so I’m having to repost them manually:))

    Elliot Murphy wrote:

    Wow, this is very cool. Thanks for posting it. What are you using this for?

  2.

    FND wrote:

    Very cool indeed!
    I’m looking forward to seeing this implemented as a TiddlyWiki plugin and/or vertical.

  3.

    tonyg wrote:

    The idea is to see if it can be put into TiddlyWiki somehow – but there are plenty of other uses for a decentralised JSON synchronisation mechanism: bookmarks, calendar entries, contact lists, …



2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK, +44 (0)20 7729 7060