technology from back to front

Web: Embedded video and progressive download: A Quiz

I will provide you with two video files, video1.flv and video2.wmv. You need to embed them on the page and ensure that they use progressive download. Both video files are larger than 1GB, so it will be obvious whether they start playing before they have completely downloaded. You will need to use the Flash video player that I have provided for the Flash video. Which one of the HTML snippets shown below should you use?

Snippet A

<object type="application/x-shockwave-flash" data="/player.swf" >
  <param name="movie" value="/player.swf"/>
  <param name="FlashVars" value="flv=/video1.flv"/>
</object>

<object type="video/x-ms-wmv"> <param name="FileName" value="/video2.wmv"/> </object>

Snippet B

<object type="application/x-shockwave-flash" data="/player.swf" >
  <param name="movie" value="/player.swf"/>
  <param name="FlashVars" value="flv=http://myserver.lshift.net/video1.flv"/>
</object>

<object type="video/x-ms-wmv"> <param name="FileName" value="http://myserver.lshift.net/video2.wmv"/> </object>

(more…)

by tim on 13/02/11

Web: A Custom ASP.Net Navigation Component for EpiServer CMS

LShift have used the EpiServer CMS on several customer projects, and it generally does most things you would want a CMS to do in a simple way. EpiServer is a .NET-based CMS, and if you understand ASP.NET templated pages and templated controls it is very straightforward, with a minimal learning curve.

One challenge I faced on a recent project was to implement a particular HTML navigation design using EpiServer. The HTML design called for the navigation to be rendered as nested HTML lists with the current section of the site annotated with a particular class.

For example, if you were looking at “Tasty Fish” in the “Cat Food” section of the site, the HTML should look something like this:

<ul>
    <li>Dog Food
        <ul>
            <li>Meaty Bones</li>
        </ul>
    </li>
    <li class="selected">Cat Food
        <ul>
            <li>Tasty Fish</li>
        </ul>
    </li>
</ul>

On initial inspection the EpiServer CMS appears to have two controls that may help: the EpiServer:MenuList and the EpiServer:PageTree. I first attempted to use the EpiServer:MenuList, which allowed me to do this:

<ul>
    <li>Dog Food</li>
    <li>Cat Food</li>
</ul>
<ul>
    <li class="selected">Tasty Fish</li>
</ul>

This isn’t quite what the design required: the complete site navigation tree needed to be rendered, since CSS was being used to show and hide menus in response to mouse rollovers.

So for attempt two I tried the EpiServer:PageTree component; this component is designed to render a whole tree of pages so it should be an appropriate solution. It is a very flexible component and provides lots of templates for customising the layout based upon the state of the tree. This is what I ended up with:

<ul>
    <li>Dog Food
        <ul>
            <li>Meaty Bones</li>
        </ul>
    </li>
    <li>Cat Food
        <ul>
            <li class="selected">Tasty Fish</li>  <!-- OH NO THIS IS WRONG -->
        </ul>
    </li>
</ul>

This was very close! However, it didn’t meet the design requirement: the top level item that contained the current page needed to be tagged with the CSS class, not the item corresponding to the current page. There didn’t seem to be an easy way to achieve this with the EPiServer components.

I decided I probably needed some type of custom control. I then proceeded to write three implementations of a navigation control, moving from sinful generation of HTML in a code behind, through my own templated control, until arriving at the obvious solution: the asp:ListView control with a simple code behind. This was a nice solution because it uses a standard ASP.NET component in a standard way, the complication of tagging the selected top level item could be hidden away in a small code behind, and the markup was completely under the control of the HTML developer.

The navigation section of the ASP page looked like this:

<asp:ListView ID="Level1" runat="server" ItemPlaceHolderID="Level1Item">
    <LayoutTemplate>
        <ul><asp:PlaceHolder ID="Level1Item" runat="server"/></ul>
    </LayoutTemplate>
    <ItemTemplate>
        <li class='<%# ((Boolean)Eval("Selected")) ? "selected" : "" %>'><%# Eval("Name") %>
            <%-- Bind the nested list to this item's Children collection --%>
            <asp:ListView ID="Level2" runat="server" ItemPlaceHolderID="Level2Item" DataSource='<%# Eval("Children") %>'>
                <LayoutTemplate>
                    <ul><asp:PlaceHolder ID="Level2Item" runat="server"/></ul>
                </LayoutTemplate>
                <ItemTemplate>
                    <li><%# Eval("Name") %></li>
                </ItemTemplate>
            </asp:ListView>
        </li>
    </ItemTemplate>
</asp:ListView>

This is a straightforward usage of nested ListViews and ASP.NET data binding expressions. All of the markup is visible, and it can be explained to an HTML developer in a short amount of time. New navigation levels can be added in exactly the same way that the Level2 navigation was added to the Level1 navigation. The ternary operator within the data binding expression,

class='<%# ((Boolean)Eval("Selected")) ? "selected" : "" %>'

determines whether the navigation item is selected; this is a standard mechanism for conditional rendering with ASP.NET data bound controls.

This was combined with a page behind like this:

protected override void OnLoad(System.EventArgs e)
{
    base.OnLoad(e);

    Level1.DataSource = BuildMenuItems();
    Level1.DataBind();
}

private List<MenuItem> BuildMenuItems()
{
    List<MenuItem> menuItems = new List<MenuItem>();

    PageData homePage = GetPage(PageReference.StartPage);
    foreach(PageData child in GetChildren(homePage.PageLink))
    {
        if(child.VisibleInMenu)
        {
            MenuItem item = CreateMenuItem(child, true);
            item.Selected = findPage(CurrentPage.PageGuid, child);
            menuItems.Add(item);
        }
    }

    return menuItems;
}

private MenuItem CreateMenuItem(PageData page, Boolean includeChildren)
{
    MenuItem item = new MenuItem(page.PageName);
    item.Url = page.LinkURL;

    if (includeChildren)
    {
        PageDataCollection children = GetChildren(page.PageLink);
        foreach (PageData child in children)
        {
            if (child.VisibleInMenu)
            {
                item.Children.Add(CreateMenuItem(child, true));
            }
        }
    }

    return item;
}

private Boolean findPage(Guid id, PageData parent)
{
    if (id == parent.PageGuid) return true;

    foreach (PageData page in GetChildren(parent.PageLink))
    {
        if (page.PageGuid == id)
        {
            return true;
        }
        if(findPage(id, page))
        {
            return true;
        }
    }

    return false;
}

With a helper class MenuItem defined like this:

public class MenuItem
{
    public MenuItem(String name)
    {
        this.Name = name;
    }

    public String Name { get; set; }
    public String Url { get; set; }
    public Boolean Selected { get; set; }

    private List<MenuItem> children = new List<MenuItem>();
    public List<MenuItem> Children
    {
        get { return children; }
        set { children = value; }
    }
}

The page behind creates MenuItem instances for each page in the navigation. A top level item gets tagged as selected only if the current page is that item itself or one of its descendants. This is a reasonable amount of code to write, but it was the smallest solution that solved the problem and kept the HTML obvious and available for modification by HTML developers.
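
The selection rule itself is independent of EpiServer and can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used on the project: the `Page` class, `contains` and `build_menu` are hypothetical names standing in for the CMS page type and the code behind.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    """Hypothetical stand-in for a CMS page; not an EpiServer type."""
    name: str
    children: List["Page"] = field(default_factory=list)


def contains(root: Page, target: Page) -> bool:
    """True if target is root itself or any descendant of root."""
    if root is target:
        return True
    return any(contains(child, target) for child in root.children)


def build_menu(home: Page, current: Page) -> list:
    """Return (name, selected) pairs for the top level items under home."""
    return [(section.name, contains(section, current))
            for section in home.children]


tasty_fish = Page("Tasty Fish")
home = Page("Home", [
    Page("Dog Food", [Page("Meaty Bones")]),
    Page("Cat Food", [tasty_fish]),
])
menu = build_menu(home, tasty_fish)
print(menu)  # [('Dog Food', False), ('Cat Food', True)]
```

Only the top level entry is flagged, which matches the nested-list HTML the design asked for.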

by tim on 29/11/09

Web: Untangling the BBC’s data feeds

Recently, Alan Ogilvie from A&Mi at the BBC announced that they were developing a “Feeds Hub”, and outlined their ambitions for it.

He also mentioned LShift, RabbitMQ and open source, and I would like to explain, from our point of view, what this project is and how we’re working with the BBC.

What is a “Feeds Hub”?

Alan describes the central problem they want to solve:

The number of new projects across the BBC starting to use feeds in creative ways is growing very quickly – just think of spaghetti… on a massive scale. So what do we do? What are the options? We could go down the route of gathering together a centralised ‘Feed Usage’ committee with members across the BBC, to ‘federate’ feeds so that they are all produced in the same way but, in practice, this never truly works and is likely to stifle creativity. Often it is quite difficult to convince people to work together when they have already experienced the freedom of doing what they want – often they are concerned that their projects will be delayed. Not all feeds sources that we use or want to use are under our control, things like Twitter, Flickr, blogs, etc. Federation will never solve all our problems anyway – for example, it can’t help when a source feed is turned off, it doesn’t monitor failures.

The idea, then, is to bring the spaghetti under control; not by mandating things be done a certain way, but by overlaying a bunch of management and monitoring tools that would otherwise be ad-hoc or not exist.

We also want to enable people to discover, reuse and adapt existing feeds, rather than reinvent them. Again, not by enforcement, but by making it easier to do so than to not.

And we’re not just talking about RSS — there are (at the BBC and in general) many different protocols and formats flying about.

Technically-speaking, this adds up to a couple of pieces of kit: a platform for relaying feeds through, that supports routing, transformation and distribution by a number of different means; and, a user interface for discovering, creating, managing and monitoring these feeds.

How are LShift involved?

In short: LShift are developing the core technology, helping the BBC shepherd the various strands of the project along, and helping engage with developers to build the open source aspect of the project (about which more in a bit).

LShift are the progenitors of RabbitMQ, a message broker implementing AMQP. Over the last few years we’ve been thinking about and experimenting with different applications of messaging (and not just AMQP); for example, Rabbiter, which puts a Twitter-like spin on XMPP.

In the meantime, RabbitMQ itself has gained client libraries, gateways, adapters, and a smart, active community, to the point where it’s no longer just an AMQP message broker — it’s becoming more like a universal messaging adapter.

So we were very enthused when we heard that the BBC wanted a feeds hub, because it seemed to bring together lots of what we’d been thinking about abstractly, as well as new ideas and problems to solve, and give it all a concrete purpose.

When and how will it be open source?

We’re working on a prototype, and our plan is to make the source public as soon as it’s fit for consumption. We hope this will be in the next month.

In the meantime, I may talk about some of the core technical ideas, and our plans, here on our blog; and, of course, you can follow LShift on Twitter and the Radiolabs blog.

by mikeb on 08/05/09

Web: Reverse HTTP == Remote CGI

I’ve been working recently on Reverse HTTP, an approach to making HTTP easier to use as the distributed object system that it is. My work is similar to the work of Lentczner and Preston, but is independently invented and technically a bit different: one, I’m using plain vanilla HTTP as a transport, and two, I’m focussing a little more on the enrollment, registration, queueing and management aspects of the system. My draft spec is here (though as I’m still polishing, please excuse its roughness), and you can play with some demos or download and play with an implementation of the spec.

Comments welcome!

by tonyg on 08/03/09

Web: Streamlining HTTP

HTTP/1.1 is a lovely protocol. Text-based, sophisticated, flexible. It does tend toward the verbose though. What if we wanted to use HTTP’s semantics in a very high-speed messaging situation? How could we mitigate the overhead of all those headers?

Now, bandwidth is pretty cheap: cheap enough that for most applications the kind of approach I suggest below is ridiculously far over the top. Some situations, though, really do need a more efficient protocol: I’m thinking of people having to consume the OPRA feed, which is fast approaching 1 million messages per second (1, 2, 3). What if, in some bizarre situation, HTTP was the protocol used to deliver a full OPRA feed?

Being Stateful

Instead of having each HTTP request start with a clean slate after the previous request on a given connection has been processed, how about giving connections a memory?

Let’s invent a syntax for HTTP that is easy to translate back to regular HTTP syntax, but that avoids repeating ourselves quite so much.

Each line starts with an opcode and a colon. The rest of the line is interpreted depending on the opcode. Each opcode-line is terminated with CRLF.

V:HTTP/1.x                          Set HTTP version identifier.
B:/some/base/url                    Set base URL for requests.
M:GET                               Set method for requests.
<:somename                          Retrieve a named configuration
>:somename                          Give the current configuration a name
H:Header: value                     Set a header
-:/url/suffix                       Issue a bodyless request
+:/url/suffix 12345                 Issue a request with a body

Opcodes V, B, M and H are hopefully self-explanatory. I’ll explore < and > below. The opcodes - and + actually complete each request and tell the server to process the message.

Opcode - takes as its argument a URL fragment that gets appended to the base URL set by opcode B. Opcode + does the same, but also takes an ASCII Content-Length value, which tells the server to read that many bytes after the CRLF of the + line, and to use the bytes read as the entity body of the HTTP request.

Content-Length is a slightly weird header, more properly associated with the entity body than the headers proper, which is why it gets special treatment. (We could also come up with a syntax for indicating chunked transfer encoding for the entity body.)

As an example, let’s encode the following POST request:

POST /someurl HTTP/1.1
Host: relay.localhost.lshift.net:8000
Content-Type: text/plain
Accept-Encoding: identity
Content-Length: 13

hello world

Encoded, this becomes

V:HTTP/1.1
B:/someurl
M:POST
H:Host: relay.localhost.lshift.net:8000
H:Content-Type: text/plain
H:Accept-Encoding: identity
+: 13
hello world

Not an obvious improvement. However, consider issuing 100 copies of that same request on a single connection. With plain HTTP, all the headers are repeated; with our encoded HTTP, the only part that is repeated is:

+: 13
hello world

Instead of sending (151 * 100) = 15100 bytes, we now send 130 + (20 * 100) = 2130 bytes.
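
The byte counts can be checked with a short Python sketch. It assumes that every line ends in CRLF and that the body is `hello world` followed by CRLF, which is what `Content-Length: 13` implies; the helper names are made up for this illustration.

```python
CRLF = "\r\n"

headers = [
    "Host: relay.localhost.lshift.net:8000",
    "Content-Type: text/plain",
    "Accept-Encoding: identity",
]
body = "hello world" + CRLF  # 13 bytes, matching Content-Length: 13


def plain_request(method, url, headers, body):
    """Ordinary HTTP/1.1 request text."""
    lines = ["%s %s HTTP/1.1" % (method, url)] + headers
    lines.append("Content-Length: %d" % len(body))
    return CRLF.join(lines) + CRLF + CRLF + body


def encoded_prelude(method, base_url, headers):
    """The V/B/M/H lines, sent once per connection."""
    lines = ["V:HTTP/1.1", "B:" + base_url, "M:" + method]
    lines += ["H:" + header for header in headers]
    return CRLF.join(lines) + CRLF


def encoded_request(suffix, body):
    """A '+' line followed by the entity body, sent for every request."""
    return "+:%s %d" % (suffix, len(body)) + CRLF + body


plain = len(plain_request("POST", "/someurl", headers, body))
prelude = len(encoded_prelude("POST", "/someurl", headers))
repeat = len(encoded_request("", body))
print(plain * 100, prelude + repeat * 100)  # 15100 2130
```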

The scheme as described so far takes care of the unchanging parts of repeated HTTP requests; for the changing parts, such as Accept and Referer headers, we need to make use of the < and > opcodes. Before I get into that, though, let’s take a look at how the scheme so far might work in the case of OPRA.

Measuring OPRA

Each OPRA quote update is on average 66 bytes long, making for around 63MB/s of raw content.

Let’s imagine that each delivery appears as a separate HTTP request:

POST /receiver HTTP/1.1
Host: opra-receiver.example.com
Content-Type: application/x-opra-quote
Accept-Encoding: identity
Content-Length: 66

blablablablablablablablablablablablablablablablablablablablablabla

That’s 213 bytes long: an overhead of 220% over the raw message content.

Encoded using the stateful scheme above, the first request appears on the wire as

V:HTTP/1.1
B:/receiver
M:POST
H:Host: opra-receiver.example.com
H:Content-Type: application/x-opra-quote
H:Accept-Encoding: identity
+: 66
blablablablablablablablablablablablablablablablablablablablablabla

and subsequent requests as

+: 66
blablablablablablablablablablablablablablablablablablablablablabla

for an amortized per-request size of 73 bytes: a much less problematic overhead of 11%. In summary:

Encoding      Bytes per      Per-message       Size increase      Bandwidth at
              message body   overhead (bytes)  over raw content   1M msgs/sec
Plain HTTP    66             147               220%               203.1 MBy/s
Encoded HTTP  66             7                 11%                69.6 MBy/s

Using plain HTTP, the feed doesn’t fit on a gigabit ethernet. Using our encoding scheme, it does.
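
The bandwidth figures can be reproduced with a little arithmetic. The sketch below assumes that "MBy" means 2^20 bytes (which is what makes the numbers above come out) and that a gigabit link carries 10^9 bits per second before any framing overhead.

```python
MESSAGES_PER_SECOND = 1000000
BODY_BYTES = 66
MIB = 1024 * 1024
GIGABIT_BYTES_PER_SECOND = 1000000000 / 8


def bandwidth_mib(overhead_bytes):
    """Bandwidth in MiB/s for the given per-message overhead."""
    return (BODY_BYTES + overhead_bytes) * MESSAGES_PER_SECOND / MIB


plain = bandwidth_mib(147)    # 213 bytes per message
encoded = bandwidth_mib(7)    # 73 bytes per message
limit = GIGABIT_BYTES_PER_SECOND / MIB

print("plain HTTP:   %.1f MiB/s" % plain)
print("encoded HTTP: %.1f MiB/s" % encoded)
print("gigabit link: %.1f MiB/s" % limit)
```

The encoded stream comes in under the link capacity, while the plain stream exceeds it by a wide margin.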

Besides the savings in terms of bandwidth, the encoding scheme could also help with saving CPU. After processing the headers once, the results of the processing could be cached, avoiding unnecessary repetition of potentially expensive calculations such as routing, authentication, and authorisation.

Almost-identical requests

Above, I mentioned that some headers changed, while others stayed the same from request to request. The < and > opcodes are intended to deal with just this situation.

The > opcode stores the current state in a named register, and the < opcode loads the current state from a register. Headers that don’t change between requests are placed into a register, and each request loads from that register before setting its request-specific headers.

To illustrate, imagine the following two requests:

GET / HTTP/1.1
Host: www.example.com
Cookie: key=value
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

GET /style.css HTTP/1.1
Host: www.example.com
Cookie: key=value
Referer: http://www.example.com/
Accept: text/css,*/*;q=0.1

One possible encoding is:

V:HTTP/1.1
B:/
M:GET
H:Host: www.example.com
H:Cookie: key=value
>:config1
H:Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
-:
<:config1
H:Referer: http://www.example.com/
H:Accept: text/css,*/*;q=0.1
-:style.css

By using <:config1, the second request reuses the stored settings for the method, base URL, HTTP version, and Host and Cookie headers.
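
To make the semantics of the opcodes concrete, here is a minimal decoder sketch in Python that expands an encoded stream back into ordinary requests. It is an illustration under a few assumptions that the description above leaves open: setting a header with H replaces any earlier value of the same name, the connection state persists after each request, and a register holds a full snapshot of that state. The function name `decode` and the tuple layout are invented for this example.

```python
import copy


def decode(stream):
    """Expand encoded bytes into (method, url, version, headers, body) tuples."""
    state = {"version": None, "base": "", "method": None, "headers": {}}
    registers = {}
    requests = []
    pos = 0
    while pos < len(stream):
        end = stream.index(b"\r\n", pos)
        line = stream[pos:end].decode("ascii")
        pos = end + 2
        opcode, _, arg = line.partition(":")
        if opcode == "V":
            state["version"] = arg
        elif opcode == "B":
            state["base"] = arg
        elif opcode == "M":
            state["method"] = arg
        elif opcode == "H":
            name, _, value = arg.partition(":")
            state["headers"][name] = value.strip()
        elif opcode == ">":
            registers[arg] = copy.deepcopy(state)   # name the current state
        elif opcode == "<":
            state = copy.deepcopy(registers[arg])   # restore a named state
        elif opcode in ("-", "+"):
            suffix, body = arg, b""
            if opcode == "+":
                # The argument is "<url suffix> <content length>".
                suffix, _, length = arg.rpartition(" ")
                body = stream[pos:pos + int(length)]
                pos += int(length)
            requests.append((state["method"], state["base"] + suffix,
                             state["version"], dict(state["headers"]), body))
        else:
            raise ValueError("unknown opcode: %r" % opcode)
    return requests


example = b"\r\n".join([
    b"V:HTTP/1.1", b"B:/", b"M:GET",
    b"H:Host: www.example.com", b"H:Cookie: key=value",
    b">:config1",
    b"H:Accept: text/html",
    b"-:",
    b"<:config1",
    b"H:Referer: http://www.example.com/",
    b"H:Accept: text/css,*/*;q=0.1",
    b"-:style.css",
]) + b"\r\n"

decoded = decode(example)
for request in decoded:
    print(request)
```

Because the register is saved before the first Accept header is set, the second request sees only the Host and Cookie headers plus the ones it sets itself.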

It’ll never catch on, of course — and I don’t mean for it to

Most applications of HTTP do fine using ordinary HTTP syntax. I’m not suggesting changing HTTP, or trying to get an encoding scheme like this deployed in any browser or webserver at all. The point of the exercise is to consider how low one might make the bandwidth overheads of a text-based protocol like HTTP for the specific case of a high-speed messaging scenario.

In situations where the semantics of HTTP make sense, but the syntax is just too verbose, schemes like this one can be useful on a point-to-point link. There’s no need for global support for an alternative syntax, since people who are already forming very specific contracts with each other for the exchange of information can choose to use it, or not, on a case-by-case basis.

Instead of specifying a whole new transport protocol for high-speed links, people can reuse the considerable amount of work that’s gone into HTTP, without paying the bandwidth price.

Aside: AMQP 0-8 / 0-9

Just as a throwaway comparison, I computed the minimum possible overhead for sending a 66-byte message using AMQP 0-8 or 0-9. Using a single-letter queue name, “q”, the overhead is 69 bytes per message, or 105% of the message body. For our OPRA example at 1M messages per second, that works out at 128.7 megabytes per second, and we’re back over the limit of a single gigabit ethernet again. Interestingly, despite AMQP’s binary nature, its overhead is much higher than a simple syntactic rearrangement of a text-based protocol in this case.

Conclusion

We considered the overhead of using plain HTTP in a high-speed messaging scenario, and invented a simple alternative syntax for HTTP that drastically reduces the wasted bandwidth.

For the specific example of the OPRA feed, the computed bandwidth requirement of the experimental syntax is only 11% higher than the raw data itself — nearly 3 times less than ordinary HTTP.


Note: this is a local mirror of this.

by tonyg on 27/02/09

Web: Jeff Lindsay on Web Hooks

From Jason Salas’s interview with Jeff Lindsay, the guy who invented the term web hooks:

“For example, the Facebook Platform, although pretty complicated and full of their own technology, is still at the core based on web hooks. They call out to a user-defined external web application and integrate that with their application. That’s quite a radically different use of web hooks compared to the way people think of them in relation to XMPP.”

That’s an interesting point: while nothing is stopping XMPP from being used this way, it’s not how it is currently used. XMPP seems to be gaining some adoption for asynchronous or messaging-style tasks, but I haven’t seen much in the way of generalised RPC over XMPP yet. (Perhaps I’ve overlooked something obvious?) HTTP, on the other hand, is being used both for asynchronous operations (HTTP push, where the HTTP response has no body, and serves as an acknowledgement of receipt or completion) and for synchronous RPC-like operations (JSON-RPC, SOAP, CGI, ordinary static web pages).

Web hooks can be seen as an approach to making it easier for people to participate in the world of distributed objects that is HTTP — a worthy goal.

by tonyg on 26/02/09

Web: EvServer, Introduction: The tale of a forgotten feature

A long, long time ago there was a WSGI spec. This document described a lot of interesting stuff. Among its other very important paragraphs you could find a hidden gem:

[...] applications will usually return an iterator (often a generator-iterator) that produces the output in a block-by-block fashion. These blocks may be broken to coincide with mulitpart boundaries (for “server push”), or just before time-consuming tasks (such as reading another block of an on-disk file). [...]

It means that all WSGI-conforming servers should be able to send multipart HTTP responses. A WSGI clock application could theoretically be written like this:

import datetime
import time

def clockdemo(environ, start_response):
    start_response("200 OK", [('Content-type', 'text/plain')])
    for i in range(100):
        yield "%s\n" % (datetime.datetime.now(),)
        time.sleep(1)  # blocks the serving thread for a second

The problem is that this way of programming just doesn’t work well. It’s not scalable, requires a lot of threads and can eat a lot of resources. That’s why the feature has been forgotten.

Until May 2008, when Christopher Stawarz reminded us of this feature and proposed an enhancement to it. He suggested that, instead of blocking inside the code (as time.sleep(1) does), a WSGI application should return a file descriptor to the server. When an event happens on this descriptor, the WSGI app is resumed. Here’s the equivalent of the previous code, but using the extension. With an appropriate server this could be scalable and work as expected:

import datetime
import socket

def clockdemo(environ, start_response):
    start_response("200 OK", [('Content-type', 'text/plain')])
    sd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i in range(100):
            # Resume when sd is readable or after a 1.0 second timeout.
            yield environ['x-wsgiorg.fdevent.readable'](sd, 1.0)
            yield "%s\n" % (datetime.datetime.now(),)
    except GeneratorExit:
        pass
    sd.close()
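
To show how a server might cooperate with such an application, here is a toy, single-request driver. It is only a sketch and not how EvServer works: `run_app` and its `send` callback are hypothetical, and it assumes the `x-wsgiorg.fdevent.readable` callable registers the wait and returns an empty string, which the application then yields. A real server would register the descriptor with its event loop and serve other requests in the meantime instead of waiting in `select()`.

```python
import datetime
import select
import socket


def run_app(app, send):
    """Drive one WSGI-style generator, pausing whenever it asks to wait."""
    pending = []  # (fd, timeout) registered since the last yield, if any

    def readable(fd, timeout=None):
        pending.append((fd, timeout))
        return ""  # the application yields this empty block back to us

    def start_response(status, headers):
        send("%s\n" % status)

    environ = {"x-wsgiorg.fdevent.readable": readable}
    for block in app(environ, start_response):
        if pending:
            fd, timeout = pending.pop()
            # Toy behaviour: wait right here until fd is readable or the
            # timeout expires, then resume the generator.
            select.select([fd], [], [], timeout)
        elif block:
            send(block)


def short_clock(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    sd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i in range(3):
            yield environ["x-wsgiorg.fdevent.readable"](sd, 0.01)
            yield "%s\n" % (datetime.datetime.now(),)
    finally:
        sd.close()


chunks = []
run_app(short_clock, chunks.append)
print("".join(chunks))
```

The application never blocks by itself; all waiting happens in the driver, which is the property an asynchronous server can exploit.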

So I created a server that supports it: EvServer, the Asynchronous Python WSGI Server.


Implementation

I did my best to implement the latest of the three versions of Chris’s proposal. The code is based on my hacked-together implementation of a very similar project, django-evserver, which was created way before the extension was invented and before I knew about the WSGI multipart feature.

EvServer is very small and lightweight: the core is about 1000 lines of Python code. Apparently, because EvServer uses ctypes bindings to libevent, it’s quite fast.

I did a basic test to see how fast it is. The methodology is very dumb: I just measure the number of handled WSGI requests per second, so the result reflects only the raw server speed. The difference is clearly visible:

Server                     Fetches/sec
evserver                   4254
spawning with threads      1237
spawning without threads   2200
cherrypy wsgi server       1700

Description

So what is EvServer, really?
  • It’s yet another WSGI server.
  • It’s fairly low-level: the WSGI application has control over almost every HTTP header.
  • It’s great for building COMET applications.
  • It’s fast and lightweight.
  • It’s feature complete.
  • Internally it’s asynchronous.
  • It’s simple to use.
  • It’s 100% written in Python, though it uses the libevent library, which is in C.
What is EvServer not?
  • Unfortunately, it’s not mature yet.
  • It’s Linux and Mac only.
  • It’s not a full-blown, Apache-like web server.
  • Currently it’s Python 2.5 only.

Examples

Admittedly, using raw WSGI for regular web applications is a bit inconvenient. Fortunately, decent web frameworks support passing iterators from the web application down to the WSGI server, all the way through the framework. My list of frameworks that support iterators includes Django and Web.py.

Django

Django 1.0 supports returning iterators from views. This is Django code for the clock example:

import datetime
import socket

from django.http import HttpResponse

def djangoclock(request):
    def iterator():
        sd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            while True:
                # Resume when sd is readable or after a 1.0 second timeout.
                yield request.environ['x-wsgiorg.fdevent.readable'](sd, 1.0)
                yield '%s\n' % (datetime.datetime.now(),)
        except GeneratorExit:
            pass
        sd.close()
    return HttpResponse(iterator(), mimetype="text/plain")

The problem is that this code is not going to work using the standard ./manage runserver development server. Fortunately, it’s very easy to integrate EvServer with Django; you only need to put this into settings.py:

INSTALLED_APPS = (
    [...]
    'django.contrib.sites',
    'evserver',             # <<< THIS LINE enables the runevserver command
)

Now you can test your app using ./manage runevserver.
Full source code for the example Django application is in the EvServer examples directory.

Web.py

Since version 0.3, Web.py supports returning iterators. You can see it in action here:

import datetime
import socket

import web

class webpyclock:
    def GET(self, name):
        web.header('Content-Type', 'text/plain', unique=True)
        environ = web.ctx.environ
        def iterable():
            sd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) # any udp socket
            try:
                while True:
                    yield environ['x-wsgiorg.fdevent.readable'](sd, 1.0)
                    yield "%s\n" % (datetime.datetime.now(),)
            except GeneratorExit:
                pass
            sd.close()
        return iterable()

The full source code is included in the EvServer examples directory. You can run this code using the command:

evserver --exec "import examples.framework_webpy; application = examples.framework_webpy.application"

Summary

I haven’t discussed any useful scenario yet; I’ll try to do that in a future post. I’m thinking of some interesting uses for EvServer, such as pushing data to the browser using COMET.


Web: Firefox tabs are finally usable

If you use Firefox, go and install the Ctrl-Tab add-on.

Tabs are great for reducing clutter, but they fail to make life much easier because the tab navigation doesn’t support the common patterns of use. For example, I end up opening the same page in multiple tabs because it is quicker to do that than to hunt for it in the existing tabs. Eventually even that is unmanageable and I have to manually garbage collect tabs.

Ctrl-Tab makes tabs usable again by fixing the navigation. The key combination Control-Tab cycles through the tab history (in the same way as task switchers), so that a single tap takes you to the previous tab you visited, which is almost always the thing you wanted. Successive taps of Tab show thumbnails of the tab contents while you flip through them. Better still, Control-A (Control-Shift-a) shows thumbnails of all open tabs, and lets you filter incrementally by typing in the auto-focussed box, then select the tab with the arrow keys or mouse. In a single swoop it’s now much easier for me to switch to something I have open than open it again – the way it should be.

by mikeb on 12/11/08

Web: Adding distributed version control to TiddlyWiki

After my talk on Javascript DVCS at the Osmosoft Open Source Show’n’tell, I went to visit Osmosoft, the developers of TiddlyWiki, to talk about giving TiddlyWiki some DVCS-like abilities. Martin Budden and I sat down and built a couple of prototypes: one where each tiddler is versioned every time it is edited, and one where versions are snapshots of the entire wiki, and are created each time the whole wiki is saved to disk.

Regular DVCS                SynchroTiddly
Repository                  The html file contains everything
File within repository      Tiddler within wiki
Commit a revision           Save the wiki to disk
Save a text file            Edit a tiddler
Push/pull synchronisation   Import from other file

If you have Firefox (it doesn’t work with other browsers yet!) you can experiment with an alpha-quality DVCS-enabled TiddlyWiki here. Take a look at the “Versions” tab, in the control panel at the right-hand-side of the page. You’ll have to download it to your local hard disk if you want to save any changes.

It’s still a prototype, a work-in-progress: the user interface for version management is clunky, it’s not cross-browser, there are issues with shadow tiddlers, and I’d like to experiment with a slightly different factoring of the repository format, but it’s good enough to get a feel for the kinds of things you might try with a DVCS-enabled TiddlyWiki.

Despite its prototypical status, it can synchronize content between different instances of itself. For example, you can have a copy of a SynchroTiddly on your laptop, email it to someone else or share it via HTTP, and import and merge their changes when they make their modified copy visible via an HTTP server or email it back to you.

I’ve been documenting it in the wiki itself — if anyone tries it out, please feel free to contribute more documentation; you could even make your altered wiki instance available via public HTTP so I can import and merge your changes back in.

by tonyg on 01/07/08

Web: diff3, merging, and distributed version control

Yesterday I presented my work on Javascript diff, diff3, merging and version control at the Osmosoft Open Source Show ‘n Tell. (Previous posts about this stuff: here and here.) The slides for the talk are here. They’re a work-in-progress – as I think of things, I’ll continue to update them.

To summarise: I’ve used the diff3 I built in May to make a simple Javascript distributed version-control system that manages a collection of JSON structures. It supports named branches, merging, and import/export of revisions. So far, there’s no network synchronisation protocol, although it’d be easy to build a simple one using the rev import/export feature and XMLHttpRequest, and the storage format and repository representation is brutally naive (and because it doesn’t yet delta-compress historical versions of files, it is a bit wasteful of memory).

You can try out a few browser-based demos of the features of the diff and DVCS libraries:

The code is available using Mercurial by

hg clone http://hg.opensource.lshift.net/synchrotron/

(or by simply browsing to that URL and exploring from there). It’s quite small and (I hope) easily understood – at the time of writing,

  • the diff/diff3 code and support utilities are ~310 lines; and
  • the DVCS code is ~370 lines.

The core interfaces, algorithms and internal structures of the DVCS code seem quite usable to me. In order to get to an efficient DVCS from here, the issues of storage and network formats will have to be addressed. Fortunately, storage and network formats are only about efficiency, not about features or correctness, and so they can be addressed separately from the core system. It will also eventually be necessary to revisit the naive LCA-computation code I’ve written, which is used to select an ancestor for use in a merge.
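
As an illustration of what a naive ancestor selection can look like, here is a small Python sketch. It is not the Synchrotron code: it assumes the history is available as a mapping from revision id to a list of parent ids, and the function names are invented for the example.

```python
def ancestors(parents, rev):
    """Return the set containing rev and every revision reachable via parents."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def lowest_common_ancestors(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        rev for rev in common
        if not any(rev != other and rev in ancestors(parents, other)
                   for other in common)
    }


simple = {"r0": [], "r1": ["r0"], "left": ["r1"], "right": ["r1"]}
print(lowest_common_ancestors(simple, "left", "right"))

# A criss-cross history: each branch has merged the other's first revision.
criss_cross = {
    "r0": [],
    "a1": ["r0"], "b1": ["r0"],
    "a2": ["a1", "b1"], "b2": ["b1", "a1"],
}
print(lowest_common_ancestors(criss_cross, "a2", "b2"))
```

The criss-cross case returns two candidates, which is why a merge tool needs some policy for choosing between ambiguous ancestors. The repeated graph walks make this quadratic or worse, so it is only suitable for small histories.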

The code is split into a few different files:

  • See presets.preset1 for an example of how to use the DVCS, and presets.ambiguousLCA for an example of the repository format and the use of the revision import feature.
  • The diff and diff3 code itself.
  • Graph utilities (for computing LCA etc).
  • The DVCS and pseudo-file-system code.
  • The repository history-graph-drawing code and a Python script for drawing the little tile images used in rendering a repository history graph.

by tonyg on 06/06/08
2000-13 LShift Ltd, 1st Floor Office, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK +44 (0)20 7729 7060