The problem with markup languages is that when I see a typo in the rendered output, I have to click through the source text and search for the exact place with the mistake. I have the same feeling about editing Wikipedia, documentation on code.google.com, Trac, Blogger, WordPress and so on.
But I hate writing in WYSIWYG editors even more. Almost all graphical editors generate crappy output: badly closed HTML tags, broken styles, stripped white space. Given these problems, I usually try to stay with markup.
The next problem is that I’m the only person who can fix mistakes in my texts. My friends tell me about typos, but I have to fix them by hand. I tried sharing texts on Google Docs, but the collaboration doesn’t work well enough.
A few months ago I saw Etherpad, an online real-time editor. That’s quite a cool toy. It solves the problem of sharing the text with my friends, but it doesn’t support any markup: it’s just a plain-text editor.
But I know how to create Comet applications easily using EvServer and Django. I realized that I could build a simplified Etherpad clone, which supports a markup language!
The Etherpad clone
I decided to spend a very limited time on this project. In fact, I wanted to do everything in my spare time within a week, which is about six afternoons.
Features I wanted, ordered by importance:
must support editing by many users in real time – like Etherpad
must generate rendered markup with reasonable latency
must support all major browsers (though IE and Konqueror are not a must)
must be dead simple
should be able to scale up (for reasons I’ll describe below)
should show who created what – like Etherpad. In the end I dropped this requirement due to the limited time.
With such tight time constraints I made some technology decisions up front:
Python on the server side
EvServer as a server
HAProxy as a load balancer
use EvServer’s Comet transports
RabbitMQ as a message broker
MemcacheDB as a database
Memcache for temporary storage
Support reStructuredText as a markup language
The hardest part of the project is synchronizing updates from many clients in real time: merging them, resolving collisions and propagating the result back to every user. Fortunately Neil Fraser from Google solved this problem and published the solution as a very nice project, diff-match-patch.
It seems that the only job I have to do is to glue these parts together.
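To give a feel for the merging problem, here is a minimal sketch of combining two users’ edits to the same base text. It is not diff-match-patch and not the code from this project: it uses Python’s standard `difflib` as a stand-in, the names `make_edits` and `merge` are made up for illustration, and it simply refuses to merge when two edits touch the same place, whereas diff-match-patch applies patches fuzzily.

```python
from difflib import SequenceMatcher


def make_edits(base, changed):
    """Describe how `changed` differs from `base` as (start, end, replacement) tuples."""
    matcher = SequenceMatcher(a=base, b=changed, autojunk=False)
    return [
        (i1, i2, changed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge(base, text_a, text_b):
    """Merge two edited copies of `base`; raise ValueError if the edits collide."""
    edits = sorted(make_edits(base, text_a) + make_edits(base, text_b))
    for (start1, end1, _), (start2, _, _) in zip(edits, edits[1:]):
        # Collision: the next edit starts inside the previous one, or both
        # start at exactly the same position.
        if start2 < end1 or start2 == start1:
            raise ValueError("conflicting edits")
    merged = base
    # Apply from the end of the text so earlier offsets stay valid.
    for start, end, replacement in reversed(edits):
        merged = merged[:start] + replacement + merged[end:]
    return merged


base = "The quick brown fox"
alice = "The quick red fox"          # one user changes a word
bob = "The very quick brown fox"     # another user inserts a word elsewhere
print(merge(base, alice, bob))
```

Both changes survive in the merged text because they touch different parts of the base. Two users rewriting the same word would raise `ValueError` here, and that is exactly the case where a real synchronization library has to do something smarter.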
A few days after Etherpad was launched I saw this dialog:
It seems that Etherpad failed to scale. I don’t expect my clone to be popular, but it would be nice to scale better, even if it’s only in theory.
So I built the application with scalability in mind. The major hub in the application is the message broker, RabbitMQ. As long as this AMQP server scales, my application will too. After all, RabbitMQ ought to be scalable: it was built exactly for that.
To synchronize user changes between browsers with low latency, the EvServer application shouldn’t do any CPU-intensive work. Managing and comparing changes from browsers is quite fast. The tough part is generating the rendered output from the markup: rendering reStructuredText to HTML can take up to a few seconds of processor time! I decided that this job should be done by an external process that can afford greater latency. Apart from that, the message flow is pretty standard for a real-time web application.
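The split between the fast synchronization path and the slow renderer can be sketched roughly like this. Everything here is an illustrative assumption: `RenderQueue` is an in-memory stand-in for the broker queue, `slow_render` is a placeholder for the real reStructuredText conversion, and dropping stale versions is my guess at a sensible policy rather than a description of what the application actually does.

```python
import html


class RenderQueue:
    """In-memory stand-in for the queue between the editor and the renderer."""

    def __init__(self):
        self._pending = {}  # doc_id -> latest markup waiting to be rendered

    def submit(self, doc_id, markup):
        # Called on the latency-sensitive path, so it only stores the text.
        # A newer version overwrites an older one that was never rendered.
        self._pending[doc_id] = markup

    def drain(self, render):
        """Render each pending document once and return {doc_id: output}."""
        pending, self._pending = self._pending, {}
        return {doc_id: render(markup) for doc_id, markup in pending.items()}


def slow_render(markup):
    # Placeholder for the expensive reStructuredText-to-HTML conversion.
    return "<p>" + html.escape(markup) + "</p>"


queue = RenderQueue()
queue.submit("pad-1", "Hello")
queue.submit("pad-1", "Hello <world>")  # replaces the stale version
queue.submit("pad-2", "Other pad")
print(queue.drain(slow_render))
```

The point of the design is that `submit` stays cheap no matter how slow rendering is, and the renderer only ever pays for the newest version of each document.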
Burning the time
Once I had decided what I wanted to build, I was ready to start programming. I didn’t have a detailed plan, but I finished almost on time. This is roughly how the effort broke down:
1st evening: created the initial HTML view
3rd evening: created the basic message flow; the latency counter started to work
waiting at the airport and during the flight: integrated the diff-match-patch library; synchronization started to work
4th evening: fixed message routing in AMQP, integrated the persistent database
all Sunday: made it work on IE, plus final fixes
Here you can play with the final application. It’s not especially impressive, but it works. I met my goals and proved that with the proper tools it’s possible to build a simplified Etherpad clone in just a few days. I must admit that creating this app was great fun.
I’m even starting to think that the application could be useful. Maybe I should consider adding more popular markup languages like Markdown, Wiki or Trac.
The largest problem I have seen with collaborative editors is the undo function. While I would love to move my coding to a collaborative system, I can’t do it without an undo button. I know the design gets more complicated with multiple users, but it would certainly be nice.
Nice to see someone else using a live markup preview!
Collision resolution seems like a lot of wasted effort for collaborative editors. Users want to edit with other people, but just about never want to type in the exact same place at once.
We built a little editor called draftastic around live markup much like yours, but we do locking on arbitrary sections to keep users from ever colliding. (Which also has the handy side effect of making per-user change logs and undos easy.)
@Erik: Orbited is used in production somewhere, while EvServer is still an alpha project. The biggest difference is that Orbited is (or at least was, I haven’t checked it for a while) a proxy that has to sit in front of the normal server, while EvServer can be set up on a separate machine with a different subdomain. This allows you to develop Comet code smoothly, without interfering with your normal HTTP traffic.