Diff for Javascript

August 15th, 2006 tonyg

Toward the end of last week, I found myself wondering about implementing a diff algorithm in Javascript. It turns out there’s already at least one available attempt at this, by John Resig (direct link: jsdiff.js). Unfortunately, the implementation is a little buggy (try diffString("the quick", "the quick fox"), for example), so I went hunting for source material to implement my own.

I ended up finding J. W. Hunt and M. D. McIlroy, “An algorithm for differential file comparison”, Bell Telephone Laboratories CSTR #41 (1976) (postscript), which turned out to be quite implementable.

Here’s my implementation: hunt-mcilroy.js. With luck, I’ve even implemented something close to the described algorithm.

As an example, the output from

diff("the quick brown fox jumped over".split(/\s+/),
     "the quick fox jumps over".split(/\s+/))

is

[{common:["the","quick"]},
   {file1:["brown"],
    file2:[]},
 {common:["fox"]},
   {file1:["jumped"],
    file2:["jumps"]},
 {common:["over"]}]

The next step is to implement a diff3 equivalent, and then I’ve the tools to implement rudimentary version-control. It’d sure be a fine thing to see TiddlyWiki evolve darcs-like distributed change-propagation…

Entry Filed under: Technology

6 Comments Add your own

  • 1. mikeb  |  August 27th, 2006 at 8:48 am

    Also see http://neil.fraser.name/software/diff_match_patch/

  • 2. frank  |  May 22nd, 2007 at 4:48 am

    How is your diff3 going on ?
    If it finish , may I see it’s source code by JS .

    Thanks alot

  • 3. tonyg  |  May 22nd, 2007 at 1:15 pm

    Hi Frank, unfortunately I’ve not yet returned to this little project. I’ll be sure to post when I do, though.

  • 4. frank  |  May 22nd, 2007 at 5:01 pm

    OK , perhaps i’ll try to do a new one by myself . thanx

  • 5. Peter Pallos  |  February 19th, 2008 at 11:14 pm

    The code fails with the two sentences that I found on a different site:

    str1 = “The red brown fox jumped over the rolling log.”;
    str2 = “The brown spotted fox leaped over the rolling log”;
    alert(uneval(diff(str1, str2)));

    For example, a part of the result array is:
    {common:[”o”, “v”]}, {file1:[”e”], file2:[”e”]}, {common:[”r”]}

  • 6. tonyg  |  February 20th, 2008 at 8:23 pm

    You’re right, Peter - computing a diff character-by-character seems to give spurious diffs! How odd. Have you dug into the code to find out why that might be?

    Splitting the string into words first gives a sensible answer:

    js> uneval(diff(str1.split(/\s+/), str2.split(/\s+/)));

    [{common:[”The”]},
    {file1:[”red”], file2:[]},
    {common:[”brown”]},
    {file1:[], file2:[”spotted”]},
    {common:[”fox”]},
    {file1:[”jumped”], file2:[”leaped”]},
    {common:[”over”, “the”, “rolling”]},
    {file1:[”log.”], file2:[”log”]}]

Leave a Comment

Required

Required, hidden

Some HTML allowed:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>

Trackback this post  |  Subscribe to the comments via RSS Feed

Calendar

August 2006
M T W T F S S
« Jul   Sep »
 123456
78910111213
14151617181920
21222324252627
28293031  

Most Recent Posts