Posts filed under 'Tools'

Linux VServer: Cheap and Easy Virtualisation

Whilst projects like Xen and new hardware extensions to CPUs from Intel and AMD allow multiple OSes to run on the same machine at the same time, for me, there are currently few cases where I need this. I work under Linux and all I need is virtualisation to run multiple Linuxes at the same time. Also, virtualisation at the level of Xen requires that you set harddisc space and RAM for each running OS instance: the instances don’t share resources very well.

Linux VServer is virtualisation at a different level: there is only ever one Linux kernel running, but a chroot-on-steroids-like system lets you start up multiple instances of Linux that do not interfere with each other in any way. And because only one kernel is running, the instances share resources such as RAM and harddisc partitions much more effectively. Once you have some vservers up and running, they can be cloned, moved between machines, started and stopped, and generally manipulated very easily. You can do private networking between your vservers, and you can even get X up and running inside a vserver.

Once this is all up and running, it makes migration between different services very much easier. For example, last week I upgraded our bugzilla installation. In the past I’ve tended to upgrade our main installation in place, which has been a bad idea in several cases. So this time, I copied our current installation onto a clean vserver and checked it worked. I then cloned that vserver and performed the upgrade on the clone, fixing everything that broke. This meant that at all points I had a working copy of the original installation to refer to, and that I could make sure the upgraded version was at least at the same level of functionality as the old version before rolling it out on top of our main installation. The result was that I knew all the “gotchas” of the upgrade in advance, and consequently the upgrade of the main installation went very smoothly. Almost as important, now that the upgrade is complete I can quite happily delete the bugzilla vservers as they’re no longer needed. Because of the total separation of the vservers, this is very easy (much easier than trying to uninstall packages and delete databases), and it means that if you use vservers, your main working environment is never polluted by the software you are working on.

It rather looks like I’ll be putting vserver on every machine I install from now on…

Add comment July 17th, 2006 matthew

Estimating the number of blog subscriptions

Unlike traditional website visitors, most readers of a blog use a news aggregator to periodically pull new items from the blog’s syndication feed. As a result, the correlation between the number of requests and the number of times an item is read is broken. To confuse things even more, many readers use a public aggregator service which saves the feed to a central repository and serves the saved entries to many readers. For such services, growth in the number of subscribers is not represented by an increase in the number of requests made.

To get a rough estimate of the number of subscribers to a feed we need to distinguish between requests made by public services on behalf of more than one user and requests made by individual news aggregators.

If you too are curious about the number of subscribers to your blog (and have access to the HTTP access log of the server hosting it) you can give my little script, Blogalizer, a try.
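This is not the Blogalizer script itself, just a minimal sketch of the idea described above. It assumes the Apache “combined” log format, a hypothetical feed path of /feed/, and that public aggregators report their subscriber count in the User-Agent string in the form “N subscribers” (as Bloglines does, for example). Individual aggregators are approximated by counting distinct (IP address, User-Agent) pairs, which will be thrown off by proxies and dynamic addresses.

```python
import re

# Apache "combined" format: ip ident user [time] "request" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Assumed convention: a public aggregator advertises e.g. "12 subscribers" in its User-Agent.
SUBSCRIBERS = re.compile(r"(\d+) subscribers?", re.IGNORECASE)


def estimate_subscribers(lines, feed_path="/feed/"):
    """Rough subscriber estimate from access-log lines (feed_path is a hypothetical default)."""
    public = {}         # service name -> largest subscriber count it reported
    individual = set()  # distinct (ip, user-agent) pairs of individual aggregators
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("path") != feed_path:
            continue  # unparsable line, or not a request for the feed
        agent = match.group("agent")
        reported = SUBSCRIBERS.search(agent)
        if reported:
            service = agent.split("/", 1)[0]
            public[service] = max(public.get(service, 0), int(reported.group(1)))
        else:
            individual.add((match.group("ip"), agent))
    return sum(public.values()) + len(individual)
```

Calling estimate_subscribers(open("access.log")) on a real log gives a single number; taking the largest count each service reported avoids adding up the same subscribers on every poll.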

Continue Reading 3 comments July 4th, 2006 Tom Berger

Static analysis of Erlang communication

I had a brief email exchange with the developers of Dialyzer, the static analyzer (some might call it a type checker) for Erlang programs. Currently Dialyzer only performs analysis on the functional fragment of Erlang and I was enquiring whether there were plans to extend it to handle communication. That would allow the detection of basic input/output mismatches, e.g. when a message is sent to a process that does not match any of the patterns it is willing to receive.

Going further, one might be able to employ the various techniques developed for process algebras to reason about the concurrent behaviour of Erlang programs and, for example, detect deadlocks and enforce information flow security properties. A good example of such a tool is TyPiCal. It would be amazing to have something like that for Erlang. After all, what makes Erlang interesting is not the functional programming aspect currently checked by Dialyzer, but its support for concurrency, distribution and fault-tolerance. It is incredibly difficult to correctly implement systems that involve the latter. If there is any area of programming in which we want the help of static analysis then this is it!

Anyway, it turns out that there are no immediate plans to extend Dialyzer in that direction. However, I was pointed at some related research that I had hitherto been unaware of: Karol Ostrovský’s PhD thesis, which in Part II describes the

  • sound instantiation of Kobayashi’s generic type system for the pi-calculus to session types,

  • extension of session types to multi-session types (which, afaict, handle sessions that involve asynchronous comms, and servers that handle multiple sessions without spawning),

  • application of multi-session types to type check communication of Erlang processes.

Overall this looks like a promising attempt at constructing a process-algebra-based type system that is decidable and yet expressive enough to reason about non-trivial real-world protocols (IMAP4 is used as an example). The theory behind it seems to be quite involved, but that could just be due to the presentation format - a thesis rather than a paper. It will be interesting to see whether this research is carried any further and eventually materialises in tools for Erlang.

Add comment June 26th, 2006 matthias

Overview of Javascript modes for Emacs

Emacsen.org has a nice roundup of the (apparently only) four javascript-mode implementations for Emacs. I went for number three, Karl Landström’s javascript.el, and it’s been working very well.

1 comment June 24th, 2006 tonyg

S5: A Simple Standards-Based Slide Show System

Whilst discussing presentation software the other day with a colleague, he kindly pointed me at S5. It is simple, it is standards-based, and it is a slide show system. In short, it does exactly what it says on the tin:

S5 is a slide show format based entirely on XHTML, CSS, and JavaScript. With one file, you can run a complete slide show and have a printer-friendly version as well. The markup used for the slides is very simple, highly semantic, and completely accessible. Anyone with even a smidgen of familiarity with HTML or XHTML can look at the markup and figure out how to adapt it to their particular needs. Anyone familiar with CSS can create their own slide show theme. It’s totally simple, and it’s totally standards-driven.

Add comment June 21st, 2006 lee

Selenium

I have been looking for something to help me do some automated functional tests on an existing web application, and came across Selenium. It is a framework that executes tests from within a browser using a combination of Javascript and DHTML. It supports IE, Firefox and Mozilla, thus allowing browser compatibility testing.

There are three Selenium components:

  • Selenium Core

  • Selenium RC

  • Selenium IDE

The last of these provides an easy way in and so I looked at it first.

Selenium IDE

I decided to use a source build (2006-05-19) as the project looked active and I like to tinker. The build was very straightforward and once installed there was a “Selenium IDE” entry in the tools menu. Clicking this brings up the IDE window:

Blank IDE screenshot

I tried it out on the BBC News site as it is publicly available and has good quality HTML. The “record” button in the IDE is depressed at startup so I just entered http://news.bbc.co.uk/ into the location bar in my browser and the site appeared. Nothing happened in the IDE window though - it waits for the first action within the site itself before updating. For the first test suite, I limited myself to simple link checking. Clicking on “World” in the BBC site navigation brings up the page as expected, but also updates the IDE window:

IDE after first click screenshot

The base URL has been filled in. All test commands apply to relative paths meaning that tests can be run on development, stage and production servers just by changing this field. The actions have also been populated:

  • open; command to open a URL. This can be relative or absolute but as mentioned above, relative URLs will allow your tests to be used on different hosts.

  • assertTitle; after each page load, Selenium IDE will automatically insert an assertion to check the title. This can be disabled in the “Options” dialog.

  • clickAndWait; clicks an item in the page. The “AndWait” suffix is available on several of the commands and forces the IDE to wait for a page load before continuing. Depending on how a link is constructed, for example, where Javascript is used to load the next page, the IDE may not recognize that it has to wait and may just record a click command. In these cases you must manually change the command otherwise the subsequent tests may be applied to the wrong page.

The target for the click is shown as link=World which means Selenium looks for a hyperlink with the visible text of “World”. The Selenium Core reference has more information on how elements can be located.

To build up the test I continued to go through and click the other navigation links, i.e., UK, England, Northern Ireland, etc. I then clicked the “Play” button and Selenium directed the browser through the path just taken, checking the specified assertions. Commands that succeed are coloured green in the IDE and those that fail turn red:

IDE after test execution screenshot

Selenium supports several formats for test scripts but the IDE supports just two: HTML (also known as Selenese) and Ruby. HTML is the default, and the easiest for non-programmers to read. It is visible in the “Source” tab:

IDE source tab screenshot

I saved this test to disk and created a new blank test.

Selenium also supports forms and so in the search box at the top of the BBC News website, I entered theo and clicked “Search”. The IDE shows these actions including the command type. The target is listed as just q which is the name of the input element where the typing should go:

IDE after search screenshot

One thing to note is that the BBC uses a separate host for searching (search.bbc.co.uk). The tests will run in the IDE because it is a Firefox extension, but they will not work in Selenium Core or RC because, for security reasons, Javascript is restricted to making calls to the host and port of the page’s origin only (see “The Same Origin Policy”).
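To make the restriction concrete, here is a minimal Python sketch of the comparison a browser effectively performs. It is an illustration rather than anything taken from Selenium: the helper names are made up, the /world path is hypothetical, and it also compares the protocol, which the policy takes into account alongside host and port.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, host, port) triple that identifies a page's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    """True if script loaded from page a would be allowed to call b."""
    return origin(a) == origin(b)
```

With this, same_origin("http://news.bbc.co.uk/", "http://news.bbc.co.uk/world") holds, whereas comparing the news host with http://search.bbc.co.uk/ does not, which is why the search test fails outside the IDE.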

The football player, Theo Walcott, has been in the news a lot so I want to assert that his name comes back in the search results. I highlight the text “Walcott”, right click on the selection and then click “assertTextPresent Walcott”:

IDE context menu screenshot

This creates an entry in the IDE that checks that the text “Walcott” is present somewhere in the document:

IDE after asserting text present screenshot

It is possible to do a more precise check for text. Looking at the web page source code, the search summary “Page 1 of 145 pages for theo” is contained within a paragraph. In the IDE, I click the empty row below the last command and then, in the editing area below the test list, select “assertText” in the Command column. In the Target column, I enter the XPath that identifies that paragraph element: //p[@class='bodymainResults allmainr borders']. Clicking on “Find” in the IDE will flash the selected node (this requires the DOM Inspector to be installed):

IDE highlight element screenshot

Because the actual number of pages will change, I use a wildcard to match this part of the text and enter Page 1 of * pages for theo in the “Value” column:

IDE after wildcard text check
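The effect of the wildcard can be tried outside the browser. The sketch below uses Python’s own fnmatch globbing, which only approximates Selenium’s pattern matching (fnmatch also treats square brackets as character classes); the function name is made up for illustration.

```python
from fnmatch import fnmatchcase

PATTERN = "Page 1 of * pages for theo"


def text_matches(pattern, text):
    """Case-sensitive glob match of the whole text against the pattern."""
    return fnmatchcase(text, pattern)
```

Here text_matches(PATTERN, "Page 1 of 145 pages for theo") succeeds whatever the page count is, while a summary for page 2 or for a different search term does not match.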

Selenium Core and RC

The Selenium Core component contains the main test runner. A web page within the test runner directory provides the necessary controls to run the tests.

Because of the same origin restriction and the fact that I could not deploy Selenium Core onto the BBC web site, I used Selenium RC to automate the test running. As well as a server that can launch browsers and initiate the tests, it contains a proxy server that makes it look like the Selenium Core test runner is hosted on the target web site.

Both of these components can be driven using various languages, e.g., Java, .NET, Python, Ruby. They can also use the HTML style tests that were saved from the IDE and this is the most interesting method of test execution for me at the moment as it allows non-programmers, i.e., not me, to build up test suites.

Selenium RC works with test suites rather than the test files themselves. A test suite file is an HTML document with a table referring to the test files:

<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
    <title>Test Suite</title>
  </head>
  <body>    
    <table id="suiteTable" cellpadding="1" cellspacing="1" border="1">
      <tbody>
        <tr><td><b>Test Suite</b></td></tr>
        <tr><td><a href="./check-nav-links.html">Check nav links</a></td></tr>
      </tbody>
    </table>   
  </body>
</html>
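Once there are more than a handful of test files, a suite file like the one above can be generated rather than edited by hand. This is a small Python sketch under that assumption; the function name and the second test file (check-search.html) are hypothetical, and the markup simply mirrors the example above.

```python
from html import escape


def build_suite(tests, title="Test Suite"):
    """Return suite HTML for a list of (href, label) pairs, one table row per test file."""
    rows = "\n".join(
        f'        <tr><td><a href="{escape(href, quote=True)}">{escape(label)}</a></td></tr>'
        for href, label in tests
    )
    return f"""<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">
    <title>{escape(title)}</title>
  </head>
  <body>
    <table id="suiteTable" cellpadding="1" cellspacing="1" border="1">
      <tbody>
        <tr><td><b>{escape(title)}</b></td></tr>
{rows}
      </tbody>
    </table>
  </body>
</html>
"""


suite = build_suite([
    ("./check-nav-links.html", "Check nav links"),
    ("./check-search.html", "Check search"),  # hypothetical second test
])
```

Writing the returned string to test-suite.html gives a file in the same shape as the hand-written one.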

I put the test suite and test files into a directory along with “server/selenium-server.jar” from the Selenium RC distribution. I launched the server and tests via the following command line (Windows):

java -jar selenium-server.jar -htmlSuite "*firefox" "http://news.bbc.co.uk" "C:\selenium-rc\test-suite.html" "C:\selenium-rc\test-output.html"

This launches the proxy server, forwarding non-Selenium requests onto “http://news.bbc.co.uk”. It then launches the designated browser, in this case, Firefox. Internet Explorer can be launched by substituting "*iexplore" for "*firefox". With release 0.7.1 I found that occasionally IE would try to run the tests before the proxy server was fully functional and would therefore trigger a 404 error (release 0.8 is already out, so this may be fixed). The error occurs because, without the proxy, the Selenium-specific URL does not exist, e.g.:

http://news.bbc.co.uk/selenium-server/TestRunner.html?auto=true&test=http://news.bbc.co.uk/selenium-server/test-suite.html

Other browsers can also be launched and the documentation has details on how to do this. When using Firefox or IE, a new profile is created in the current working directory.
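For browser compatibility runs it can be handy to script the command line above. The following Python sketch only assembles the same arguments once per browser; the function names and the per-browser report file naming are assumptions of mine, not part of Selenium RC.

```python
import subprocess


def html_suite_command(browser, base_url, suite, report, jar="selenium-server.jar"):
    """Build the argument list for one -htmlSuite run, mirroring the command line above."""
    return ["java", "-jar", jar, "-htmlSuite", browser, base_url, suite, report]


def commands_for_browsers(browsers, base_url, suite, report_template):
    """One command per browser; report_template contains a {browser} placeholder (hypothetical convention)."""
    return [
        html_suite_command(
            browser, base_url, suite,
            report_template.format(browser=browser.lstrip("*")),
        )
        for browser in browsers
    ]


def run_all(commands):
    """Run each command in turn, waiting for one browser to finish before starting the next."""
    for command in commands:
        subprocess.run(command)
```

Passing ["*firefox", "*iexplore"] together with the suite path and a template such as C:\selenium-rc\test-output-{browser}.html to commands_for_browsers, and the result to run_all, would run the suite in each browser in turn and keep one report per browser.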

Once the browser was launched, it ran the core test runner via the proxy:

Selenium test runner screenshot

An HTML copy of the report was written to disk when it finished.

It is also possible to run the server in interactive mode. The documentation gives some good examples of this.

Conclusion

Selenium provides a quick and powerful way to write functional tests for web applications, and an environment where they can be run across different hosts. I am sufficiently impressed that I am looking for projects which could benefit from this approach.

Ideally, I would like the tests to integrate with Ant, NAnt or Maven. Because Selenium RC can be driven via Java, this should not be a problem; use a JUnit TestDecorator to start and stop the proxy server and JUnit setup and tear down methods to control the browser instances. Reading the tests from HTML format is a little trickier but a new major version has just been released and contains a tool to convert Selenese HTML tests to Java.

No doubt there are other goodies in the new release too which I shall report on in due course.

10 comments June 8th, 2006 lee

link checker

I was looking around for an easy-to-use, no-fuss command line tool to check the links on a web site. First I tried wget:

wget -o wget.log -nv -r -p <site>

The resulting wget.log contains all the links that were followed. It’s easy to spot the errors but there is no obvious way to get hold of the referrer.

Next was linkchecker:

linkchecker -t3 --no-warnings -Fblacklist/blacklist.out http://<site> > linkchecker.log

This produces a list of broken links in blacklist.out. There is no referrer information in that, but one can get hold of it by cross-referencing the full log in linkchecker.log. That is not entirely trivial though; it’s certainly beyond grep. More significantly, linkchecker seems to run forever, checking the same links over and over again - I gave up after it had spent 1 hour and checked 100,000 links on a site that contains no more than a few hundred actual links.

Finally, I tried linklint:

linklint -error -warn -xref -forward -out linklint.out -net -http -host <site> /@

This completed in a few minutes and produced a nice report in linklint.out. The report contains a summary of the kinds of links, files and errors found, a per-referrer break-down of all broken links, and a list of all moved URLs referenced by the site. This is pretty much exactly what I was after!
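To show what a per-referrer breakdown amounts to, here is a minimal Python sketch. It says nothing about how linklint works internally: the names are made up, the pages are passed in as already-fetched HTML, and the is_broken check is supplied by the caller.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def broken_links_by_referrer(pages, is_broken):
    """Map each referring page URL to the broken absolute URLs it links to.

    pages: dict of page URL -> HTML source.
    is_broken: caller-supplied function taking an absolute URL and returning True or False.
    """
    report = defaultdict(list)
    for referrer, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(referrer, href)  # resolve relative links against the referrer
            if is_broken(target):
                report[referrer].append(target)
    return dict(report)
```

Keeping the referrer as the key is what makes the report actionable: it tells you which page to edit, not just which URL is dead.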

All three tools are available as Debian packages. linklint development seems to have stopped a few years ago, yet it was the best of the bunch for what I was trying to achieve. YMMV.

1 comment June 8th, 2006 matthias

Colour Terminals in (X)Emacs

Some Unix command line tools display text in colour, if you run them in the right kind of terminal, of which the Emacs shell isn’t one. So far this had not really bothered me since in most cases the colours do not convey all that much information. However, recently I was playing with Maude, which colours terms for debugging purposes. There is no easy alternative for getting hold of the same information.

A few minutes of googling and experimenting produced the magic Emacs command

(require 'ansi-color)

followed by customisation of the ansi-color-for-comint-mode variable (setting it to t) or calling (ansi-color-for-comint-mode-on). This applies across all comint sessions, not just shell sessions, so it also affects things like the interactive Maude mode.
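A quick way to check that the setting has taken effect is to print a few ANSI colour escape sequences from within the shell buffer. The small Python script below is just such a test harness (the function and table names are my own); the numeric codes are the standard ANSI foreground colours.

```python
RESET = "\033[0m"
# Standard ANSI foreground colour codes.
COLOURS = {"red": 31, "green": 32, "yellow": 33, "blue": 34}


def colourise(text, colour):
    """Wrap text in the escape sequences that switch the colour on and back off."""
    return f"\033[{COLOURS[colour]}m{text}{RESET}"


if __name__ == "__main__":
    # Each colour name should appear in its own colour if the buffer interprets ANSI codes.
    for name in COLOURS:
        print(colourise(name, name))
```

If the escape sequences show up as literal characters in the buffer, the colour filter is not active for that comint session.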

Under Emacs one can go further with M-x ansi-term, which creates a fully-fledged ANSI terminal inside an Emacs buffer. Finally we can run Emacs inside Emacs inside Emacs …! Ansi-term is not available for XEmacs, probably because it is some ghastly hack.

2 comments May 10th, 2006 matthias

Darcszilla - Darcs hooks on patch application

We’ve been using cvszilla here at LShift, and one feature we like that we want in Darcs is the ability for the version-control system to append notes (containing the log message) to a Bugzilla bug record every time a commit is accepted by our central repository.

I asked on the darcs-users mailing list, and the response I got indicated that the required functionality isn’t yet implemented. I’ve just pulled the unstable Darcs source tree. If I make any progress, you’ll hear it here first…

5 comments May 9th, 2006 tonyg

Java memory profiling with jmap and jhat

My colleagues and I have just spent over a week tracking down a repeated OutOfMemoryError in a fairly complex web application. In the process we looked at the jmap and jhat memory profiling tools for the JVM.

Continue Reading 6 comments March 8th, 2006 matthias
