technology from back to front

Archive for the ‘Programming’ Category

Optimising compilers as adversaries

Suppose that you want to handle some secret data in C and, in the wake of some high-profile vulnerability or other, want to take precautions against your secret being leaked. Perhaps you’d write something along these lines:

#include <string .h>

typedef struct {
  char password[16];
} secret_t;

void get_secret(secret_t* secret);
void use_secret(secret_t* secret);

void wipe_secret(secret_t* secret) {
  memset(secret, 0, sizeof(secret_t));
}

int main() {
  secret_t secret;
  get_secret(&secret);
  use_secret(&secret);
  wipe_secret(&secret);
  return 0;
}

I think you could be forgiven for assuming that this does what it says. However, if you have what John Regehr calls ‘a proper sense of paranoia’, you might actually check. Here’s an excerpt of what I got when I ran clang -S -O2 -emit-llvm on this example:

define i32 @main() #0 {
  %secret = alloca %struct.secret_t, align 1
  call void @get_secret(%struct.secret_t* %secret) #4
  call void @use_secret(%struct.secret_t* %secret) #4
  ret i32 0
}

As if by magic, wipe_secret has completely disappeared.

Read more…

by
ash
on
30/06/14

Super-simple JavaScript inheritance

JavaScript uses prototype-based inheritance, which can prove a bit of a puzzler for those of us used to class-based object orientation. At first glance it seems like it’s basically the same and as if it can be used in very nearly the same way. If you pretend that those prototype objects are in fact classes and ignore the nuances you can get surprisingly far and code up a decent-sized heap of working code. However you will eventually be bitten and have to read up on what’s really happening at the nuts and bolts level. This is guaranteed to happen just when you’re under pressure, trying to get a critical feature working. It’s a horrible feeling, the dawning realisation that the subtle bug you can’t grok is because things don’t work the way you thought at a very basic level and that your heap of code is founded on quicksand. This happened to … a friend of mine.

This post isn’t going to attempt to explain the depths and subtleties of JavaScript’s prototype model. Plenty of others have been there before. In fact we will embrace our class-based stubbornness and attempt to get it working the way we really wanted. Plenty of others have done this too, but there are a few issues with most of their solutions:

  • They are too simplistic and don’t cover all the cases required, like having an arbitrarily deep hierarchy that can call up the constructor chain neatly
  • They are too complicated, having developed into powerful libraries with many features
  • The perennial problem: I didn’t write them, so am not in control and able to understand exactly what’s going on and adapt to exactly my needs – no more, no less.*

I present the result below, wrapped up for require.js. There is really very little code indeed – just two functions: inherit and superConstructor.

// Because class is a keyword in JS, consumers should inject this as clazz.
define(function() {

  return {
    // In the unlikely event that you need to explicitly call a superclass implementation
    // of a method, because a method with the same name exists in the current class:
    //  foo.parent.bar.call(this, x, y);
    inherit: function(child, parent) {
      child.prototype = Object.create(parent.prototype);
      child.prototype.constructor = child;
      child.prototype.parent = parent.prototype;
    },

    // The superclass constructor should generally be called from child's constructor
    // otherwise it won't run and fields defined there will be missing:
    //   superConstructor(this);
    superConstructor: function(self) {
      // The constructor that we call here may in turn wish to call superConstructor() to
      // call its own parent's constructor (but with the same 'self') so we must take
      // special measures to allow this, as self will be the same object with each recursion.
      var constructor = (self.nextParent) ? self.nextParent.constructor : self.parent.constructor;
      self.nextParent = self.parent.parent;
      constructor.call(self);
      self.nextParent = undefined;
    }
  }

});

The contents of inherit are much as you’ll find in many a blog post, though there’s a surprising amount of subtle variation out there!

More interesting is superConstructor, which employes a somewhat offensive tactic to allow calls all the way up the constructor chain. What makes this difficult is that ‘this’ must remain the actual object being constructed throughout those nested calls, so we need to manually provide the context to know what the next constructor up the chain is.

Having done this and saved the code above into clazz.js, we can write code with inheritance as follows (working example as a jsfiddle).

// A Dog can bark.
function Dog() {
    console.log('Constructed a dog');
}
Dog.prototype.bark = function() { return 'Woof' };

// A Yorkie is a Dog that barks a lot!
clazz.inherit(Yorkie, Dog);
function Yorkie() {
    var self = this;
    clazz.superConstructor(this);
}
Yorkie.prototype.bark = function() {
    var noise = this.parent.bark.call(this);
    return noise + noise + noise;
};

// Create dogs and see what noises they make.
console.log(new Dog().bark());
console.log(new Yorkie().bark());

To be fair, my super-simple inheritance library is extremely restricted in its abilities, for instance not handling constructor parameters. But that’s because I didn’t need them, and any extra features should be easy to add. Most of all it was a valuable learning experience.

* Actually I love an off-the-shelf library as much as the next chap (or chappess) – but if you don’t feel comfortable with the libraries on offer and the problem seems nicely tractable and a worthwhile learning experience then why not go for it. You can always change your mind.

by
Sam Carr
on
16/06/14

CSS Transitions can’t animate display change

I’d like to demonstrate a fairly simple CSS issue that caught me out, and the straightforward solution. Put simply CSS Transitions do not work if there is a change in the display property as part of the same change that fires the transition, but you can workaround this by separating out the display change.

If you’re not already aware, CSS Transitions are a cute way of animating transitions on your web page. Simply add a transition property in your CSS stating which property of the element should be animated when it changes, and over what period of time.

.animatedWidth {
    transition: width 2s;
}

In the example above, whenever the width of the element is changed (e.g. programmatically from JavaScript) it will animate that change over 2 seconds, complete with ease-in and ease-out by default.

I’ve created a jsfiddle with a more convoluted example that demonstrates the display problem, so you can inspect the HTML, CSS and JS, and run it in the browser. The example has three coloured bars (though the second two start off invisible) and an Animate button. Click the button and you’ll see that the ordinary transition animates the width of the bar as expected, but where the coloured bar is being made visible at the same time it just winks into existence in its end state with no animation. The third bar appears and then animates correctly, because our JS separately shows it then triggers the animation. It uses a timeout with zero delay to achieve this, effectively giving the rendering engine its chance to handle the display change before then triggering the animation.

button.on('click', function() {
    // To get the animation working we need to change the
    // display property first (via jQuery toggle()) and then
    // trigger the CSS transition with a zero-delay timeout.
    bar3.toggle();
    window.setTimeout(function() {
        bar3.toggleClass('animate');
    }, 0);
});

In my real world situation where I first stumbled across this effect, the item being animated started offscreen (and invisible) and slid into place, with the problem only evident on Chrome for some still unknown reason. The change of display property was but one of many things going on via incidental CSS so it took some sleuthing to figure out that it was responsible for the problem. Coming at it from that baffling angle for the first time, the problem and its solution were not nearly so obvious as presented above!

by
Sam Carr
on
27/05/14

A simple Knockout page router

Knockout.js is a pleasantly simple approach to data-binding ViewModels into your HTML. Like many JavaScript libraries it sticks to a core mission with a few simple concepts, which makes it quite approachable. Its simple template support means that you don’t need to write much code to get a top-level page router going in your single page app (SPA) and that’s exactly what I have done.

Knockout-routing

It uses hash-based routing, so URLs must be of the form http://foo.com/index.html#myPage. This approach means that even a statically hosted site with just the one real URL (index.html in this example) and zero server-side dynamicism can be a SPA with multiple virtual pages. All requests will ultimately come to index.html and then the router takes over and shows the right actual page based on the hash in the URL. Back and forward buttons work, as does page refresh, bookmarking, emailing links etc.

The code is on GitHub, with a decent README explaining the features and the key files to look at, so I won’t repeat that here. The code is also well-commented, with the intention that you can (and should) read it to see how it works. You can clone it, then simply double click src/index.html to open it in your browser and see its capabilities demonstrated. Nice and easy.

The router itself is just a 61 line JavaScript file, which would be very easy to extend with further features that you might need. The rest of the code on GitHub shows how to use it by example, and demonstrates all of its features.

Any feedback is very much appreciated. I imagine there are other similar routers out there, but this one is mine and making it (and using it in anger) taught me a lot and provided a nice, tight result which I can easily add to as required.

by
Sam Carr
on
30/04/14

Getting back into front-end web development

I’ve been working on a small SPA (Single Page Application) – just HTML, CSS and JavaScript statically served and doing its thing entirely in the browser. I learned a great deal throughout the project, but here are some of the things that strike me as most valuable.

Get a good workflow going

I used Grunt to setup a nice build system that is mostly a joy during development. It took a while to evolve my Gruntfile, but now when I edit a file, the results are immediately refreshed in the browser (I don’t even have to hit cmd-R). I can deploy to S3 test, staging and live sites with a single command that takes about 3 seconds. My SASS files are compiled down to minified CSS, my JS is minified with source maps etc.

The hardest part of using Grunt is figuring out how to configure it and its many contrib plugins. I could have done with a definitive reference or perhaps I could have used Yeoman to give me an out of the box solution. However I recognised that I was always going to have to figure out the guts of Grunt so I think I really was better off bespoking it from the start. I’m glad I did as now I have a tight setup that does precisely what I want and that I understand completely.

Now it seems there is a new kid on the scene, Gulp – nicely introduced in this tutorial blog post. I will definitely be looking closely at that for my next project, with the piping approach looking like the key step beyond Grunt, along with nicer syntax. I’d also look at Browserify, for a nicer way to piece together the JS bits.

Learn JavaScript properly

To the uninitiated, JavaScript is fairly surprising in many subtle ways, and though I can grok the prototype-based inheritance fairly easily, the scoping rules caught me out repeatedly. This was especially the case as I tried to create JQuery plugins with private methods and state. Eventually a simple old article by grand-daddy of JavaScript writing Douglas Crockford gave me the vital clues I was missing.

Really I should just read his book, and I would recommend that anyone else doesn’t just attempt to learn JavaScript as they go, but takes some time to pro-actively figure out the core concepts – it will pay off in short order.

jQuery is non-negotiable

And the award for most indispensable library goes to: jQuery. Seriously, it should be baked into the browsers or the ECMAScript standard. The nicest thing about it is I can pretty much just guess at the API and be right most of the time, though reading the docs usually reveals new conveniences that go further than I even imagined.

Browser quirks can be a living nightmare

JavaScript itself is fairly reliable, especially with judicious use of mature libraries like jQuery that paper over the cross-browser DOM cracks. CSS in complicated scenarios is where it all seems to go wrong however.

It’s amazing how broken/different some browsers are. Here are just a few highlights, though every day brought tens of new oddities and associated workarounds.

  • Mobile Safari on iOS 7 reports the viewport height inconsistently (depending on how you access it) leading to bad layout and horrible JavaScript workarounds.
  • Use of -webkit-overflow-scrolling:touch causes the hardware accelerated renderer to kick in, resulting in various flickers, flashes and flitches with content not rendering.
  • IE 10 on Windows 8 shows back/forward overlays at the left/right of the screen when your mouse moves near them, obscuring links in those locations.
  • Chrome running on Retina Macs suffers from strange graphical glitches when running CSS Animations, but is fine with CSS Transitions. However other browsers/platforms really need CSS Animations to get smooth, hardware accelerated movement. In my case it was necessary to implement both approaches and select using browser detection.
by
Sam Carr
on
06/03/14

A study in Scala

Cards on the table: I like Scala. Right now it’s my go-to general purpose programming language, but I know that many people have a dim opinion of it, or appreciate its positives but are heavily conflicted by its negatives. Actually I do put myself in that latter camp, though I think I’ve got deep enough into learning the language that I can get past most of those negatives. Here I present a classic tale of “life with Scala”.

The problem

Scala generally has collection methods for most things you can think of, so I was pleased but not surprised to find inits and tails methods, that list all of the prefixes and suffixes respectively. These methods return an iterator, hence the use of toList below, to actually see the results. For the rest of this post, we’ll just consider inits, though tails is trivially similar.

scala> List(1,2,3).inits
res1: Iterator[List[Int]] = non-empty iterator

scala> List(1,2,3).inits.toList
res2: List[List[Int]] = List(List(1, 2, 3), List(1, 2), List(1), List())

scala> List(1,2,3).tails.toList
res3: List[List[Int]] = List(List(1, 2, 3), List(2, 3), List(3), List())

The results include the empty collection, but as it happens that doesn’t suit my needs. I want a version that doesn’t return the empty list.

What to do?

Options include:

  • Call the existing inits method then strip off the last item. However since we get an iterator this isn’t quite so trivial – we can’t just do inits.init
  • Write a new custom method to do just what I want and return a collection, not an iterator.

I decided, in the spirit of Scala, that I would add a nonEmptyInits method to ‘all collections’ (and I’m deliberately vague here) via an implicit conversion, so that it can be called just as easily as inits. It would call inits but wrap the iterator to exclude the final element. Easy peasy, but the rabbit hole loomed ahead of me.

Because I was over-engineering and trying to be purist (this was hobby code) I wanted to implement my new method for the most generic type reasonable. Figuring out what type that was challenged me rather. The Scala collections API has a fairly complicated hierarchy. This is somewhat diagrammed and explained in the documentation but that doesn’t show all the interim traits and classes that are in the guts of the implementation such as TraversableLike, GenTraversable, GenTraversableLike and 10+ others relating just to Traversable. The API is setup to support some very clever facilities, like generic collection methods typically returning collections of the same type you started with. Apparently most libraries can’t/don’t do this. However it made it hard for me to find the right place to start. Ultimately I settled on Traversable, since inits itself is defined in TraversableLike, but adding to that directly seemed like more trouble than it was worth.

implicit class TraversableExtras[T](t: Traversable[T]) {
  def nonEmptyTails: Iterator[Traversable[T]] = {
    val bufferedInits = t.inits.buffered
    new Iterator[Traversable[T]] {
      def hasNext() = bufferedInits.head.nonEmpty
      def next() = bufferedInits.next()
    }
  }
}

My solution above wraps the iterator with a buffered version, so that I can peek at head and determine whether to end early or not. This is fairly nice and simple and I’m quite happy with it. It means we can do the following, via the magic of implicit conversions.

scala> List(1,2,3,4).nonEmptyTails.toList
res7: List[Traversable[Int]] = List(List(1, 2, 3, 4), List(1, 2, 3), List(1, 2), List(1))

But Arrays mess it all up

This is all looking very promising and straightforward, but it took a fair bit of banging my head against the desk to figure out why the compiler wouldn’t let me call it on an Array.

scala> Array(1,2,3).nonEmptyTails
:9: error: value nonEmptyTails is not a member of Array[Int]

To cut a long story short, though we can seamlessly treat Array as a Traversable, Array is actually somewhat special. It maps to Java arrays for implementation reasons, but supporting generics, and with an implicit conversion to WrappedArray, which is a Seq (and hence a Traversable) to provide all the collection methods we’re used to. So far so good, but it turns out that the compiler cannot apply more than one implicit conversion at a time, as the search space would explode. Hence it cannot convert from Array to WrappedArray to my TraversableExtras. A solution in such a case, where we want two conversions, is to explicitly cast the Array, to make the first conversion explicit, then the compiler does the rest.

scala> (Array(1,2,3): Traversable[Int]).nonEmptyTails
res10: Iterator[Traversable[Int]] = non-empty iterator

Wrap-up

Finally I got where I wanted, even if that cast is a bit irritating, and I learned a lot of useful things in the process – all part of the journey of familiarity with any language. But it struck me as soon as I got a bit lost in the Scala collections hierarchy that I was on a mission with which I am all too familiar. It was at that point that I decided I would probably blog about the experience and started taking notes!

I still like Scala, but I’m very well aware that this is the price I pay, and I’m not surprised that others find it too high a price.

Just to be clear…

I realise by the way that this was all a fools errand in the first place. I deliberately went down the path described above as an explicit learning exercise because I wanted to get more experience with the collections API, iterables, implicits etc. It is possible to achieve the result more directly with a filter:

scala> Array(1,2,3).inits.filter(_.nonEmpty).toList
res16: List[Array[Int]] = List(Array(1, 2, 3), Array(1, 2), Array(1))

It’s not quite as pithy as having a nonEmptyTails method, but in all other senses it’s probably superior.

by
Sam Carr
on
10/02/14

Grunt uglify file specs

I struggled a bit finding relevant examples of Gruntfile configuration for Uglify, so having solved a few specific problems myself, here’s what I came up with.

This is just a snippet from the whole Gruntfile of course, and contains half-decent comments already, though I’ll provide some extra explanations below to point out the most interesting bits.

// Variables used internally within this config.
conf: {
  app: 'app',
  dist: 'dist',
  // Just our own custom scripts.
  script_files: ['scripts/*.js'],
  // All scripts that should be minified into final result.
  // Ordering is important as it determines order in the minified output and hence load order at runtime.
  // We don't include jquery (though we could) as it's better to get it from Google where possible.
  minify_js_files: [
      'scripts/vendor/modernizr/modernizr.custom.js',
      '<%= conf.script_files %>',
      'scripts/polyfills/**/*.js']
},

uglify: {
  options: {
    banner: '/*! <%= pkg.name %> <%= grunt.template.today("yyyy-mm-dd") %> */\n',
    sourceMap: '<%= conf.dist %>/scripts/source.map.js',
    sourceMapRoot: '/scripts',
    sourceMappingURL: '/scripts/source.map.js',
    sourceMapPrefix: 2
  },
  // For dev, effectively just concatenate all the JS into one file but perform no real minification.
  // This means that the HTML is the same for dev and prod (it just loads the single .js file) but
  // debugging in the browser works properly in the dev environment. It should work even when fully
  // minified, given the source maps, but practice shows that it doesn't.
  dev: {
    options: {
      report: false,
      mangle: false,
      compress: false,
      beautify: true
    },
    files: [{
      expand: true,
      cwd: '<%= conf.app %>',
      src: '<%= conf.minify_js_files %>',
      dest: '<%= conf.dist %>/scripts/main.min.js',
      // Because we want all individual sources to go into a single dest file, we need to use this
      // rename function to ensure all srcs get the same dest, otherwise each would get a separate
      // dest created by concatenting the src path onto dest path.
      rename: function(dest, src) { return dest; }
    }]
  },
  prod: {
    options: {
      banner: '/*! <%= pkg.name %> <%= grunt.template.today("yyyy-mm-dd") %> */\n',
      report: 'min',
      mangle: true,
      compress: true
    },
    files: '<%= uglify.dev.files %>'
  }
},

Use of a rename function for configuring file srcs and dests

I was really struggling to come up with src/dest configuration for Uglify that pushed all of my source files into a single minified dest file. To be fair, this is trivially easy in the common case, as you can simply use files: { ‘dest_path’: ['file_1', 'file2'] }.

However I have my list of source files in <%= conf.minify_js_files %> and the paths therein do not include the root app/ directory, because this works out for the best in various other Grunt tasks (not shown) where I use cwd in the files block to supply that root dir. Unfortunately, without my custom rename function, a separate dest is calculated for each src file, by concatenating the src path with the dest, so instead of one minified JS file we get lots of individual files sprayed over all sorts of unintended locations. The trivial rename function I’ve used overrides those calculated dest locations to our originally intended single dest. Where different srcs have the same dest, the grunt-contrib-uglify plugin has the good sense to assume you want to merge their output. And hence we get the result we want. To be clear, this is only complicated because I want to use cwd in the file config rather than using the simpler approach.

Re-using files blocks in multiple places

You can share common options amongst multiple targets by putting them at the top level of the task and overriding/extending as required in the specific targets. However you can’t do this when specifying files. In my case I want to specify the same files for both dev and prod Uglify targets, so I specify them in full for dev then use Grunt’s templating facility to refer to them from prod with files: ‘<%= uglify.dev.files %>’.

Theoretically I could have put the definition in the conf block at the top, but it’s specific to Uglify and only used there so I prefer it to be local to the Uglify task. It seems obvious now that I can refer back to it like this, but at the time I struggled to see how to achieve it. I think I had a blind spot for the generic nature of the templating mechanism, having only used it in a rigid way for config previously, and still being very new to Gruntfiles.

Uglify may break JS debugging

I found that my minified JS files could not be successfully debugged in any browsers. I could see the original un-minified code thanks to the source maps and I could set breakpoints, but they usually wouldn’t trigger, or if I did break (e.g. with JS ‘debugger’ command in my code) it was impossible to get variable values. All very frustrating.

Whilst I’m developing I use Grunt’s watch task to serve my files and to auto-process modifications on the fly, so in that mode I turn off all the actual minification features of Uglify and effectively just concatenate the files together into one. Because it’s still the same single file with the same name as in production, I can use identical static HTML to include my JS script. The source maps are still there and allow me to see the individual files in the browser debugger.

by
Sam Carr
on
08/01/14

Getting Sieves Right

The great thing about being wrong is that you get to learn something. In a previous post I went on at length about the the sieve of eratosthenes. Now that I have been enlightened by Melissa O’Neill I must make corrections.

The problem is that Erastosthenes posited a fixed array from which we ‘cross off’ non-primes, the classic algorithm being:

  • Write a list of all integers > 1 up to whatever your maximum n is
  • Declare the first non-eliminated integer as a prime p, and eliminate all multiples of this (staring at p^2 as an optimisation). Note that multiples may be found by incrementing by p. A further optimisation is to stop eliminating at sqrt(n).
  • Repeat for the next non-eliminated integer.

The problem is that fixed list at the start, which rules out generators. The usual implementation for generating infinite sequences of primes stores each p and tests future candidates against them. But this is not what Erastosthenes was talking about. O’Neill calls it the ‘unfaithful sieve’ and notes that

In general, the speed of the unfaithful sieve depends on the number of primes it tries that are not factors of each number it examines, whereas the speed of Eratosthenes’s algorithm depends on the number of (unique) primes that are

As an example (from the paper). If we are looking for primes below 541 (the first 100 primes), and find that 17 is a prime, we start crossing of multiples of 17 at 289 (17^2), crossing off 15 times in total. In contrast, the unfaithful sieve will check all numbers not divisible by 2,3,5,7,11 or 13 for divisibility by 17. This is a total of 99  checks.

In the end, the correct Erastosthenes sieve  is ϴ(n log log n) to find all primes up to n, whereas the unfaithful sieve is  ϴ(n^2/(log n)^2).

O’Neill goes and explores Haskell implementations, but what would a good sieve look like in Clojure? Remember that Clojure, the unfaithful sieve looks like this:

(defn lazy-sieve [s]
  (cons (first s)
    (lazy-seq (lazy-sieve (remove #(zero? (mod % (first s))) (rest s))))))

(defn primes []
  (lazy-sieve (iterate inc 2)))

(take 5 (primes))
;= (2 3 5 7)

It stores the list of eliminated primes in nested calls to lazy-sieve. To turn this into a lazy, infinite Erastosthenes sieve, we need to rather store a map of the next ‘crossings off’, along with their prime factors. Another example from O’Neill: When we discover 17 as a prime, we insert the first ‘crossing off’ (17^2 = 289) in the map of upcoming composites along with it’s prime factors (17 in this case). When we come to consider 289, we discover it is a known multiple of 17, remove 289 from the map, and insert 17+289 = 306. In essence we are building a map of iterators keyed by their current value. As it happens, 306 has prime factors 2, 3 and 17, so when 306 is considered, it is removed and entries inserted for 306 + 2, 306 + 3 and 306 + 17. Each of the iterators at that value is incremented appropriately.

Let’s quickly hack it together. We’re going to store the table of iterators in a plain hash-map, with the ‘crossed off’ values as the key, and a vector of prime factors as value.

;; Given a seq of iterators and a prime, generate a key-value list of
;; the next values of the iterators (as keys) and new lists for the prime factors
(defn gen-iterator-keyvals [iterators prime]
  (mapcat #(list (+ % prime) [%]) iterators))

;; update the iterator-map when a prime is found.
(defn update-iterators [prime iterator-map]
  (let [iterators (apply hash-map (gen-iterator-keyvals (get iterator-map prime) prime))
        basemap (dissoc iterator-map prime)]
    (merge-with concat basemap iterators {(* prime prime) [prime]})))

(defn lazy-erastosthenes [iterator-map [x & xs]]
  (if (contains? iterator-map x)

    ;; if the value is 'crossed-off', it's not a prime so don't cons it. But update the map.
    (lazy-erastosthenes (update-iterators x iterator-map ) xs)

    ;; otherwise chain the value on, and add simply add an entry to the map.
    (cons x (lazy-seq (lazy-erastosthenes (merge-with concat iterator-map {(* x x) [x]}) xs)))))

(defn primes []
  (cons 2 (lazy-seq (lazy-erastosthenes {2 [2]} (iterate inc 2) ))))

(take 10 (primes))
(2 3 5 7 11 13 17 19 23 29)

Performance? Well, the old version had lower constant factors, so the better algorithm only starts to show through at about N=550. As we get to larger numbers, the improvement is clear. For one, it isn’t prone to stack overflows like the original (and the canonical lazy-seq example)! In fact I won’t give graphs here (see the O’Neill paper for some) because I’m the stack overflows are limiting what I can do, but by about N=4000 we’re looking at about an order of magnitude improvement.

by
James Uther
on
31/12/13

CodeMesh 2013 Redux

Last month I attended the CodeMesh conference here in sunny London, along with a couple of my colleagues. Here are my recollections and thoughts.

The venue (Hotel Russel on Russel Square) is a pleasantly rambling, grand old hotel, which hosted a few hundred hardcore geeks fairly well. A couple of the rooms were a bit small and not entirely suited to the task, though the main lecture hall and the exhibition/mingling/eating space alongside worked very well. Food was good and the craft beers on the first night were exactly what is missing from most such parties. Most event organisers don’t do any better than Becks, so the variety of beer here and live, Clojure-generated beats make it stick in the memory.

The content of the conference (I didn’t just go for the beer and food) was tending strongly towards functional programming, new languages and general deep geekery. And for that I applaud it, though I found the quality of the talks variable. Putting it less politely, a few were atrocious, mainly due to scatty content and poor exposition, but there were just enough gems to make it worthwhile. I think I may have chosen badly between the available tracks as my colleagues seemed to fare better. I’ve been to enough conferences to know that this is par for the course, and like a round of golf, it just takes a couple of standouts to make it all worthwhile.

For me, Jafar Husain’s talk “End to End Reactive Programming at Netflix” was a sparkling example, and I have already presented a mini version of it back at base, introducing my colleagues to the Reactive Extensions (Rx) from a JS point of view, and including some nice examples that I found from Bacon.js, an alternative implementation based on the same principles. I strongly recommend looking at Rx for JavaScript if you’re working in the browser, and at the many other ports for use on the server or various UI frameworks. I’ve actually been exposed to the Scala version myself recently as part of the Principles of Reactive Programming course on Coursera, which was nicely timed to bolster my understanding.

More generally, the key themes that stuck out to me, with a heavy helping of personal opinion, were as follows.

  • Scala has a lot of mindshare and people are doing real things with it. It also brings with it a great deal of scepticism from those that think it’s too complicated or too impure as a functional language. I share those feelings, but I’m learning and using it all the same as it may be the best thing available right now and allows me to mix and match paradigms within a single program, bringing pragmatism to problem solving. I spoke to Bruce Tate after his talk on the evolution of programming languages and he reckoned Scala is a good bridge language to a better place that probably doesn’t exist yet (I’m paraphrasing).
  • Haskell maintains the moral high ground and the talk about what they’re doing with it at Facebook was a necessary example of its use in the real world. Having one of the long time Haskell compiler engineers working on the project has clearly helped, and indeed they have made compiler changes to enable their project to succeed. Somehow Haskell leaves me cold however, at least for now. Perhaps because it’s that much more of a stretch away from the JVM and its purity feels more like trouble than freedom. Maybe I’ll get there in the end.
  • People are really trying hard to bring new languages to the table that learn from what has gone before. Rust looks like a nice approach to systems programming and I’d definitely consider it if faced with a problem that looked like it required C or similar.
  • There was a background level of Erlang, but perhaps more interesting was Elixir, a new functional language that runs on the Erlang VM. Dave Thomas was up on the stage with José Valim, Elixir’s creator, waxing hysterical about how he was as excited as he was about Ruby back in the day. Personally I’m not feeling it, perhaps because I’m tending towards statically typed languages recently, having done plenty of Ruby and Groovy in years past.

To summarise, it was a little hit and miss, but allowed me to put my finger on the pulse of this section of the tech community, and to learn some worthwhile lessons. Many of those lessons were of the intangible, hand-wavy sort, but I value them highly as we try to navigate the currents of the fast-moving tech world. Then again, one of the things we learn is that at a higher level it doesn’t move that quickly and a lot of the stuff that’s hot right now has its origins in the 70s and 80s.

by
Sam Carr
on
29/12/13

Documenting an HTTP API with Swagger

I recently tried out Swagger, for documenting an HTTP API. The big win with Swagger is that it provides a sweet HTML UI to browse your API docs and experiment with sending requests and viewing responses, which is a great experience for other developers that are trying to get to grips with your API. Try out their demo of the Swagger UI, for a simple petstore example.

Swagger petstore example - screenshot

Swagger is effectively three things ‘architecturally’:

  • A specification for the JSON files, which contain your API documentation in the abstract
  • Various “generator” tools and libraries (many third party) for producing those JSON files in the first place, either statically or dynamically
  • Swagger UI, a static HTML/JS app which consumes those JSON files and presents the nice UI.

Ideally you use a generator that extracts the API documentation for free, right from your API code. Many such generators are available for various languages and HTTP/REST frameworks, so you may have to do very little to get a free lunch here. However typically you’d expect to use extra annotations to further document the API with the useful human-facing semantic information which isn’t present in the raw code, and further annotations may be required to clarify object serialisation etc. Still, this is a pretty good facsimile of the promised land, where documentation stays in sync with the code and never goes stale!

In our specific case we were working with an app written in Go, so could potentially have used the go-restful library for our REST services, which has Swagger support built into it. However we were already committed to another library that didn’t have that support and being new to Swagger we couldn’t be sure if it was worth switching libraries or wiring up our own swagger integration. We decided to prototype a Swagger solution by hand-crafting the JSON files in the first instance, to see if we (and our users) liked the results. This showed up a particular challenge that is worth covering here.

You can’t do URL hierarchies with static file serving

A typical REST API will have URL hierarchies such as /users (that lists users) /users/fred-smith (details for a specific user) and indeed the Swagger JSON file URLs consumed by Swagger UI are assumed to be in this sort of hierarchy. Swagger UI consumes Swagger JSON files via HTTP: you give it the URL of the main JSON “resource listing” file which provides URLs for the subordinate “API declaration” files. If that resource listing is served from /main, it expects the API declarations to be at /main/user, /main/product etc. and this is hardcoded into the way it constructs URLs. Unfortunately if we want to provide these JSON files by simply serving them via Nginx, straight from disk with no smarts, we’re out of luck as your average filesystem cannot have both a file “main” and a directory “main” in the same parent directory. You just can’t do it, so you can’t serve up a hierarchy like that from static files.

Obviously you could configure your web server more intricately, mapping individual URLs to individual files to construct the hierarchy. This isn’t appealing however, especially as Swagger UI itself can be served statically (it’s just static HTML, JS, CSS etc.) and we are simply including our JSON files within its directory structure. Three simple lines of config in Nginx should be enough, to serve up swagger-ui and our included JSON files:

location /api-docs {
    alias /opt/swagger-ui;
}

The root problem here is that Swagger UI is extremely simplistic about how it interprets paths in the top-level resource listing JSON. It assumes that the paths to the individual API declaration files can simply be concatenated to the resource listing path, as if they are laid out in a pure hierarchy as sub-resources. If the resource listing is at /api-doc.json and it references a path “users.json” then Swagger UI concatenates these and looks for the API declaration at /api-doc.jsonusers.json. This looks especially bad if you have a .json extension and no leading / on the path. By fixing those two problems we get a bit closer but it’s still looking for /api-doc/users and as mentioned above, we can’t have both a file and a directory named “api-doc” in the filesystem so we are stuck. As an aside, losing the file extension is worth doing regardless, as Swagger UI uses the full name as the title for each section of the docs and you really want “users” rather than “users.json” as your heading.

The trick to win the day here is to use a path like “/../users” in the resource listing. Then the concatenated path is /api-doc/../users which is ultimately resolved to just /users. That being the case, we can put our JSON files “api-doc” and “users” in the same directory (even though Swagger likes to consider them as hierarchical) and they will link together correctly. If you do want the API declaration files to be down a level, you could use “/../apis/users” and put them in an “apis” directory one level deeper than the resource listing file. The key here is that we don’t have to have a file and directory with the same name.

by
Sam Carr
on
27/11/13

Search

Categories

You are currently browsing the archives for the Programming category.

Feeds

Archives

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us