technology from back to front

Blog: latest posts

When in Rome

I’ve been trying to integrate js-sequence-diagrams into Trac. I’ve reached the point where I can choose between my sequence diagrams getting rendered, and the rest of the Javascript in Trac working. And it’s all because of an underscore…

There’s a popular library in the Javascript world: underscore. In python _ is used to internationalize a string. Trac have carried this notation into their Javascript. That description makes this sound harmless. It’s anything but: it’s taken me a very long time to work that out.

The lesson here is write idiomatic code. Because Javascript has no module system, a web page really is a place you have to co-operate in by using the local customs. For Trac, it’s not reasonable to claim the namespace all for yourself: it’s main claim is extensibility, and the easiest way to extend it is using Javascript in the browser.

Anyway, I guess I should point out the solution is require.js. It’s not all that complicated to use in this case: we leave all the trac stuff, including jquery, and jquery-ui outside. _ get’s redefined, but just in the scope of sequence-diagram.js. main.js looks like this:

require.config({
  baseUrl: requireBaseUrl,
  shim : {
    'underscore': {
      exports: '_'
    },
    'raphael': {
      exports: 'Raphael'
    },
    'sequence-diagram': {
      deps: [ 'underscore', 'raphael' ]
    }
  },
});

require(["draw-sequence-diagrams"])

draw-sequence-diagrams is my script, which is written as a module:

define(
  // I don't actually refer to this, I need it's side effects on jQuery
  ["sequence-diagram"],
  function(sd) {
    // jQuery comes from Trac - it's global
    jQuery(document).ready(function() {
      jQuery(".sequence-diagram").sequenceDiagram({theme: 'simple'})
    })
 })

I have to get trac to calculate the base URL, so I add it as a variable. Here’s the python for the Trac Extension:

import pkg_resources

from genshi.core import Markup

from trac.core import implements, Component
from trac.web.chrome import ITemplateProvider, add_script, add_script_data
from trac.wiki.api import IWikiMacroProvider
from trac.web.api import IRequestFilter
from trac.util.translation import N_

MACRO_DESCRIPTION = """
Include a sequence diagram in the Wiki. We use a javascript
extension to actually render the sequence diagram.
"""


SEQUENCE_DIAGRAM_TEMPLATE = """
<div class="sequence-diagram">{text}</div>
"""


class JsSequenceDiagrams(Component):

    implements(ITemplateProvider,IWikiMacroProvider,IRequestFilter)
       
    # ITemplateProvider methods

    def get_htdocs_dirs(self):
        return [ ( 'js-sequence-diagrams',
                   pkg_resources.resource_filename(
                    'jssequencediagrams',
                    'templates') ) ]

    def get_templates_dirs(self):
        return []

    # IWikiMacroProvider methods

    def get_macros(self):
        yield 'SequenceDiagram'

    def get_macro_description(self, name):
        return 'messages', N_(MACRO_DESCRIPTION)

    def expand_macro(self, formatter, name, text, args):
        return Markup(SEQUENCE_DIAGRAM_TEMPLATE.format(text=text))

    def pre_process_request(self, req, handler):
        return handler

    def post_process_request(self, req, template, data, content_type):
        get = self.env.config.get
        add_script_data(req, { 'requireBaseUrl': '%s/js-sequence-diagrams' % req.href.chrome() })
        add_script(req, 'js-sequence-diagrams/require.js', 'text/javascript')
        add_script(req, 'js-sequence-diagrams/main.js', 'text/javascript')
        return (template, data, content_type)

You can see that I load main.js, rather than letting require.js do it for me. That’s because otherwise, I wouldn’t be able to use trac’s add_script, which can’t add the extra attribute I’d need for that.

Also note I need to get trac to calculate require’s base URL, and I pass that in as a javascript variable.

You will need to look elsewhere for how to include resources in your Trac extension.

by
david
on
31/03/14

Getting back into front-end web development

I’ve been working on a small SPA (Single Page Application) – just HTML, CSS and JavaScript statically served and doing its thing entirely in the browser. I learned a great deal throughout the project, but here are some of the things that strike me as most valuable.

Get a good workflow going

I used Grunt to setup a nice build system that is mostly a joy during development. It took a while to evolve my Gruntfile, but now when I edit a file, the results are immediately refreshed in the browser (I don’t even have to hit cmd-R). I can deploy to S3 test, staging and live sites with a single command that takes about 3 seconds. My SASS files are compiled down to minified CSS, my JS is minified with source maps etc.

The hardest part of using Grunt is figuring out how to configure it and its many contrib plugins. I could have done with a definitive reference or perhaps I could have used Yeoman to give me an out of the box solution. However I recognised that I was always going to have to figure out the guts of Grunt so I think I really was better off bespoking it from the start. I’m glad I did as now I have a tight setup that does precisely what I want and that I understand completely.

Now it seems there is a new kid on the scene, Gulp – nicely introduced in this tutorial blog post. I will definitely be looking closely at that for my next project, with the piping approach looking like the key step beyond Grunt, along with nicer syntax. I’d also look at Browserify, for a nicer way to piece together the JS bits.

Learn JavaScript properly

To the uninitiated, JavaScript is fairly surprising in many subtle ways, and though I can grok the prototype-based inheritance fairly easily, the scoping rules caught me out repeatedly. This was especially the case as I tried to create JQuery plugins with private methods and state. Eventually a simple old article by grand-daddy of JavaScript writing Douglas Crockford gave me the vital clues I was missing.

Really I should just read his book, and I would recommend that anyone else doesn’t just attempt to learn JavaScript as they go, but takes some time to pro-actively figure out the core concepts – it will pay off in short order.

jQuery is non-negotiable

And the award for most indispensable library goes to: jQuery. Seriously, it should be baked into the browsers or the ECMAScript standard. The nicest thing about it is I can pretty much just guess at the API and be right most of the time, though reading the docs usually reveals new conveniences that go further than I even imagined.

Browser quirks can be a living nightmare

JavaScript itself is fairly reliable, especially with judicious use of mature libraries like jQuery that paper over the cross-browser DOM cracks. CSS in complicated scenarios is where it all seems to go wrong however.

It’s amazing how broken/different some browsers are. Here are just a few highlights, though every day brought tens of new oddities and associated workarounds.

  • Mobile Safari on iOS 7 reports the viewport height inconsistently (depending on how you access it) leading to bad layout and horrible JavaScript workarounds.
  • Use of -webkit-overflow-scrolling:touch causes the hardware accelerated renderer to kick in, resulting in various flickers, flashes and flitches with content not rendering.
  • IE 10 on Windows 8 shows back/forward overlays at the left/right of the screen when your mouse moves near them, obscuring links in those locations.
  • Chrome running on Retina Macs suffers from strange graphical glitches when running CSS Animations, but is fine with CSS Transitions. However other browsers/platforms really need CSS Animations to get smooth, hardware accelerated movement. In my case it was necessary to implement both approaches and select using browser detection.
by
Sam Carr
on
06/03/14

Two Magnolias, one container

We are using Magnolia in a number of projects here at LShift. I have been feeling that Magnolia has a simple way to do most things, but often there are a number of other plausible alternatives that gradually lead you into wasting enormous amounts of time.

Here I want to present a simple way to get both author and public instances of Magnolia running in your dev environment in the same container. It may seem very obvious. If so, good. This was not the first way I tried, and it cost me a lot of time.

We will be aiming for:

  1. Easily deploying Magnolia onto a stage or production environment — one file, one or two configuration parameters only.
  2. Making it easy for a tester to launch local public and author instances of Magnolia that talk to each other correctly.
  3. Making it easy for a developer to debug Magnolia, having both instances running under the control of the IDE.

Preconditions

I will be assuming that you have a parent project with a child project that represents your webapp. I also will assume that you have copied the contents of src/main/webapp/WEB-INF/config from the magnolia-empty-webapp project into your own webapp project. The source for this is in the ce-bundle at git.magnolia-cms.com/gitweb/?p=ce-bundle.pub.git, but assuming you have magnolia-empty-webapp as a dependency (as recommended) you should be able to pick it up from your target directory.

I will be using Tomcat 7 as Tomcat is recommended by Magnolia and 7 is the latest stable version at the time of writing.

Deploying Magnolia to Stage or Production environments

For deployment to stage or production you don’t want both author and public deployed in the same container, or even on the same machine; so we only need to be able to configure a single running instance to be either author or public.

This is quite simple and well documented. In your webapp project, open your src/main/webapp/WEB-INF/web.xml (that you copied from the empty webapp project as described above) and look for the lines:

  <context-param>
    <param-name>magnolia.initialization.file</param-name>
    <param-value>
      WEB-INF/config/${servername}/${contextPath}/magnolia.properties,
      WEB-INF/config/${servername}/${webapp}/magnolia.properties,
      WEB-INF/config/${servername}/magnolia.properties,
      WEB-INF/config/${contextPath}/magnolia.properties,
      WEB-INF/config/${webapp}/magnolia.properties,
      WEB-INF/config/default/magnolia.properties,
      WEB-INF/config/magnolia.properties
    </param-value>
  </context-param>

You will need to add your own line at the top of the <param-value> section:

      WEB-INF/config/${contextAttribute/instanceName}/magnolia.properties,

Then when you deploy your WAR, you can simply set the instanceName environment variable to magnoliaPublic or magnoliaAuthor depending on what type of instance you want. As you can see from the fragment of web.xml above, this will make the settings in src/main/webapp/WEB-INF/config/magnoliaAuthor/magnolia.properties or src/main/webapp/WEB-INF/config/magnoliaAuthor/magnolia.properties active, respectively. Ultimately you will want to make more magnolia.properties files in more subdirectories (called, perhaps, stageAuthor, productionPublic and so on) with appropriate settings for those environments and you can simply make instanceName refer to the appropriate subdirectory.

Local Magnolia from the command line

Now, it would seem plausible that this method can be made to make your local testing environment work. Plausible, but wrong. This is the difficult way. You’ll start writing your context.xml files, then you’ll need a server.xml file, then before you know it you’ll be building your own Tomcat so that you can manage it all.

The “secret” is to use the fact that the web.xml already refers to the context path, in the form of the line:

      WEB-INF/config/${contextPath}/magnolia.properties,

(as well as in another line which we won’t concern ourselves with). This means that, instead using an environment variable, you can deploy the same WAR file to two different context paths and Magnolia will set itself up differently for each. And if you choose the paths /magnoliaAuthor and /magnoliaPublic you will automatically pick up the properties files provided by the empty webapp and all will be fine — Magnolia even sets up the author instance to point at http://localhost:8080/magnoliaPublic by default, so you won’t have to configure it yourself!

Well, actually, it’s not all fine. If you try this, you’ll find that one of your instances will refuse to start, complaining that its repository is already locked. Of course, they are trying to use the same repository. Fix this by adding a line similar to the following to magnoliaPublic/magnolia.properties:

magnolia.repositories.home=${magnolia.home}/repositories-public

The name of the subdirectory is not important. Note that, as it stands, this will change where the stage and production deployed Magnolias you configured above store their data. If that bothers you, now might be a good time to make your productionPublic/magnolia.properties and similar files.

So, how do we get that running painlessly so that your tester doesn’t keep asking you how to do it?

Add the Tomcat Maven plugin to your webapp’s pom.xml, and configure it to launch your WAR twice on two different context paths:

      <plugin>
        <groupId>org.apache.tomcat.maven</groupId>
        <artifactId>tomcat7-maven-plugin</artifactId>
        <version>2.2</version>
        <configuration>
          <webapps>
            <webapp>
              <groupId>com.my.group</groupId>
              <artifactId>my-webapp</artifactId>
              <version>1.0-SNAPSHOT</version>
              <type>war</type>
              <asWebapp>true</asWebapp>
              <contextPath>/magnoliaAuthor</contextPath>
            </webapp>
            <webapp>
              <groupId>com.my.group</groupId>
              <artifactId>my-webapp</artifactId>
              <version>1.0-SNAPSHOT</version>
              <type>war</type>
              <asWebapp>true</asWebapp>
              <contextPath>/magnoliaPublic</contextPath>
            </webapp>
          </webapps>
        </configuration>
      </plugin>

Replacing com.my.group and my-webapp with your own webapp’s group and artifact id.

Now you can run your Magnolia simply with:

mvn tomcat7:run-war

For reasons best known to the Tomcat plugin, boring old mvn tomcat7:run doesn’t work — deploying only one Magnolia in its default location. Sorry.

The instances are available, of course, at http://localhost:8080/magnoliaAuthor and http://localhost:8080/magnoliaPublic.

Local Magnolia from your IDE

Now you’re on the home straight. Here’s how I configure the Tomcat plugin in Eclipse:

Firstly, you need to get Eclipse to know about Tomcat 7. The foolproof way to do this is as follows: Window -> Preferences -> Server -> Runtime Environments -> Add… -> Apache Tomcat v7.0 -> Next. Now give it a location that is writable by you in the “Tomcat installation directory” box and click “Download and Install…”; using your pre-existing Tomcat might not work if it isn’t laid out in the way Eclipse expects. Now Finish and open the Servers view.

You can now add a new Tomcat 7 server and double-click on it. Tick “Publish module contexts to separate XML files”, set the start timeout to something large like 480 seconds, and in the modules tab add your webapp project twice; once with the path /magnoliaAuthor and once with the path /magnoliaPublic.

Now you can launch and debug your two instances of Magnolia from within your IDE!

by
Tim Band
on
03/03/14

Predicting and controlling correlations in differentials of addition mod 2^{n}

This is a paper I wrote in collaboration with Scott Fluhrer in 2005. It was not accepted for FSE 2006; it would have been better if I hadn’t waited until 2014 to make it public, but better late than never.

It arose from a discovery I made when developing attacks on Salsa20 for “Truncated differential cryptanalysis of five rounds of Salsa20”. Several of the attacks were twice as effective as my calculations showed they should have been. I remarked in the paper “By experiment, we have even determined a few differential trails whose probability appears to be twice as high as their weight would suggest—this is presumably because of problems with the independence assumption”. Digging deeper, I discovered why these trails did not behave independently, and developed an algorithm for correctly predicting the probability that the trail holds; this algorithm works across a wide range of “ARX” (addition, rotation, XOR) cryptographic primitives. The obvious person to talk to about these ideas was Cisco’s Scott Fluhrer, whose expertise in differential cryptanalysis led to him breaking the cipher in my first publication, Mercy. Scott pointed me to the ARX MAC “Michael”, and we found that the same technique could be used to analyse an anomaly in Michael.

In all likelihood this is just a historical curiosity; the right place to look now would be the ARX toolkit and associated papers.

Paul Crowley, Scott Fluhrer, Predicting and controlling correlations in differentials of addition mod 2^{n} (PDF).

by
Paul Crowley
on
19/02/14

A study in Scala

Cards on the table: I like Scala. Right now it’s my go-to general purpose programming language, but I know that many people have a dim opinion of it, or appreciate its positives but are heavily conflicted by its negatives. Actually I do put myself in that latter camp, though I think I’ve got deep enough into learning the language that I can get past most of those negatives. Here I present a classic tale of “life with Scala”.

The problem

Scala generally has collection methods for most things you can think of, so I was pleased but not surprised to find inits and tails methods, that list all of the prefixes and suffixes respectively. These methods return an iterator, hence the use of toList below, to actually see the results. For the rest of this post, we’ll just consider inits, though tails is trivially similar.

scala> List(1,2,3).inits
res1: Iterator[List[Int]] = non-empty iterator

scala> List(1,2,3).inits.toList
res2: List[List[Int]] = List(List(1, 2, 3), List(1, 2), List(1), List())

scala> List(1,2,3).tails.toList
res3: List[List[Int]] = List(List(1, 2, 3), List(2, 3), List(3), List())

The results include the empty collection, but as it happens that doesn’t suit my needs. I want a version that doesn’t return the empty list.

What to do?

Options include:

  • Call the existing inits method then strip off the last item. However since we get an iterator this isn’t quite so trivial – we can’t just do inits.init
  • Write a new custom method to do just what I want and return a collection, not an iterator.

I decided, in the spirit of Scala, that I would add a nonEmptyInits method to ‘all collections’ (and I’m deliberately vague here) via an implicit conversion, so that it can be called just as easily as inits. It would call inits but wrap the iterator to exclude the final element. Easy peasy, but the rabbit hole loomed ahead of me.

Because I was over-engineering and trying to be purist (this was hobby code) I wanted to implement my new method for the most generic type reasonable. Figuring out what type that was challenged me rather. The Scala collections API has a fairly complicated hierarchy. This is somewhat diagrammed and explained in the documentation but that doesn’t show all the interim traits and classes that are in the guts of the implementation such as TraversableLike, GenTraversable, GenTraversableLike and 10+ others relating just to Traversable. The API is setup to support some very clever facilities, like generic collection methods typically returning collections of the same type you started with. Apparently most libraries can’t/don’t do this. However it made it hard for me to find the right place to start. Ultimately I settled on Traversable, since inits itself is defined in TraversableLike, but adding to that directly seemed like more trouble than it was worth.

implicit class TraversableExtras[T](t: Traversable[T]) {
  def nonEmptyTails: Iterator[Traversable[T]] = {
    val bufferedInits = t.inits.buffered
    new Iterator[Traversable[T]] {
      def hasNext() = bufferedInits.head.nonEmpty
      def next() = bufferedInits.next()
    }
  }
}

My solution above wraps the iterator with a buffered version, so that I can peek at head and determine whether to end early or not. This is fairly nice and simple and I’m quite happy with it. It means we can do the following, via the magic of implicit conversions.

scala> List(1,2,3,4).nonEmptyTails.toList
res7: List[Traversable[Int]] = List(List(1, 2, 3, 4), List(1, 2, 3), List(1, 2), List(1))

But Arrays mess it all up

This is all looking very promising and straightforward, but it took a fair bit of banging my head against the desk to figure out why the compiler wouldn’t let me call it on an Array.

scala> Array(1,2,3).nonEmptyTails
:9: error: value nonEmptyTails is not a member of Array[Int]

To cut a long story short, though we can seamlessly treat Array as a Traversable, Array is actually somewhat special. It maps to Java arrays for implementation reasons, but supporting generics, and with an implicit conversion to WrappedArray, which is a Seq (and hence a Traversable) to provide all the collection methods we’re used to. So far so good, but it turns out that the compiler cannot apply more than one implicit conversion at a time, as the search space would explode. Hence it cannot convert from Array to WrappedArray to my TraversableExtras. A solution in such a case, where we want two conversions, is to explicitly cast the Array, to make the first conversion explicit, then the compiler does the rest.

scala> (Array(1,2,3): Traversable[Int]).nonEmptyTails
res10: Iterator[Traversable[Int]] = non-empty iterator

Wrap-up

Finally I got where I wanted, even if that cast is a bit irritating, and I learned a lot of useful things in the process – all part of the journey of familiarity with any language. But it struck me as soon as I got a bit lost in the Scala collections hierarchy that I was on a mission with which I am all too familiar. It was at that point that I decided I would probably blog about the experience and started taking notes!

I still like Scala, but I’m very well aware that this is the price I pay, and I’m not surprised that others find it too high a price.

Just to be clear…

I realise by the way that this was all a fools errand in the first place. I deliberately went down the path described above as an explicit learning exercise because I wanted to get more experience with the collections API, iterables, implicits etc. It is possible to achieve the result more directly with a filter:

scala> Array(1,2,3).inits.filter(_.nonEmpty).toList
res16: List[Array[Int]] = List(Array(1, 2, 3), Array(1, 2), Array(1))

It’s not quite as pithy as having a nonEmptyTails method, but in all other senses it’s probably superior.

by
Sam Carr
on
10/02/14

Facebook android security fail

I wrote about the Ocado android applications ridiculous privileges  list a while back. Facebook has reminded me of it: Facebook app now reads your smartphone’s text messages? THE TRUTH.

Facebook want to be able to capture the two factor auth message straight from your text messages, so they ask for access to all your text messages.

What should happen, is that there’s a two factor auth agent app, which has access to your text messages. Other apps register with it – specifying that they wish to receive text messages from a particular sender. When they do this, the agent asks the user if that’s allowed. That’s better, because the agent is a small and simple app, and if lots of apps share it, we can spend more on it, further cutting down on faults.

Because this app doesn’t exist yet, someone trusted needs to write it. Facebook could write it but why would they do that? most users user’s wouldn’t understand the difference. Lot’s of people argue that the someone trusted should be Google, and the agent should be part of Android.

I think this is a great opportunity for a security brand to build it’s reputation: it can create a suite of applications like this, and then offer to endorse apps which use them, and follow a set of guidelines.

This could be implemented with something as simple as a trademark on the apps page, but it’s preferable that endorsements are cryptographically signed certificates, and android displays endorsements before installation. App developers would pay for the endorsement – either by paying for the review, or by a share of revenue.

This is a much more positive role for a security brand than offering virus checking, since it will significantly reduce zero day exploits, by increasing software quality. It’s also the only way open, since Google aren’t allowing third parties to block app installation. Building a brand seems to be something the industry knows how to do, we should harness that to improve software quality.

I need to add a couple of things to this. One is an oversight, and the other came to my attention just after writing the above:

The Google+ app recently asked to read your text messages – it incorporates them into the user interface. I don’t know if Google are slurping text messages as a result, but the security implications are exactly the same.

Facebook have released a library designed to help app developers safeguard users data on Android devices using cryptography. I haven’t looked at it to asses it’s efficacy, but as I’ve noted here before, Android doesn’t provide access to the phones hardware key storage mechanisms on any phone I’m aware of, so it’s not obviously possible for this library to be effective.

 

by
david
on
05/02/14

UX design in Agile projects

We’ve been making a habit of collaboration with partners for what seems like forever, so why is it that we find it hard to get the formula right with front-end designers? Are we a particularly fussy bunch, or is there something more systematic going on?

Recently we’ve been working with the Agile Research Network who were interested in researching this area and its been useful to have someone external observing projects and documenting conclusions.

The Agile mantra of avoiding BDUF (Big Design Up Front), fails to provide specific guidance over what constitutes “Enough DUF” and guidelines are helpful. Some of what was highlighted was obvious, some less so, here’s a sample:

  • Design prototypes, or demonstrators that look too perfect are dangerous
  • Pixel-perfect UX designs will increase resistance to change
  • Prioritisation and de-scoping renders pixel-perfect designs unnecessary
  • Good UX, like good technical design should be modular
  • Some issues with the design will only be found once you start implementing
  • Get UX and technical designers collaborating on each others’ decisions
  • Consider using an analyst to work closely with both teams documenting requirements conclusions from their work

An initial white paper from the ARN was presented at the Agile Conference in October 2013, a more in depth research paper is under preparation as the moment.

 

 

 

by
mike
on
15/01/14

Grunt uglify file specs

I struggled a bit finding relevant examples of Gruntfile configuration for Uglify, so having solved a few specific problems myself, here’s what I came up with.

This is just a snippet from the whole Gruntfile of course, and contains half-decent comments already, though I’ll provide some extra explanations below to point out the most interesting bits.

// Variables used internally within this config.
conf: {
  app: 'app',
  dist: 'dist',
  // Just our own custom scripts.
  script_files: ['scripts/*.js'],
  // All scripts that should be minified into final result.
  // Ordering is important as it determines order in the minified output and hence load order at runtime.
  // We don't include jquery (though we could) as it's better to get it from Google where possible.
  minify_js_files: [
      'scripts/vendor/modernizr/modernizr.custom.js',
      '<%= conf.script_files %>',
      'scripts/polyfills/**/*.js']
},

uglify: {
  options: {
    banner: '/*! <%= pkg.name %> <%= grunt.template.today("yyyy-mm-dd") %> */\n',
    sourceMap: '<%= conf.dist %>/scripts/source.map.js',
    sourceMapRoot: '/scripts',
    sourceMappingURL: '/scripts/source.map.js',
    sourceMapPrefix: 2
  },
  // For dev, effectively just concatenate all the JS into one file but perform no real minification.
  // This means that the HTML is the same for dev and prod (it just loads the single .js file) but
  // debugging in the browser works properly in the dev environment. It should work even when fully
  // minified, given the source maps, but practice shows that it doesn't.
  dev: {
    options: {
      report: false,
      mangle: false,
      compress: false,
      beautify: true
    },
    files: [{
      expand: true,
      cwd: '<%= conf.app %>',
      src: '<%= conf.minify_js_files %>',
      dest: '<%= conf.dist %>/scripts/main.min.js',
      // Because we want all individual sources to go into a single dest file, we need to use this
      // rename function to ensure all srcs get the same dest, otherwise each would get a separate
      // dest created by concatenting the src path onto dest path.
      rename: function(dest, src) { return dest; }
    }]
  },
  prod: {
    options: {
      banner: '/*! <%= pkg.name %> <%= grunt.template.today("yyyy-mm-dd") %> */\n',
      report: 'min',
      mangle: true,
      compress: true
    },
    files: '<%= uglify.dev.files %>'
  }
},

Use of a rename function for configuring file srcs and dests

I was really struggling to come up with src/dest configuration for Uglify that pushed all of my source files into a single minified dest file. To be fair, this is trivially easy in the common case, as you can simply use files: { ‘dest_path’: ['file_1', 'file2'] }.

However I have my list of source files in <%= conf.minify_js_files %> and the paths therein do not include the root app/ directory, because this works out for the best in various other Grunt tasks (not shown) where I use cwd in the files block to supply that root dir. Unfortunately, without my custom rename function, a separate dest is calculated for each src file, by concatenating the src path with the dest, so instead of one minified JS file we get lots of individual files sprayed over all sorts of unintended locations. The trivial rename function I’ve used overrides those calculated dest locations to our originally intended single dest. Where different srcs have the same dest, the grunt-contrib-uglify plugin has the good sense to assume you want to merge their output. And hence we get the result we want. To be clear, this is only complicated because I want to use cwd in the file config rather than using the simpler approach.

Re-using files blocks in multiple places

You can share common options amongst multiple targets by putting them at the top level of the task and overriding/extending as required in the specific targets. However you can’t do this when specifying files. In my case I want to specify the same files for both dev and prod Uglify targets, so I specify them in full for dev then use Grunt’s templating facility to refer to them from prod with files: ‘<%= uglify.dev.files %>’.

Theoretically I could have put the definition in the conf block at the top, but it’s specific to Uglify and only used there so I prefer it to be local to the Uglify task. It seems obvious now that I can refer back to it like this, but at the time I struggled to see how to achieve it. I think I had a blind spot for the generic nature of the templating mechanism, having only used it in a rigid way for config previously, and still being very new to Gruntfiles.

Uglify may break JS debugging

I found that my minified JS files could not be successfully debugged in any browsers. I could see the original un-minified code thanks to the source maps and I could set breakpoints, but they usually wouldn’t trigger, or if I did break (e.g. with JS ‘debugger’ command in my code) it was impossible to get variable values. All very frustrating.

Whilst I’m developing I use Grunt’s watch task to serve my files and to auto-process modifications on the fly, so in that mode I turn off all the actual minification features of Uglify and effectively just concatenate the files together into one. Because it’s still the same single file with the same name as in production, I can use identical static HTML to include my JS script. The source maps are still there and allow me to see the individual files in the browser debugger.

by
Sam Carr
on
08/01/14

Getting Sieves Right

The great thing about being wrong is that you get to learn something. In a previous post I went on at length about the the sieve of eratosthenes. Now that I have been enlightened by Melissa O’Neill I must make corrections.

The problem is that Erastosthenes posited a fixed array from which we ‘cross off’ non-primes, the classic algorithm being:

  • Write a list of all integers > 1 up to whatever your maximum n is
  • Declare the first non-eliminated integer as a prime p, and eliminate all multiples of this (staring at p^2 as an optimisation). Note that multiples may be found by incrementing by p. A further optimisation is to stop eliminating at sqrt(n).
  • Repeat for the next non-eliminated integer.

The problem is that fixed list at the start, which rules out generators. The usual implementation for generating infinite sequences of primes stores each p and tests future candidates against them. But this is not what Erastosthenes was talking about. O’Neill calls it the ‘unfaithful sieve’ and notes that

In general, the speed of the unfaithful sieve depends on the number of primes it tries that are not factors of each number it examines, whereas the speed of Eratosthenes’s algorithm depends on the number of (unique) primes that are

As an example (from the paper). If we are looking for primes below 541 (the first 100 primes), and find that 17 is a prime, we start crossing of multiples of 17 at 289 (17^2), crossing off 15 times in total. In contrast, the unfaithful sieve will check all numbers not divisible by 2,3,5,7,11 or 13 for divisibility by 17. This is a total of 99  checks.

In the end, the correct Erastosthenes sieve  is ϴ(n log log n) to find all primes up to n, whereas the unfaithful sieve is  ϴ(n^2/(log n)^2).

O’Neill goes and explores Haskell implementations, but what would a good sieve look like in Clojure? Remember that Clojure, the unfaithful sieve looks like this:

(defn lazy-sieve [s]
  (cons (first s)
    (lazy-seq (lazy-sieve (remove #(zero? (mod % (first s))) (rest s))))))

(defn primes []
  (lazy-sieve (iterate inc 2)))

(take 5 (primes))
;= (2 3 5 7)

It stores the list of eliminated primes in nested calls to lazy-sieve. To turn this into a lazy, infinite Erastosthenes sieve, we need to rather store a map of the next ‘crossings off’, along with their prime factors. Another example from O’Neill: When we discover 17 as a prime, we insert the first ‘crossing off’ (17^2 = 289) in the map of upcoming composites along with it’s prime factors (17 in this case). When we come to consider 289, we discover it is a known multiple of 17, remove 289 from the map, and insert 17+289 = 306. In essence we are building a map of iterators keyed by their current value. As it happens, 306 has prime factors 2, 3 and 17, so when 306 is considered, it is removed and entries inserted for 306 + 2, 306 + 3 and 306 + 17. Each of the iterators at that value is incremented appropriately.

Let’s quickly hack it together. We’re going to store the table of iterators in a plain hash-map, with the ‘crossed off’ values as the key, and a vector of prime factors as value.

;; Given a seq of iterators and a prime, generate a key-value list of
;; the next values of the iterators (as keys) and new lists for the prime factors
(defn gen-iterator-keyvals [iterators prime]
  (mapcat #(list (+ % prime) [%]) iterators))

;; update the iterator-map when a prime is found.
(defn update-iterators [prime iterator-map]
  (let [iterators (apply hash-map (gen-iterator-keyvals (get iterator-map prime) prime))
        basemap (dissoc iterator-map prime)]
    (merge-with concat basemap iterators {(* prime prime) [prime]})))

(defn lazy-erastosthenes [iterator-map [x & xs]]
  (if (contains? iterator-map x)

    ;; if the value is 'crossed-off', it's not a prime so don't cons it. But update the map.
    (lazy-erastosthenes (update-iterators x iterator-map ) xs)

    ;; otherwise chain the value on, and add simply add an entry to the map.
    (cons x (lazy-seq (lazy-erastosthenes (merge-with concat iterator-map {(* x x) [x]}) xs)))))

(defn primes []
  (cons 2 (lazy-seq (lazy-erastosthenes {2 [2]} (iterate inc 2) ))))

(take 10 (primes))
(2 3 5 7 11 13 17 19 23 29)

Performance? Well, the old version had lower constant factors, so the better algorithm only starts to show through at about N=550. As we get to larger numbers, the improvement is clear. For one, it isn’t prone to stack overflows like the original (and the canonical lazy-seq example)! In fact I won’t give graphs here (see the O’Neill paper for some) because I’m the stack overflows are limiting what I can do, but by about N=4000 we’re looking at about an order of magnitude improvement.

by
James Uther
on
31/12/13

CodeMesh 2013 Redux

Last month I attended the CodeMesh conference here in sunny London, along with a couple of my colleagues. Here are my recollections and thoughts.

The venue (Hotel Russel on Russel Square) is a pleasantly rambling, grand old hotel, which hosted a few hundred hardcore geeks fairly well. A couple of the rooms were a bit small and not entirely suited to the task, though the main lecture hall and the exhibition/mingling/eating space alongside worked very well. Food was good and the craft beers on the first night were exactly what is missing from most such parties. Most event organisers don’t do any better than Becks, so the variety of beer here and live, Clojure-generated beats make it stick in the memory.

The content of the conference (I didn’t just go for the beer and food) was tending strongly towards functional programming, new languages and general deep geekery. And for that I applaud it, though I found the quality of the talks variable. Putting it less politely, a few were atrocious, mainly due to scatty content and poor exposition, but there were just enough gems to make it worthwhile. I think I may have chosen badly between the available tracks as my colleagues seemed to fare better. I’ve been to enough conferences to know that this is par for the course, and like a round of golf, it just takes a couple of standouts to make it all worthwhile.

For me, Jafar Husain’s talk “End to End Reactive Programming at Netflix” was a sparkling example, and I have already presented a mini version of it back at base, introducing my colleagues to the Reactive Extensions (Rx) from a JS point of view, and including some nice examples that I found from Bacon.js, an alternative implementation based on the same principles. I strongly recommend looking at Rx for JavaScript if you’re working in the browser, and at the many other ports for use on the server or various UI frameworks. I’ve actually been exposed to the Scala version myself recently as part of the Principles of Reactive Programming course on Coursera, which was nicely timed to bolster my understanding.

More generally, the key themes that stuck out to me, with a heavy helping of personal opinion, were as follows.

  • Scala has a lot of mindshare and people are doing real things with it. It also brings with it a great deal of scepticism from those that think it’s too complicated or too impure as a functional language. I share those feelings, but I’m learning and using it all the same as it may be the best thing available right now and allows me to mix and match paradigms within a single program, bringing pragmatism to problem solving. I spoke to Bruce Tate after his talk on the evolution of programming languages and he reckoned Scala is a good bridge language to a better place that probably doesn’t exist yet (I’m paraphrasing).
  • Haskell maintains the moral high ground and the talk about what they’re doing with it at Facebook was a necessary example of its use in the real world. Having one of the long time Haskell compiler engineers working on the project has clearly helped, and indeed they have made compiler changes to enable their project to succeed. Somehow Haskell leaves me cold however, at least for now. Perhaps because it’s that much more of a stretch away from the JVM and its purity feels more like trouble than freedom. Maybe I’ll get there in the end.
  • People are really trying hard to bring new languages to the table that learn from what has gone before. Rust looks like a nice approach to systems programming and I’d definitely consider it if faced with a problem that looked like it required C or similar.
  • There was a background level of Erlang, but perhaps more interesting was Elixir, a new functional language that runs on the Erlang VM. Dave Thomas was up on the stage with José Valim, Elixir’s creator, waxing hysterical about how he was as excited as he was about Ruby back in the day. Personally I’m not feeling it, perhaps because I’m tending towards statically typed languages recently, having done plenty of Ruby and Groovy in years past.

To summarise, it was a little hit and miss, but allowed me to put my finger on the pulse of this section of the tech community, and to learn some worthwhile lessons. Many of those lessons were of the intangible, hand-wavy sort, but I value them highly as we try to navigate the currents of the fast-moving tech world. Then again, one of the things we learn is that at a higher level it doesn’t move that quickly and a lot of the stuff that’s hot right now has its origins in the 70s and 80s.

by
Sam Carr
on
29/12/13

« Newer Posts Older Posts »

Search

Categories

Feeds

Archives

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us