technology from back to front

Using the syntax tree in Go

The Go programming language provides an abstract syntax tree, parser and pretty printer as libraries written in Go. This enables you to mangle your Go code by writing code in Go – this isn’t quite lisp macros but it is a nice facility.

Here is some example code that I will transform by manipulating the AST.

package simple

import (
  "fmt"
)

func A(val int) int {
  fmt.Println("Checking value")
  if val > 0 {
    return 1
  }
  return 0
}

func b(val int) int {
  return -1
}

The Go parser and printer package are the key to performing the transformation, we can parse a Go source file and print it out again with these lines of code:

fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "test_data/simple.go", nil, 0)
if err != nil {
  // Whoops!
}
printer.Fprint(os.Stdout, fset, file)

The fset variable holds tokenisation information about the file being parsed. The file variable is the root node of the AST. The final line of code prints the AST to the standard output, so in the current extract it will just echo its input.

Now the ast package provides a walk method to traverse the AST using a Visitor. We can define a simple visitor to make all functions public by uppercasing the function name.

ast.Walk(new(FuncVisitor), file)

type FuncVisitor struct {
}

func (v *FuncVisitor) Visit(node ast.Node) (w ast.Visitor)  {
  switch t := node.(type) {
  case *ast.FuncDecl:
    t.Name =  ast.NewIdent(strings.Title(t.Name.Name))
  }

  return v
}

The type defined here is just an empty struct since I don’t currently need to hold any data in my visitor. All this visitor does is match on the AST node for a function declaration and then change its name so that it is now exported by the package. The visitor returns itself and the Walk method will carry on with a depth first descent of the AST, if nil is returned the walk will stop.

Next, I’ll present a slightly more complicated case. This time I will add an import to the source file.

type ImportVisitor struct{}

func (i *ImportVisitor) Visit(node ast.Node) (w ast.Visitor) {
  switch t := node.(type) {
  case *ast.GenDecl:
    if t.Tok == token.IMPORT {
      newSpecs := make([]ast.Spec, len(t.Specs)+1)
      for i, spec := range t.Specs {
        newSpecs[i] = spec
      }
      newPackage := &ast.BasicLit{token.NoPos, token.STRING, []byte("\"fandango\"")}
      newSpecs[len(t.Specs)] = &ast.ImportSpec{nil, nil, newPackage, nil}
      t.Specs = newSpecs
    }
    return nil
  }

  return i
}

This visitor copies any existing imports to a new array and adds one additional import, the additional imports are specified as types from the ast package such as ast.ImportSpec and ast.BasicLit. Once the imports have been modified nil is returned so that the Walk is terminated.

If we run both walks and pretty print the results we get this:

package simple

import (
 "fmt"
   "fandango"
)

func A(val int) int {
    fmt.Println("Checking value")
   if val > 0 {
        return 1
    }
   return 0
}

func B(val int) int {
  return -1
}

So by combining multiple walks you can transform the source code in arbitrary ways, for example the gofix tool works in this way to refactor code as the language specification evolves. This sort of code would also be highly useful when creating all sorts of source code refactoring and analysis tools.

by
tim
on
30/04/11
  1. Have you seen Russ Cox’s PhD thesis? I agree that this aspect of Go is very interesting to watch develop.

 
 


× 2 = two

2000-14 LShift Ltd, 1st Floor, Hoxton Point, 6 Rufus Street, London, N1 6PE, UK+44 (0)20 7729 7060   Contact us