Go’s Testable Examples under the hood

Hidden introduction to ast and parser packages

Go’s toolchain implements a feature called Testable Examples. If the name doesn’t tell you much, I strongly recommend reading “Testable Examples in Go” first as a gentle introduction. Throughout this post we’ll see what underpins the whole solution and how to build a simplified version of it.

Let’s see how Testable Examples work:

upper_test.go:

package main

import (
    "fmt"
    "strings"
)

func ExampleToUpperOK() {
    fmt.Println(strings.ToUpper("foo"))
    // Output: FOO
}

func ExampleToUpperFail() {
    fmt.Println(strings.ToUpper("bar"))
    // Output: BAr
}
> go test -v
=== RUN ExampleToUpperOK
--- PASS: ExampleToUpperOK (0.00s)
=== RUN ExampleToUpperFail
--- FAIL: ExampleToUpperFail (0.00s)
got:
BAR
want:
BAr
FAIL
exit status 1
FAIL github.com/mlowicki/sandbox 0.008s

Examples, just like test functions, are placed in xxx_test.go files, but their names are prefixed with Example instead of Test. The go test command uses comments in a special format (Output: something) and compares them against captured data, normally written to stdout. The same comments are used by other tools like godoc to enrich automatically generated documentation.

The question is how go test or godoc are able to extract data from these dedicated comments. Is there a secret mechanism in the language making it possible? Or can everything be achieved with well-known constructs?

It turns out that the standard library ships components (spread across a few packages) for parsing Go source code itself. These tools produce abstract syntax trees and provide access, among other things, to comments left by the programmer.


Abstract syntax tree (AST)

An AST is a tree representation of the elements found in the source code while parsing. Let’s consider a simple expression:

9 / (2 + 1)

The AST can be generated with this snippet:

expr, err := parser.ParseExpr("9 / (2 + 1)")
if err != nil {
    log.Fatal(err)
}
ast.Print(nil, expr)

which outputs:

0 *ast.BinaryExpr {
1 . X: *ast.BasicLit {
2 . . ValuePos: 1
3 . . Kind: INT
4 . . Value: "9"
5 . }
6 . OpPos: 3
7 . Op: /
8 . Y: *ast.ParenExpr {
9 . . Lparen: 5
10 . . X: *ast.BinaryExpr {
11 . . . X: *ast.BasicLit {
12 . . . . ValuePos: 6
13 . . . . Kind: INT
14 . . . . Value: "2"
15 . . . }
16 . . . OpPos: 8
17 . . . Op: +
18 . . . Y: *ast.BasicLit {
19 . . . . ValuePos: 10
20 . . . . Kind: INT
21 . . . . Value: "1"
22 . . . }
23 . . }
24 . . Rparen: 11
25 . }
26 }

The output can be simplified into a diagram where the actual tree is more visible:

          (operator: /)
             /       \
            /         \
  (integer: 9)   (parenthesized expression)
                          |
                          |
                   (operator: +)
                      /       \
                     /         \
           (integer: 2)   (integer: 1)

Two standard packages are crucial when working with ASTs:

  • parser supplies machinery for parsing source code written in Go
  • ast implements primitives for working with ASTs for code in Go

Normally comments are removed during lexical analysis. There is a special mode flag, parser.ParseComments, which preserves comments and puts them into the AST:

package main

import (
    "fmt"
    "go/parser"
    "go/token"
    "log"
)

func main() {
    fset := token.NewFileSet()
    f, err := parser.ParseFile(fset, "t.go", nil, parser.ParseComments)
    if err != nil {
        log.Fatal(err)
    }
    for _, group := range f.Comments {
        fmt.Printf("Comment group %#v\n", group)
        for _, comment := range group.List {
            fmt.Printf("Comment %#v\n", comment)
        }
    }
}
The 3rd parameter to parser.ParseFile is the optional source code, passed e.g. as a string or an io.Reader. Since I’ve used a file from disk, it’s set to nil.

t.go:

package main

import "fmt"

// a
// b
func main() {
    // c
    fmt.Println("boom!")
}

Output:

Comment group &ast.CommentGroup{List:[]*ast.Comment{(*ast.Comment)(0x820262220), (*ast.Comment)(0x820262240)}}
Comment &ast.Comment{Slash:29, Text:"// a"}
Comment &ast.Comment{Slash:34, Text:"// b"}
Comment group &ast.CommentGroup{List:[]*ast.Comment{(*ast.Comment)(0x8202622c0)}}
Comment &ast.Comment{Slash:55, Text:"// c"}

Comment group

It’s a sequence of comments with no other elements in between. In the above example, comments “a” and “b” belong to the same group.

Pos & Position

Positions of elements within the source code are recorded using the Pos type (its more verbose counterpart is Position). Pos is a single integer value which encodes information like line and column, whereas the Position struct keeps them in separate fields. By adding a line in the outer loop:

fmt.Printf("Position %#v\n", fset.PositionFor(group.Pos(), true))

the program additionally outputs:

Position token.Position{Filename:"t.go", Offset:28, Line:5, Column:1}
Position token.Position{Filename:"t.go", Offset:54, Line:9, Column:2}

Fileset

Positions are calculated relative to the set of parsed files. Every file is assigned a disjoint range, and each position falls into one of these ranges. In our case we have only one file, but the whole set is still required to decode a Pos:

fset.PositionFor(group.Pos(), true)

Tree traversal

Package ast provides a convenient function for traversing the AST in depth-first order:

ast.Inspect(f, func(n ast.Node) bool {
    if n != nil {
        fmt.Println(n)
    }
    return true
})

Now that we know how to extract all comments, it’s time to find all top-level ExampleXXX functions.

doc.Examples

Package doc provides a function which does exactly what we need:

package main

import (
    "fmt"
    "go/doc"
    "go/parser"
    "go/token"
    "log"
)

func main() {
    fset := token.NewFileSet()
    f, err := parser.ParseFile(fset, "e.go", nil, parser.ParseComments)
    if err != nil {
        log.Fatal(err)
    }
    examples := doc.Examples(f)
    for _, example := range examples {
        fmt.Println(example.Name)
    }
}

e.go:

package main

import "fmt"

func ExampleSuccess() {
    fmt.Println("foo")
    // Output: foo
}

func ExampleFail() {
    fmt.Println("foo")
    // Output: bar
}

Output:

Fail
Success

doc.Examples doesn’t have any magical abilities. It relies on what we’ve already seen: mainly building and traversing the abstract syntax tree. Let’s build something similar:

package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
    "log"
    "strings"
)

func findExampleOutput(block *ast.BlockStmt, comments []*ast.CommentGroup) (string, bool) {
    var last *ast.CommentGroup
    for _, group := range comments {
        if (block.Pos() < group.Pos()) && (block.End() > group.End()) {
            last = group
        }
    }
    if last != nil {
        text := last.Text()
        marker := "Output: "
        if strings.HasPrefix(text, marker) {
            return strings.TrimRight(text[len(marker):], "\n"), true
        }
    }
    return "", false
}

func isExample(fdecl *ast.FuncDecl) bool {
    return strings.HasPrefix(fdecl.Name.Name, "Example")
}

func main() {
    fset := token.NewFileSet()
    f, err := parser.ParseFile(fset, "e.go", nil, parser.ParseComments)
    if err != nil {
        log.Fatal(err)
    }
    for _, decl := range f.Decls {
        fdecl, ok := decl.(*ast.FuncDecl)
        if !ok {
            continue
        }
        if isExample(fdecl) {
            output, found := findExampleOutput(fdecl.Body, f.Comments)
            if found {
                fmt.Printf("%s needs output ‘%s’\n", fdecl.Name.Name, output)
            }
        }
    }
}

Output:

ExampleSuccess needs output ‘foo’
ExampleFail needs output ‘bar’

Comments aren’t regular nodes of the AST. They’re accessible through the Comments field of ast.File (which is returned e.g. by parser.ParseFile). The order of comments on this list is the same as their order of appearance in the source code. To find comments inside a certain block, we need to compare positions, as in findExampleOutput above:

var last *ast.CommentGroup
for _, group := range comments {
    if (block.Pos() < group.Pos()) && (block.End() > group.End()) {
        last = group
    }
}

The condition inside the if statement checks whether the comment group falls within the block’s range.


As we’ve seen above, the standard library gives great support for parsing Go source code. Its utilities made the whole task really pleasant and the resulting code compact.

If you like the post and want to get updates about new ones, please follow me.
