Photo by Faye Cornish on Unsplash

Exploring the Abstract Syntax Tree

Sheldon Nunes
Happy Giraffe
Published in
3 min readNov 6, 2018

--

Hacktoberfest had me get out of my comfort zone and contribute to different codebases. One of them, a python linter gave me exposure using the python Abstract Syntax Tree.

What is an Abstract Syntax Tree?

An abstract syntax tree (AST) is a way of representing the syntax of a programming language as a hierarchical tree-like structure.

In essence we can take a line of code such as this:

pressure = 30

and convert it into a tree structure:

Tree visual representation

Wikipedia has a slightly different definition:

An AST is usually the result of the syntax analysis phase of a compiler. It often serves as an intermediate representation of the program through several stages that the compiler requires, and has a strong impact on the final output of the compiler.

So the AST is one of the stages towards creating compiled code. Definitely feels like we are getting closer to what the machine understands!

What this enables us to do is to step through the structure of a program and report any issues back (similar to intellisense/linters) or even change the code that is written.

Python provides a library for parsing and navigating an Abstract Syntax Tree and is aptly called ast.

Using the previous example we can create an ast by using the following command:

import astcode = 'pressure = 3'
tree = ast.parse(code)

Simply printing the tree won’t display the structure and nodes that we want. Instead we can create a node visitor that will traverse the tree and give us details on each of the nodes:

class Visitor(ast.NodeVisitor):
def generic_visit(self, node):
print(type(node).__name__)
ast.NodeVisitor.generic_visit(self, node)

Now if we create an instance of this visitor class, when we call visitor.visit(code) we will get the following output:

Module
Assign
Name
Store
Num

For a linter this is quite useful to see if any of these node orderings are invalid. A good example of this is when you have if True:… a redundant if statement since it always evaluates to the same result.

When we run visit on an if True:… we get the following:

Module
If
NameConstant
Expr
Ellipsis

In this case the True value is the NameConstant. The ast visitor allows us to create specific methods that get invoked when a particular ClassName is visited. This is achieved using the following syntax:

def visit_{className}(self, node):

In this case we are wanting to visit any If node and check it’s test condition to ensure that it isn’t just a NameConstant (True, False or None). This is where the ast documentation is quite useful as you can see what properties are available for each node type. We can access the node.test condition like so:

statement = "If statement always evaluates to {} on line {} "def visit_If(self, node):
condition= node.test
if isinstance(condition, ast.NameConstant):
print(statement.format(condition.value, node.lineno))
ast.NodeVisitor.generic_visit(self, node)

Running this on our previous example gives us a nice detailed message:

If statement always evaluates to True on line 1

You are not limited to only ‘visiting’ a node in the AST. Using another one of pythons classes you can use ast.NodeTransformer to modify them too! This leads to really cool possibilities like inserting temporary lines of code to test code coverage of your program or even transpile to other languages

I recommend checking out the following resources if you are looking to make use of ast in python:

A copy of the code can be found here

Thanks for reading!

--

--