Abstract Syntax Trees in Python (ast library).

Sergio Paniego
May 21, 2018


An Abstract Syntax Tree (AST) is a simplified tree representation of the syntactic structure of a program's source code. Each node of the tree stands for a construct occurring in the code, such as a statement or an expression. These trees don't show all the syntactic clutter (parentheses, commas, and so on), just the information that matters for analyzing the code. A tree that showed the entire structure would be a Concrete Syntax Tree, but it's usually better to simplify it, because the information we need when building compilers can be found in the abstract syntax tree.

AST library

Python comes with a built-in library that makes it easier to work with Abstract Syntax Trees. The ast module helps Python applications process trees of the Python abstract syntax grammar. The main purpose of the module is to help find out programmatically what the current grammar looks like.
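As a quick illustration of what the module gives you, here is a minimal sketch that parses a one-line program and inspects the resulting tree. The source string and the variable names are made up for this example, and the exact text printed by ast.dump differs between Python versions, so no output is shown for it.

```python
import ast

# Parse a tiny program into an AST. The root is always an ast.Module node.
tree = ast.parse("total = price * 2")

# ast.dump returns a textual view of the whole tree (format varies by version).
print(ast.dump(tree))

# The module body is a list of statement nodes; here there is a single Assign.
assign = tree.body[0]
print(type(assign).__name__)           # Assign
print(type(assign.value).__name__)     # BinOp
print(type(assign.value.op).__name__)  # Mult
```

Notice that the tree keeps the meaning of the statement (an assignment whose value is a multiplication) and drops details such as whitespace.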

NodeTransformer and NodeVisitor

Using the NodeTransformer class you can take a node of any of the types it supports and change it at will. The other important class in the module is NodeVisitor. This class helps us walk through the tree: every time a node is visited, the visit method looks for a method named visit_ followed by the node's class name and calls it. We can override that method for each type of node, so we can control what happens to the nodes we care about.

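import ast


class MyVisitor(ast.NodeVisitor):
    # String literals are ast.Constant nodes since Python 3.8
    # (older versions used ast.Str and visit_Str).
    def visit_Constant(self, node):
        if isinstance(node.value, str):
            print('String Node: "' + node.value + '"')


class MyTransformer(ast.NodeTransformer):
    # Replace every string literal with a prefixed copy.
    def visit_Constant(self, node):
        if isinstance(node.value, str):
            return ast.copy_location(ast.Constant('str: ' + node.value), node)
        return node


parsed = ast.parse("print('Hello World')")
MyTransformer().visit(parsed)
MyVisitor().visit(parsed)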

This is an example of how it works. We parse the code using the library, then transform the parsed tree so that every string node we find gets the prefix 'str: ' added. Then we run the NodeVisitor, which prints every string node it finds. The output in this case will be String Node: "str: Hello World", because we transform the node first and then print it.
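Another way to check what a transformer did is to turn the tree back into source code. The sketch below assumes Python 3.9 or later, where string literals are ast.Constant nodes and ast.unparse is available. PrefixStrings is a name invented for this illustration.

```python
import ast


class PrefixStrings(ast.NodeTransformer):
    """Hypothetical transformer: adds 'str: ' in front of every string literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            new_node = ast.Constant(value='str: ' + node.value)
            # Copy line/column information from the node being replaced.
            return ast.copy_location(new_node, node)
        return node  # leave numbers, None, etc. untouched


tree = ast.parse("print('Hello World')")
tree = PrefixStrings().visit(tree)

# ast.unparse generates source code from the (modified) tree.
print(ast.unparse(tree))  # print('str: Hello World')
```

Returning a new node from a visit method is what replaces the old one in the tree. Returning the original node leaves it unchanged.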

The library can also be used to execute Python code directly from its source, as we do here:

parsed = ast.parse("print('Hello World')")
parsed = ast.fix_missing_locations(parsed)
exec(compile(parsed, '<string>', 'exec'))

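The fix_missing_locations function helps us when compiling, because the compiler expects lineno and col_offset attributes on every node that supports them. This function fills in those attributes where they are missing. Nodes produced by ast.parse already carry them, so the call matters mostly for nodes created by hand, for example inside a NodeTransformer. A great example of how the library works can be found on Eli Bendersky’s website.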
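To see why this matters, here is a small sketch in which a node is built by hand and inserted into a parsed tree. The source string and variable names are assumptions made for the example. The first compile attempt is wrapped in try/except because the exact error raised for a node without position information may depend on the Python version.

```python
import ast

tree = ast.parse("x = 1")

# Swap the constant 1 for a hand-built node. It has no lineno/col_offset yet.
assign = tree.body[0]
assign.value = ast.Constant(value=42)

try:
    compile(tree, "<string>", "exec")
    compiled_without_fix = True
except (TypeError, ValueError):
    # The compiler rejects the node because its location is missing.
    compiled_without_fix = False
print("compiled without fix:", compiled_without_fix)

# Fill in the missing positions using the values of the parent nodes.
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<string>", "exec"), namespace)
print(namespace["x"])  # 42
```

The new node simply inherits the position of its parent, which is enough for the compiler even if it is not the exact place where the literal would appear in real source.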

Going further

A more complex example of how the library works can be found on my personal GitHub account. In that example I follow the same premise, but with longer source code. The code has functions, with if, while, and try/except structures. The NodeVisitor class is extended to cover a wide range of functionality. The library is quite easy to work with once you understand what an abstract syntax tree is and what those trees are for.
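To give an idea of what extending NodeVisitor for those structures can look like, here is a short sketch. It is not the code from the GitHub repository: the sample source, the StructureVisitor class, and its found attribute are all made up for this illustration.

```python
import ast

# Hypothetical sample program with a function, while, if and try/except.
SOURCE = '''
def countdown(n):
    while n > 0:
        if n % 2 == 0:
            print(n)
        n -= 1
    try:
        return 1 / n
    except ZeroDivisionError:
        return 0
'''


class StructureVisitor(ast.NodeVisitor):
    """Records the constructs it meets, in the order they are visited."""

    def __init__(self):
        self.found = []

    def visit_FunctionDef(self, node):
        self.found.append("function " + node.name)
        self.generic_visit(node)  # keep descending into the function body

    def visit_While(self, node):
        self.found.append("while")
        self.generic_visit(node)

    def visit_If(self, node):
        self.found.append("if")
        self.generic_visit(node)

    def visit_Try(self, node):
        self.found.append("try")
        self.generic_visit(node)

    def visit_ExceptHandler(self, node):
        self.found.append("except")
        self.generic_visit(node)


visitor = StructureVisitor()
visitor.visit(ast.parse(SOURCE))
print(visitor.found)
# ['function countdown', 'while', 'if', 'try', 'except']
```

The call to generic_visit in each method is what keeps the traversal going. Once you override a visit method, the children of that node are only visited if you ask for it.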

Conclusion

I developed this example to check the suitability of the library for my 2018 GSoC project, which is mentored by JdeRobot. The project aims to translate Python code to Arduino code directly. The purpose of this workflow is to help kids who are learning robotics: they can write Python code, which is easier, and then take that code directly to the Arduino to run natively, without needing the Arduino to communicate with a computer while it’s running, or any of the other solutions we already have.

See you on the next post!
