SwiftSyntax is a library that provides a Swift abstraction on top of libSyntax, exposing a set of APIs that make it possible to visit, rewrite, and retrieve information from the syntactic structure of Swift source code.
In today’s article, we are going to explore the SwiftSyntax library to understand how it works and how we can use it to build tools that help us solve real problems.
So, let’s dive in …
Before we dive into SwiftSyntax, we need to understand, at least at a high level, a few things about the compiler pipeline.
The Swift compiler takes the Swift source code and hands it to the parser, a recursive-descent parser with an integrated hand-coded lexer, which tokenizes the source and builds an Abstract Syntax Tree (AST). Next comes Semantic Analysis (Sema), where the compiler takes the AST generated by the parser, checks it for semantic issues, and produces a type-checked AST. Then the Swift Intermediate Language Generation (SILGen) phase lowers the type-checked AST into what is called raw SIL. Mandatory passes, such as dataflow diagnostics, turn raw SIL into canonical SIL, which is then optimized (generic specialization, ARC optimizations, and so on) and handed to IRGen to generate the LLVM Intermediate Representation (IR). LLVM continues the job and generates the object files (.o), which the linker later glues together into the final binary for the target platform.
That is a brief, high-level overview of the compiler pipeline. Here we are going to focus on the AST, because in SwiftSyntax we basically work on a representation of it in the form of Syntax nodes. To follow the terminology in this post, we just need to know that an AST is a representation of the syntax of source code in the form of a tree structure.
Let’s see a simple example:
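A minimal sketch of such an example, assuming a hypothetical Person struct with one stored property and one method (the type and member names are illustrative only):

```swift
// Hypothetical sample type: one struct declaration that contains
// one variable declaration and one function declaration.
struct Person {
    var name: String

    func greeting() -> String {
        return "Hello, \(name)"
    }
}
```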
The AST representation of this simple example looks like this:
The image above represents the code sample in the form of a tree. Notice that it is the pre-type-checked AST: it carries only syntactic information and no semantic information yet. At the top we have a struct_decl, and inside it, as substructures, we have the var_decl and func_decl, which can contain their own substructure, and so on. We also have nodes such as brace_stmt, which represents a brace-delimited block.
SwiftSyntax provides this representation in the form of Syntax nodes. We can navigate through it using Visitors and change the structure using Rewriters (we are going to talk about them later in this post). Each kind of structure has a type that represents it as a syntax node, e.g. struct_decl is represented by the SwiftSyntax struct StructDeclSyntax.
Now that we know the basic concept of an AST, let’s talk about SwiftSyntax \o/
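The nesting described here can be sketched with a toy tree type. This is only an illustration: ToyNode is a hypothetical type, not the compiler's AST or a SwiftSyntax node, and the node kinds are simply the names mentioned in the text.

```swift
// Toy model of a syntax tree (hypothetical type, not the compiler's AST).
struct ToyNode {
    let kind: String
    let children: [ToyNode]

    init(_ kind: String, _ children: [ToyNode] = []) {
        self.kind = kind
        self.children = children
    }

    // Renders the node and its substructure, one node per line,
    // indenting two spaces per nesting level.
    func dump(indent: Int = 0) -> String {
        let line = String(repeating: " ", count: indent) + kind
        let childLines = children.map { $0.dump(indent: indent + 2) }
        return ([line] + childLines).joined(separator: "\n")
    }
}

// A struct declaration containing a variable and a function,
// where the function has a brace-delimited body.
let tree = ToyNode("struct_decl", [
    ToyNode("var_decl"),
    ToyNode("func_decl", [ToyNode("brace_stmt")]),
])
print(tree.dump())
```

Each level of indentation in the printed output corresponds to one level of substructure in the tree.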
From the libSyntax README:
This library implements data structures and algorithms for dealing with Swift syntax, striving to be safe, correct, and intuitive to use. The library emphasizes immutable, thread-safe data structures, full-fidelity representation of source, and facilities for structured editing.
In other words, it provides the basic building blocks we need to perform safe and reliable analysis and editing of the syntactic structure of Swift source, through a simple API.
We are not going to dive deep into how the source is represented internally in libSyntax; instead, we’ll focus on the high-level Swift API that SwiftSyntax provides to us as clients.
There is detailed documentation on this in the Internals section of the libSyntax README. There is also a section in Harlan Haskins's talk at try! Swift NYC 2017 about how libSyntax represents the syntax tree internally.
As a brief summary, libSyntax divides the representation of the AST into:
- Syntax: Syntax nodes, the representation exposed by the public API.
- RawSyntax: The internal, immutable backing store for all Syntax. It stores data such as the token kind and also represents the subtree structure.
- RawTokenSyntax: A special case of RawSyntax that represents all terminals in the grammar.
- Trivia: Represents all the parts of the syntax that have no semantic meaning to the source, e.g. whitespace, newlines, and comments.
- SyntaxData: Wraps RawSyntax nodes with some additional information: a pointer to a parent, the position in which the node occurs in its parent, and cached children.
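To see why keeping trivia matters for a full-fidelity representation, here is a self-contained toy model. ToyTrivia and ToyToken are hypothetical types written for this sketch, not libSyntax's real definitions; the point is only that tokens plus their attached trivia are enough to reproduce the source text exactly.

```swift
// Toy model (hypothetical types, not libSyntax's real definitions).
enum ToyTrivia {
    case spaces(Int)
    case newlines(Int)
    case lineComment(String)

    var text: String {
        switch self {
        case .spaces(let count): return String(repeating: " ", count: count)
        case .newlines(let count): return String(repeating: "\n", count: count)
        case .lineComment(let comment): return comment
        }
    }
}

// A terminal of the grammar: its text plus the trivia attached around it.
struct ToyToken {
    let text: String
    var leading: [ToyTrivia] = []
    var trailing: [ToyTrivia] = []

    // The exact source text this token covers, trivia included.
    var source: String {
        let before = leading.map { $0.text }.joined()
        let after = trailing.map { $0.text }.joined()
        return before + text + after
    }
}

let tokens = [
    ToyToken(text: "let", leading: [.lineComment("// answer"), .newlines(1)], trailing: [.spaces(1)]),
    ToyToken(text: "x", trailing: [.spaces(1)]),
    ToyToken(text: "=", trailing: [.spaces(1)]),
    ToyToken(text: "42"),
]

// Concatenating every token with its trivia gives back the original text.
let roundTripped = tokens.map { $0.source }.joined()
print(roundTripped)
```

Because comments and whitespace are attached to tokens instead of being thrown away, a tool can edit one node and print the rest of the file unchanged.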
Here we are going to look at a few of the high-level APIs that SwiftSyntax offers, along with some sample code showing how to use them.
SyntaxFactory provides a simple API for creating any Syntax node in a single line. So instead of having many constructors on each Syntax type, we use the factory class.
Every Syntax node has with methods that let us create a new node from an existing one with only the with part modified. Because Syntax nodes are immutable, calling one creates a new node with the same data as the node it was called on, except for the replaced part.
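The factory idea can be sketched with a toy model. Everything below (ToyToken, ToyStructDecl, ToyFactory and its make functions) is hypothetical and written only for illustration; the real SyntaxFactory works on SwiftSyntax's own node types and its exact signatures depend on the library version.

```swift
// Toy model (hypothetical types; the real SyntaxFactory API differs).
struct ToyToken {
    let kind: String
    let text: String
}

struct ToyStructDecl {
    let keyword: ToyToken
    let name: ToyToken

    var source: String { return keyword.text + " " + name.text + " {}" }
}

// One place that knows how to build every kind of node,
// so callers can create a node in a single line.
enum ToyFactory {
    static func makeStructKeyword() -> ToyToken {
        return ToyToken(kind: "structKeyword", text: "struct")
    }

    static func makeIdentifier(_ name: String) -> ToyToken {
        return ToyToken(kind: "identifier", text: name)
    }

    static func makeStructDecl(name: String) -> ToyStructDecl {
        return ToyStructDecl(keyword: makeStructKeyword(), name: makeIdentifier(name))
    }
}

let decl = ToyFactory.makeStructDecl(name: "Person")
print(decl.source)
```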
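A minimal sketch of that behaviour, using a hypothetical ToyFunctionDecl type instead of a real SwiftSyntax node (the method names withName and withReturnType are assumptions made for this example):

```swift
// Toy immutable node (hypothetical type, not a SwiftSyntax node).
struct ToyFunctionDecl {
    let name: String
    let returnType: String

    // Returns a copy that differs only in its name; the receiver is untouched.
    func withName(_ newName: String) -> ToyFunctionDecl {
        return ToyFunctionDecl(name: newName, returnType: returnType)
    }

    // Returns a copy that differs only in its return type.
    func withReturnType(_ newType: String) -> ToyFunctionDecl {
        return ToyFunctionDecl(name: name, returnType: newType)
    }

    var source: String { return "func \(name)() -> \(returnType)" }
}

let original = ToyFunctionDecl(name: "greeting", returnType: "String")
let renamed = original.withName("salutation")

print(original.source)  // the original node is unchanged
print(renamed.source)   // only the name differs
```

Since nothing is ever mutated in place, nodes can be shared safely between threads and between the old and new versions of a tree.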
With SyntaxVisitor, we can walk through the syntax tree. It is useful when we want to extract information to analyze the source.
Each visit method returns a continuation kind indicating whether to continue and visit the node's children in the syntax tree (SyntaxVisitorContinueKind.visitChildren) or skip them (SyntaxVisitorContinueKind.skipChildren).
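How the continuation kind steers the walk can be sketched with a self-contained toy model. ToyNode, ToyVisitor, ToyContinueKind and FunctionNameCollector are hypothetical types that only mirror the shape of the API described above; they are not the SwiftSyntax classes.

```swift
// Toy model (hypothetical types mirroring the shape of the visitor API).
enum ToyContinueKind {
    case visitChildren
    case skipChildren
}

struct ToyNode {
    let kind: String
    let name: String
    let children: [ToyNode]

    init(_ kind: String, _ name: String, _ children: [ToyNode] = []) {
        self.kind = kind
        self.name = name
        self.children = children
    }
}

class ToyVisitor {
    // Subclasses override this to inspect a node and decide whether to descend.
    func visit(_ node: ToyNode) -> ToyContinueKind {
        return .visitChildren
    }

    // Depth-first walk that honours the continuation kind.
    final func walk(_ node: ToyNode) {
        if visit(node) == .visitChildren {
            for child in node.children {
                walk(child)
            }
        }
    }
}

// Collects function names, but does not look inside function bodies.
final class FunctionNameCollector: ToyVisitor {
    private(set) var names: [String] = []

    override func visit(_ node: ToyNode) -> ToyContinueKind {
        guard node.kind == "func_decl" else { return .visitChildren }
        names.append(node.name)
        return .skipChildren
    }
}

let tree = ToyNode("struct_decl", "Person", [
    ToyNode("var_decl", "name"),
    ToyNode("func_decl", "greeting", [
        ToyNode("func_decl", "helper"),  // nested function, never visited
    ]),
])

let collector = FunctionNameCollector()
collector.walk(tree)
print(collector.names)
```

Returning skipChildren for a function declaration means the nested helper is never reached, so only the outer function's name is collected.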
The SyntaxRewriter allows us to modify the structure of the tree by just overriding the visit(some syntax …) method and returning the new node based on a rule.
Note: All the nodes are immutable, so we don’t modify a node, but instead, we create another (using a with API) and return it to replace the current one.
In the example above, we replace every string literal in our code with a 🐱.
There is a lot more in the public API, but I think these are the main APIs that allow us to get started with SwiftSyntax.
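The same idea can be sketched without the library, as a toy rewriter over a tiny expression tree. ToyExpr, ToyRewriter and CatRewriter are hypothetical types made up for this sketch and are not the SwiftSyntax rewriter API; they only show the pattern of overriding one visit method and returning a replacement node.

```swift
// Toy model (hypothetical types, not the SwiftSyntax rewriter API).
enum ToyExpr {
    case stringLiteral(String)
    case intLiteral(Int)
    case call(name: String, args: [ToyExpr])

    var source: String {
        switch self {
        case .stringLiteral(let value):
            return "\"" + value + "\""
        case .intLiteral(let value):
            return String(value)
        case .call(let name, let args):
            return name + "(" + args.map { $0.source }.joined(separator: ", ") + ")"
        }
    }
}

class ToyRewriter {
    // Subclasses override this to return a replacement for a string literal.
    func visitStringLiteral(_ value: String) -> ToyExpr {
        return .stringLiteral(value)
    }

    // Builds a new tree; the input tree is never mutated.
    final func rewrite(_ expr: ToyExpr) -> ToyExpr {
        switch expr {
        case .stringLiteral(let value):
            return visitStringLiteral(value)
        case .intLiteral:
            return expr
        case .call(let name, let args):
            return .call(name: name, args: args.map { rewrite($0) })
        }
    }
}

// Replaces every string literal with a cat, leaving everything else alone.
final class CatRewriter: ToyRewriter {
    override func visitStringLiteral(_ value: String) -> ToyExpr {
        return .stringLiteral("🐱")
    }
}

let before = ToyExpr.call(name: "print", args: [.stringLiteral("Hello"), .intLiteral(1)])
let after = CatRewriter().rewrite(before)
print(before.source)
print(after.source)
```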
There is a note in the SwiftSyntax README saying:
Note: SwiftSyntax is still in development, and the API is not guaranteed to be stable. It’s subject to change without warning.
Even so, we can already do cool things with it, and there are already a bunch of tools in use that are powered by SwiftSyntax. There are tools for code formatting and for detecting unused code, and even the Swift Stress Tester is built on top of it, among many others.
To wrap up this article, a shout-out to the teams and people involved in libSyntax and SwiftSyntax. Although it is still a work in progress, the possibilities of what we can do with it are amazing, so congrats to everyone involved.
That’s all for this article \o/
If you have any comments or questions, please let me know. Your feedback is really important for improving this and future posts, and it will be great to receive it :))
Thanks for reading :)
- SwiftSyntax Repo. https://github.com/apple/swift-syntax
- apple/swift/lib/Syntax Docs and examples. https://github.com/apple/swift/tree/master/lib/Syntax
- try! Swift NYC 2017 — Improving Swift Tools with libSyntax. https://www.youtube.com/watch?v=5ivuYGxW_3M
- Swift Forums. https://forums.swift.org