Swift AST written in Swift. Part 2 of ∞

Alexey Demedeckiy
Mar 14, 2017


In the previous part I built a basic top-level AST for the Swift language. Nothing spectacular yet: just some enums and strict data modelling.

Today I am going to dig a little deeper and replicate the Declaration section of the Swift grammar. Let’s look into it:

It looks just like the statements grammar, so let’s write a similar enum-style type union for it.

enum Declaration {
    case `import`(ImportDeclaration)
    case constant(ConstantDeclaration)
    case variable(VariableDeclaration)
    case `typealias`(TypealiasDeclaration)
    case function(FunctionDeclaration)
    case `enum`(EnumDeclaration)
    case `struct`(StructDeclaration)
    case `class`(ClassDeclaration)
    case `protocol`(ProtocolDeclaration)
    case initializer(InitializerDeclaration)
    case deinitializer(DeinitializerDeclaration)
    case `extension`(ExtensionDeclaration)
    case `subscript`(SubscriptDeclaration)
    case `operator`(OperatorDeclaration)
    case precedenceGroup(PrecedenceGroupDeclaration)
}

One definition from the grammar is omitted intentionally. Which one? declarations -> ... is never actually used in the grammar, so we don’t need it.
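
To see why the enum-style union is convenient, here is a minimal sketch of how a client might consume it. It uses a trimmed copy of the enum with only three cases, empty stand-in payload structs, and a hypothetical helper called kindName; none of these stand-ins come from the real grammar model.

```swift
// Empty stand-ins for the payload types (assumption: the real ones carry data).
struct ImportDeclaration {}
struct ConstantDeclaration {}
struct FunctionDeclaration {}

// A trimmed copy of the Declaration enum, limited to three cases for brevity.
enum Declaration {
    case `import`(ImportDeclaration)
    case constant(ConstantDeclaration)
    case function(FunctionDeclaration)
}

// Hypothetical helper: returns a readable name for the kind of declaration.
// The switch has no default branch, so the compiler checks that every case is handled.
func kindName(of declaration: Declaration) -> String {
    switch declaration {
    case .import: return "import"
    case .constant: return "constant"
    case .function: return "function"
    }
}

let node = Declaration.import(ImportDeclaration())
print(kindName(of: node))
```

Because the switch is exhaustive, adding a new case to the enum later turns every unhandled switch into a compile error, which is exactly the kind of strictness we want from an AST.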

Import

The import grammar is our first non-trivial grammar. It is not as simple as it may look, mostly because of a little-known feature: importing not a whole module but a single definition from it. Here is the grammar:

The first line tells us that an import declaration consists of an optional attribute list, the keyword, an optional kind specifier, and a path. This time we cannot model import-declaration as a type union. We need a value of each type to form a correct instance. This is usually called a product type.

In Swift we can model product types in several ways:

  1. Tuples. These are really ad hoc product types. Easy to write, hard to maintain. I find tuple support in Xcode a bit lacking, so we will look at other options.
  2. Class. The usual way to model product types in many OOP languages. The problems that come with classes (inheritance, behavior, and mutability) make this option nearly a no-go for us.
  3. Struct. The good old struct / record. A simple yet powerful way to model product types. A perfect fit for our case.
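
The difference between the tuple and struct options can be sketched roughly like this. SimpleImport and its String-based fields are simplified, hypothetical stand-ins used only for illustration, not the types we are about to define.

```swift
// Tuple: quick to write, but the shape has no name of its own and cannot be extended.
let asTuple: (kind: String?, path: [String]) = (kind: "struct", path: ["Foundation", "Date"])

// Struct: the same data, but with a named type that can gain extensions and conformances.
struct SimpleImport {
    let kind: String?
    let path: [String]
}

extension SimpleImport: CustomStringConvertible {
    // Renders the value roughly the way it would be written in source code.
    var description: String {
        let prefix = kind.map { "import \($0) " } ?? "import "
        return prefix + path.joined(separator: ".")
    }
}

let asStruct = SimpleImport(kind: asTuple.kind, path: asTuple.path)
print(asStruct)
```

Both values hold the same two fields, but only the struct gives us a place to hang behavior such as the description above.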

The struct that models this grammar looks like this:

struct ImportDeclaration {
    let attributes: [Attribute]
    let importKind: ImportKind?
    let importPath: ImportPath
}

ImportKind is a new case for us. No more references to other grammar parts! This kind of rule is often called a “terminal”. In Swift we can model it as an enum whose cases carry no associated values:

enum ImportKind {
    case `typealias`
    case `struct`
    case `class`
    case `enum`
    case `protocol`
    case `var`
    case `func`
}

All cases of this enum are reserved keywords in Swift itself, so we need to wrap them in backticks (`).

An interesting fact about ImportKind: you cannot import individual operators from a module.
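
Since every case name matches a keyword exactly, one possible extension is to give the enum String raw values, so a parser can turn a keyword token into an ImportKind directly. The String backing is my assumption here and is not part of the model above; this is only a sketch.

```swift
// Same cases as above, with String raw values added (assumption for illustration).
// The implicit raw value of a backticked case is the bare keyword, without backticks.
enum ImportKind: String {
    case `typealias`
    case `struct`
    case `class`
    case `enum`
    case `protocol`
    case `var`
    case `func`
}

// A keyword from the list maps to a case; anything else yields nil.
let parsed = ImportKind(rawValue: "struct")
let rejected = ImportKind(rawValue: "operator")

print(parsed == ImportKind.struct)
print(rejected == nil)
```

The failable initializer also reflects the fact mentioned above: "operator" is simply not a valid import kind.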

The last part of the declaration is an ImportPath. A path is just a recursive list of identifiers joined by the . symbol. How can we represent a recursive list in Swift? We cannot go with a plain Array: an array can be empty, and we have a strict requirement of at least one element. Write our own list? Recursive data structures are not fun to create. Sometimes we cannot avoid it, but usually we can. In this case we can model the path as a pair of the first element and the rest of the elements!

struct ImportPath {
    let head: ImportPathIdentifier
    let tail: [ImportPathIdentifier]
}

Does it solve the problem? Yes. Can we do better? Yes!

The problem is that we force all our clients to think about this limitation and treat head and tail separately. Let’s simplify life for them:

extension ImportPath {
    var identifiers: [ImportPathIdentifier] {
        return [head] + tail
    }
}

Now everybody can use identifiers without thinking about our implementation detail. But do we need to keep head and tail in our structure at all? If we want API clients to use identifiers, let’s store the data there. Thankfully, Swift allows us to replace the default memberwise initializer:

struct ImportPath {
    let identifiers: [ImportPathIdentifier]
    // Requiring a head guarantees that the path is never empty.
    init(head: ImportPathIdentifier, tail: [ImportPathIdentifier]) {
        identifiers = [head] + tail
    }
}

Each identifier can be either a regular identifier or an operator. Wait. An operator? It looks like the Swift grammar allows operators as names in an import path. I have never seen this in the wild. However, I don’t want to go too deep and implement identifiers today. We already have enough for a simple import declaration.
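
To try the path type in isolation, here is a small sketch with a placeholder identifier. The String-backed ImportPathIdentifier, its text property, the default value for tail, and the dotted helper are all my assumptions for illustration; the real identifier model is left for a later part.

```swift
// Placeholder identifier (assumption): either a regular name or an operator symbol.
enum ImportPathIdentifier {
    case identifier(String)
    case `operator`(String)

    // Hypothetical accessor for the underlying source text.
    var text: String {
        switch self {
        case .identifier(let name): return name
        case .operator(let symbol): return symbol
        }
    }
}

struct ImportPath {
    let identifiers: [ImportPathIdentifier]

    // Requiring a head makes an empty path impossible to construct.
    // The default empty tail is an addition for convenience.
    init(head: ImportPathIdentifier, tail: [ImportPathIdentifier] = []) {
        identifiers = [head] + tail
    }
}

extension ImportPath {
    // Hypothetical helper: joins the identifiers the way they appear in source.
    var dotted: String {
        return identifiers.map { $0.text }.joined(separator: ".")
    }
}

let path = ImportPath(head: .identifier("Foundation"), tail: [.identifier("Date")])
print(path.dotted)
```

Clients only ever see a plain array of identifiers, while the initializer keeps the "at least one element" rule enforced at compile time.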

Typealias

The second grammar for today: typealias. Pretty simple after import:

We can see that the grammar authors decided to add some indirection for the typealias-name and typealias-assignment fields. I will take the risk and model it all as a single struct:

struct TypealiasDeclaration {
    let attributes: [Attribute]
    let accessLevel: AccessLevelModifier?
    let name: Identifier
    let genericArguments: GenericArgumentClause?
    let type: Type
}

A lot of new types here! As you can see, the rabbit hole is really deep. To keep track of things I will cover the grammar layer by layer, rather than a whole branch at once.

Looks like my writing time ran out half an hour ago :) In the next article I will try to cover let and var declarations. You can look at the source from this article in this gist: https://gist.github.com/AlexeyDemedetskiy/42bbb19b18c28bf482b6ec7e07718182
