Functional Parsers in Swift

Markus Kasperczyk
4 min readOct 22, 2023

--

Ever wanted to just (basically) write your context free grammars in Swift code and immediately parse them into a custom defined AST?

With my new Parser library, you can!

Example Grammar

Consider the following grammar:

   expr     -> term (('+' | '-') expr)?
term -> baseExpr (('*' | '/') term)?
baseExpr -> number | '(' expr ')'
number -> <defined using regex builder as en-US localized double>

This encodes basic arithmetic with doubles, including operator precedence. What we want to achieve is to take a string like

"6*(3+4)"

and spit out the correct result, in this case 42.

For this, we first define an Abstract Syntax Tree (AST):


// MARK: GRAMMAR

/*
expr -> term (('+' | '-') expr)?
term -> baseExpr (('*' | '/') term)?
baseExpr -> number | '(' expr ')'
number -> <defined using regex builder as en-US localized double>
*/

// MARK: EXPR

indirect public enum ExprAST {
case expr(TermAST, (Add, ExprAST)?)
}

public enum Add {
case plus
case minus
}

// MARK: TERM

indirect public enum TermAST {
case term(BaseExprAST, (Mul, TermAST)?)
}

public enum Mul {
case times
case divided
}

// MARK: BASE EXPR

public enum BaseExprAST {
case number(Double)
case parens(ExprAST)
}

Notice how the AST resembles the grammar. It’s basically just a representation of the grammar as a Swift type. It’s therefore quite simple to already write an interpreter for it:


// MARK: EXPR

extension ExprAST {
func run() -> Double {
switch self {
case .expr(let termAST, let optional):
guard let (op, expr) = optional else {
return termAST.run()
}
switch op {
case .plus:
return termAST.run() + expr.run()
case .minus:
return termAST.run() - expr.run()
}
}
}
}

// MARK: TERM

extension TermAST {
func run() -> Double {
switch self {
case .term(let baseExprAST, let optional):
guard let (op, term) = optional else {
return baseExprAST.run()
}
switch op {
case .times:
return baseExprAST.run() * term.run()
case .divided:
return baseExprAST.run() / term.run()
}
}
}
}

// MARK: BASE EXPR

extension BaseExprAST {
func run() -> Double {
switch self {
case .number(let double):
double
case .parens(let exprAST):
exprAST.run()
}
}
}

Great! If we had an instance of Expr encoding our computation, we could already return a Double.

However, writing an Expr in Swift is quite annoying. Also, our goal was to parse it from a String. Therefore, it is now time to actually write the promised Parsers.

For this, we need to include the “Parser” dependency to our Swift package:

...
dependencies: [.package(url: "git@github.com:AnarchoSystems/Parser.git",
branch: "main")],
...

And now, let’s go ahead and write our parsers:


import Parser
import RegexBuilder

// MARK: EXPR

public struct Expr : ParserWrapper {

@ParserBuilder
public var body: some Parser<Substring, (TermAST, (Add, ExprAST)?)> {
Term()
add-?
}

public func transform(_ bodyResult: (TermAST, (Add, ExprAST)?)) -> ExprAST {
.expr(bodyResult.0, bodyResult.1)
}

@ParserBuilder
var add : some Parser<Substring, (Add, ExprAST)> {
Exactly("+").output(Add.plus) | Exactly("-").output(Add.minus)
Expr()
}

}

Don’t worry, at this point, the code is not supposed to compile, as we haven’t defined Term yet. Before doing this, I wanted to take a break and point out what exactly is going on.

We define a ParserWrapper Expr with a required body property, which is a Parser<Substring, (TermAST, (Add, ExprAST)?)>.

The Substring tells you that the parser will operate on Substrings (and on Strings via a convenience method).

The (TermAST, (Add, ExprAST)?) tells you what the Substring will be parsed into. It should look familiar, because it’s exactly the associated types of ExprAST.

Since (TermAST, (Add, ExprAST)?) is not handy to work with, this ParserWrapper defines a transform method that transforms the output of the body property into an ExprAST. This is an optional step. The default implementation of the transform method assumes that body already produces the final output.

Finally, we have a custom property called “add” which is a Parser<Substring, (Add, ExprAST)> — this outputs exactly our “(Add, ExprAST)?” except without the “?”.

Now, let’s look at how the properties are actually implemented.

@ParserBuilder
public var body: some Parser<Substring, (TermAST, (Add, ExprAST)?)> {
Term()
add-?
}

The -? is a postfix operator indicating that there can be zero or one of the previous pattern (Swift unfortunately doesn’t allow ? to be a postfix operator). Compare this with our grammar:

expr     -> term (('+' | '-') expr)?

This is exactly the same! Except that we wrapped “term ((‘+’ | ‘-’) expr)” into the additional property “add”:

@ParserBuilder
var add : some Parser<Substring, (Add, ExprAST)> {
Exactly("+").output(Add.plus) | Exactly("-").output(Add.minus)
Expr()
}

Some mapping to get the output type right aside, this is once again exactly the same as in our grammar.

Now let’s define the other parsers!

// MARK: TERM

public struct Term : ParserWrapper {

@ParserBuilder
public var body: some Parser<Substring, (BaseExprAST, (Mul, TermAST)?)> {
BaseExpr()
mul-?
}

public func transform(_ bodyResult: (BaseExprAST, (Mul, TermAST)?))
-> TermAST {
.term(bodyResult.0, bodyResult.1)
}

@ParserBuilder
var mul : some Parser<Substring, (Mul, TermAST)> {
Exactly("*").output(Mul.times) | Exactly("/").output(Mul.divided)
Term()
}

}

// MARK: BASE EXPR

public struct BaseExpr : ParserWrapper {

public var body : some Parser<Substring, BaseExprAST> {
Number().map(BaseExprAST.number) | parens.map{_, ast, _ in .parens(ast)}
}

@ParserBuilder
var parens : some Parser<Substring, (String, ExprAST, String)> {
Exactly("(")
Expr()
Exactly(")")
}

}

// MARK: NUMBER

public struct Number : ParserWrapper {

public var body: some Parser<Substring, (Substring, Double)> {
Regex {
TryCapture(.localizedDouble(locale:
.init(languageCode: .english, languageRegion: .unitedStates))) {match in
Double(match)
}
}
}

public func transform(_ bodyResult: (Substring, Double)) -> Double {
bodyResult.1
}

}

Once again some mapping aside, most of the parsers are exactly what the grammar says. The only exception is the Number parser, which is just a regular expression built using Regex builder.

This is it! Now, we can write a simple XCTestCase:


final class MyLangTests: XCTestCase {

func testMyLang() {

var run = false
for (ast, tail) in Expr().parse("6*(3+4)") {
XCTAssertEqual(tail.count, 0)
XCTAssertEqual(ast.run(), 42)
run = true
}
XCTAssert(run)

}

}

And it just works!

Happy coding!

Final note: Do stand with Palestine, not with Apartheid! Apartheid is ultimately responsible for all civilian casualties in the ongoing conflict!

--

--