Implementing a Programming Language in Swift — Part 6: Parsing Variables
NOTE: This is the sixth part in a tutorial series on “Writing a Programming Language in Swift.” Be sure to check out the previous ones.
In the previous tutorials, we created our first interpreter. Now it’s time to pimp it up with some real programming language features, starting with support for variables.
For variable support our language needs two additional features:
- Support for parsing variable names
- Support for declaring variables
This tutorial is all about number one (“Support for parsing variable names”).
Parsing Variable Names
This feature introduces two new error cases, expectedIdentifier and notDefined. Let’s start by adding them to our Parser.Error enum:
enum Error: Swift.Error {
    case expectedNumber
    case expectedIdentifier
    case expectedOperator
    case expectedExpression
    case expected(String)
    case notDefined(String)
}
Adding support for parsing variables is fairly straightforward; only a few simple steps are needed.
First, we need to add support for variable names in our Lexer. All we have to do is add an identifier case to our Token enum, as well as a generator for it:
enum Token {
    typealias Generator = (String) -> Token?

    case op(Operator)
    case number(Float)
    case parensOpen
    case parensClose
    case identifier(String)

    static var generators: [String: Generator] {
        return [
            "\\*|\\/|\\+|\\-": { .op(Operator(rawValue: $0)!) },
            "\\-?([0-9]*\\.[0-9]+|[0-9]+)": { .number(Float($0)!) },
            "\\(": { _ in .parensOpen },
            "\\)": { _ in .parensClose },
            "[a-zA-Z_$][a-zA-Z_$0-9]*": { .identifier($0) },
        ]
    }
}
Don’t be afraid of the complexity of this regular expression; it’s actually very simple.
The first bit, [a-zA-Z_$], states that an identifier must start with a letter, an underscore, or a dollar sign.
The second bit, [a-zA-Z_$0-9]*, states that the first character is followed by any sequence (* means zero or more) of letters, underscores, dollar signs, or digits.
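To make the pattern concrete, here is a minimal sketch that checks a few strings against it. The isIdentifier helper is hypothetical and not part of the series’ Lexer; the ^ and $ anchors are added here only so that the whole string has to match.

```swift
import Foundation

// Hypothetical helper, not part of the tutorial's Lexer: returns true when
// the entire string matches the identifier pattern discussed above.
func isIdentifier(_ text: String) -> Bool {
    // Same character classes as the generator's pattern, wrapped in ^...$
    // (an assumption of this sketch) to require a full-string match.
    let pattern = "^[a-zA-Z_$][a-zA-Z_$0-9]*$"
    return text.range(of: pattern, options: .regularExpression) != nil
}

print(isIdentifier("PI"))       // true
print(isIdentifier("_count2"))  // true: digits are allowed after the first character
print(isIdentifier("$0"))       // true: a dollar sign may start an identifier
print(isIdentifier("2fast"))    // false: a digit cannot come first
print(isIdentifier(""))         // false: at least one character is required
```

Note that an identifier can contain digits but never start with one, which keeps identifiers distinguishable from number literals.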
Great, now it’s time to edit our Parser. Essentially, identifiers represent values, so we will edit our Parser’s parseValue method to support identifiers:
func parseValue() throws -> Node {
    switch peek() {
    case .number:
        return try parseNumber()
    case .parensOpen:
        return try parseParens()
    case .identifier:
        return try parseIdentifier()
    default:
        throw Error.expected("<Expression>")
    }
}
Currently there is no method called parseIdentifier, so we have to add it to our parser as well. Its implementation will be very similar to parseNumber:
func parseIdentifier() throws -> Node {
    guard case let .identifier(name) = popToken() else {
        throw Error.expectedIdentifier
    }
    return name
}
As you can see, we are still getting a compiler error. This is because we are expected to return a Node, but instead we are returning name, a String.
To fix this, we should make String conform to the Node protocol.
The Node protocol requires us to implement the interpret method, which returns a Float.
For variable names, this method should check whether the variable is defined. If it’s not, we throw an error; if it is, we simply return its value.
To look up variables, we first need somewhere to store them. A simple dictionary should do the trick:
var identifiers: [String: Float] = [
    "PI": Float.pi,
]
I’ve added the variable “PI” just for the sake of testing, as we haven’t yet added support for variable declaration (the subject of the next tutorial).
Great! We are now ready to make String conform to Node:
extension String: Node {
    func interpret() throws -> Float {
        guard let value = identifiers[self] else {
            throw Parser.Error.notDefined(self)
        }
        return value
    }
}
That’s it! We are now able to parse variables:
let code = "50 * PI"
let tokens = Lexer(code: code).tokens
let output = try Parser(tokens: tokens).parse().interpret()
print(output == (50 * Float.pi)) // true
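The example above exercises the success path only. As a rough, self-contained sketch of both outcomes of the lookup, the snippet below redeclares just enough to run on its own. The Node protocol shape is assumed from earlier parts of the series, and LookupError is a hypothetical stand-in for Parser.Error, reduced to the one case needed here.

```swift
// Assumed shape of the Node protocol from earlier parts of the series.
protocol Node {
    func interpret() throws -> Float
}

// Hypothetical stand-in for Parser.Error, reduced to a single case.
enum LookupError: Error, Equatable {
    case notDefined(String)
}

// Declared with let here because this sketch never mutates the table.
let identifiers: [String: Float] = [
    "PI": Float.pi,
]

extension String: Node {
    func interpret() throws -> Float {
        // Look the name up; an unknown name is an error, not a default value.
        guard let value = identifiers[self] else {
            throw LookupError.notDefined(self)
        }
        return value
    }
}

// A defined name evaluates to its stored value.
let pi = try "PI".interpret()
print(pi == Float.pi) // true

// An undefined name throws instead of silently producing a number.
do {
    _ = try "TAU".interpret()
} catch LookupError.notDefined(let name) {
    print("not defined: \(name)") // not defined: TAU
} catch {
    print("unexpected error: \(error)")
}
```

Throwing on an unknown name, rather than falling back to something like 0, means a typo in a variable name surfaces as an error instead of a wrong result.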
Next up: variable declaration!
Stay tuned!
I hope you guys found this tutorial interesting, and as always, don’t be shy about giving feedback, e.g. by clapping or writing a response.
P.S. Remember that you can follow me here on Medium for notifications on future tutorials.