Writing a scripting language

For a long while I have wanted to improve my workflow when using procedurally generated content in my games. I often get tied up with implementation details rather than designing good algorithms. Recently I finally created a tool to help solve some of these issues: a custom scripting language that generates a grid of tiles, which can then be used to create objects in-game.

Note: The implementation/example code is given in C#, but should hopefully be clear enough to implement in other languages. If there is any confusion please let me know!

# basic roguelike rooms
# start from the center
Center
# build a starting room
Rect cursor 1 {3,6} {3,6}
loop {6,8}
# dig a corridor to the next room.
# this moves the cursor to the new position
Digger cursor 1 10 0.9 0.2
Rect cursor 1 {3,6} {3,6}
endloop
# outline the tiles with a wall
Expand8 1 2

It’s not so bad, apparently

If you had asked me a month ago about writing a language, I would have said it’s a ridiculous endeavour and that I’d struggle to get it done. It just seemed so daunting, with little benefit compared to the work involved.

My intention was just to have a list of instructions that execute one after another, but after tinkering with ideas, I slowly realised what I was actually making, and it all started to seem less scary after all.

Starting with the basics

The first thing I did was to write up my basic ideas about how it should work, both to get some advice and to think about the design at a higher, conceptual level. I knew what kind of functions I wanted, and writing them down helped me think about them much more clearly. The write-up included an example script.

The idea was simple:

  • Read each line of text, and execute a function if the first word is a recognised instruction
  • Ignore any line that isn’t recognised, to allow for comments and blank lines
  • Replace {a,b} with a random number between a and b
  • Repeat code between loop and endloop blocks, the specified number of times
  • Execute specific lines of code with the execline function, as a simple “function” mechanism

It would run something like this:

This seemed like a simple enough concept for me to work with, but I was really unhappy with the execline function. I wanted it because I thought it would solve small code duplication problems (I could just change line 5 here and all the rooms would be consistent), but in practice it barely solved any problems, and meant I had to keep track of line numbers when *writing* the script. I wanted writing scripts to be as simple as possible.

Enter GOTO, RETURN, and LABEL

I didn’t initially want to touch these functions, because I had assumed they’d be too complex to work with. I wanted to make something really simple to implement, and not deal with “real” programming language features as much as possible.

In practice, however, the idea was simple: first run through the script to find all the labels, and save the line each one is on. Then, when you hit a goto, store your current position and jump to the label’s position. When you hit return, go back to where you last hit a goto.

The complexity I had imagined turned out to just be storing line numbers, which I already had to do with the loop function anyway. This seemed a lot simpler to tackle, and opened up a lot of potential for the scripts.
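
To make the label pass and the jump concrete, here is a minimal sketch of the idea described above. It is not the original implementation: the class name GotoSketch, the trace list, and the use of a stack for return positions (so that a goto inside a goto still returns correctly) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GotoSketch
{
    // Runs a script that only understands label, goto, return and end.
    // Every other non-empty line is recorded in the returned trace so the
    // order of execution can be inspected.
    public static List<string> Run(string script)
    {
        string[] lines = script.Split('\n');

        // First pass: remember the line index of every label.
        var labels = new Dictionary<string, int>();
        for (int i = 0; i < lines.Length; i++)
        {
            string[] parts = lines[i].Trim().Split(' ');
            if (parts[0] == "label" && parts.Length > 1)
            {
                labels[parts[1]] = i;
            }
        }

        // Second pass: execute, keeping a stack of positions to return to.
        // (The stack is an assumption; a single saved position also works
        // if gotos are never nested.)
        var returnStack = new Stack<int>();
        var trace = new List<string>();
        int lineIndex = 0;
        while (lineIndex < lines.Length)
        {
            string[] parts = lines[lineIndex].Trim().Split(' ');
            string instruction = parts[0];

            if (instruction == "goto" && parts.Length > 1 && labels.ContainsKey(parts[1]))
            {
                returnStack.Push(lineIndex);   // store the current position
                lineIndex = labels[parts[1]];  // jump to the label's line
            }
            else if (instruction == "return" && returnStack.Count > 0)
            {
                lineIndex = returnStack.Pop(); // back to the last goto
            }
            else if (instruction == "end")
            {
                break;
            }
            else if (instruction != "label" && instruction != "")
            {
                trace.Add(lines[lineIndex].Trim());
            }

            lineIndex += 1; // step past the goto, label, or ordinary line
        }
        return trace;
    }
}
```

Because lineIndex is incremented after every instruction, jumping to a label resumes on the line after it, and returning resumes on the line after the goto. One caveat of the stack: a script that uses goto as a plain jump and never returns will leave entries on it, which is harmless for short scripts.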

OK, but how does it *work*

When writing these things up and designing them, I had one simple concept for how the scripts would actually be run: it reads a line, does a thing, then moves to the next line. A list of instructions, just like a CPU would read.

First, the interpreter needs to take the script and split it up into lines:

string[] lines = script.Split('\n');

Now I have each line of code as a separate string, in the lines array. Since I can now access each line, I can run through it like this:

int lineIndex = 0;
while (lineIndex < lines.Length) {
    // Do something with lines[lineIndex]
    lineIndex += 1;
}

Now I have it reading each line of the script individually, and I can do something with it.

A line of code in my scripting language looks something like this:

instruction arg1 arg2 arg3...

Here, a keyword at the start identifies what the line does, followed by a series of arguments that control how it does it. To make this easier to parse, we can once again use String.Split():

string[] thisLine = lines[lineIndex].Split(' ');
string instruction = thisLine[0];
// thisLine[1] to thisLine[whatever] are the arguments
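
Splitting on a single space breaks as soon as a line contains two spaces in a row, a tab, or a trailing carriage return from Windows line endings. Here is a hedged sketch of a more forgiving version that uses a regular expression to treat any run of whitespace as one separator. The Tokenizer class and Tokenize method are hypothetical names, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Tokenizer
{
    // Splits a script line into its instruction and arguments, treating any
    // run of spaces or tabs as a single separator. Returns an empty array
    // for blank lines.
    public static string[] Tokenize(string line)
    {
        // Trim() also removes a trailing '\r' left over from "\r\n" endings.
        string trimmed = line.Trim();
        if (trimmed.Length == 0)
        {
            return new string[0];
        }
        return Regex.Split(trimmed, @"\s+");
    }
}
```

Note that this still assumes a random parameter such as {3,6} is written without spaces inside the braces; with a space it would be split into two tokens.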

So executing functions just means reading the instruction and calling functions depending on what it is:

// Inside the main loop:
if (instruction == "Rect") {
    Rect(thisLine[1], thisLine[2], thisLine[3]);
}

// Elsewhere in the class:
void Rect (string tile, string width, string height) {
    // Fill in the rectangle of tiles here
}

This is roughly what my first implementation looked like; I didn’t want to deal with all the loop/return/goto things yet, just getting the basic thing working.
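
The chain of if-statements grows with every new instruction. The addendum mentions using Reflection to call functions by name instead. Here is a minimal sketch of that idea, not the original code: the Commands and Dispatcher classes, the Log list, and the choice to pass every argument as a string are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class Commands
{
    public List<string> Log = new List<string>();

    // Hypothetical instruction implementations; each takes string arguments
    // and just records the call so the dispatch can be observed.
    public void Rect(string tile, string width, string height)
    {
        Log.Add("Rect " + tile + " " + width + " " + height);
    }

    public void Center()
    {
        Log.Add("Center");
    }
}

public static class Dispatcher
{
    // Looks up a public method on Commands named after the instruction and
    // calls it with the remaining words. Returns false when nothing matches,
    // so comments and blank lines are simply ignored.
    public static bool Execute(Commands target, string[] thisLine)
    {
        if (thisLine.Length == 0 || thisLine[0].Length == 0)
        {
            return false;
        }

        // DeclaredOnly stops inherited methods such as ToString from being
        // callable from a script.
        var method = typeof(Commands).GetMethod(
            thisLine[0],
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly);
        if (method == null || method.GetParameters().Length != thisLine.Length - 1)
        {
            return false;
        }

        object[] args = new object[thisLine.Length - 1];
        Array.Copy(thisLine, 1, args, 0, args.Length);
        method.Invoke(target, args);
        return true;
    }
}
```

With this in place, adding an instruction means adding a method to Commands, with no change to the main loop. The sketch assumes no two instructions share a name, since GetMethod throws on ambiguous overloads.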

Adding features

The next step for me was two important features: storing/using variables, and the {a,b} random parameter. The way this works is that wherever an instruction accepts a number, it can also accept the name of a variable or a random parameter.

The first step is giving the user the ability to store variables:

Dictionary<string, int> vars = new Dictionary<string, int>();

if (instruction == "set") {
    // Use the indexer rather than Add(), which throws if the variable already exists
    vars[thisLine[1]] = int.Parse(thisLine[2]);
}

So we now store a list of variables that the user declares. Now we need a function to take a string, and see if it is either a number, a variable, or a random parameter:

int ReadVar(string value) {
    // Try and read an integer
    if (int.TryParse(value, out int number)) {
        return number;
    }

    // Try and read a random parameter
    if (value.Contains("{")) {
        return GetRandom(value);
    }

    // Try and read a variable
    if (vars.ContainsKey(value)) {
        return vars[value];
    }

    // Couldn't read the value!
    return -1;
}
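
ReadVar relies on a GetRandom function that isn't shown. Here is one possible sketch of it, under stated assumptions: the RandomParameter class name is hypothetical, the text is assumed to have the exact form {a,b}, and both ends of the range are treated as inclusive, which the article does not specify.

```csharp
using System;

public static class RandomParameter
{
    private static readonly Random rng = new Random();

    // Parses text of the form {a,b} and returns a random integer from a to b.
    // Treating both ends as inclusive is an assumption.
    public static int GetRandom(string value)
    {
        string inner = value.Trim().TrimStart('{').TrimEnd('}');
        string[] bounds = inner.Split(',');
        if (bounds.Length != 2)
        {
            throw new FormatException("Expected {a,b} but got: " + value);
        }

        int min = int.Parse(bounds[0]);
        int max = int.Parse(bounds[1]);
        if (min > max)
        {
            // Accept the bounds in either order.
            int swap = min;
            min = max;
            max = swap;
        }

        // Random.Next excludes its upper bound, so add 1 to include max.
        return rng.Next(min, max + 1);
    }
}
```

Since the value is re-rolled every time the line is read, an instruction inside a loop gets a fresh random number on each pass, which is what makes Rect cursor 1 {3,6} {3,6} produce rooms of different sizes.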

Now we can turn any of these kinds of argument into an integer, and so we can update the Rect code to look like this:

if (instruction == "Rect") {
    Rect(ReadVar(thisLine[1]),
         ReadVar(thisLine[2]),
         ReadVar(thisLine[3]));
}

void Rect(int tile, int width, int height) {
}

It works!

We have a scripting language! At this point, we can now parse and execute code, and start playing around with the scripts themselves.

My hope with this article is to dispel the idea that writing a scripting language has to be daunting, which is how it seemed to me before I tried it myself. If you have any questions, or would like more explanation about anything, please do drop me a line at @jazzmickle on Twitter.

Thank you so much for reading! If you enjoyed this article and would like to support my work, I have a Patreon! It really helps me do more work like this. 💜


# mountain generator
set height 10
# create some peaks
loop {5,8}
setpos cursor {1,30} {1,30}
Tile cursor height
endloop
label build
set nh height
minus nh 1
Noise height nh 0.4
Expand4 height nh
minus height 1
if height == 1
end
goto build

Addendum

Some ideas for expansion and little bits and pieces I missed out of the article for the sake of simplicity/conciseness:

  • The loop function in my language works like a label read at parse time; it stores the line where it last hit a loop, and the number of loops. endloop reduces that value, and if it is greater than 0, returns to the loop point.
  • I also had two other types than just integers in my scripting: positions and floats. Floats were always just 0.0–1.0, and positions were Vector2Int (a Unity type). This added some complexity to reading arguments.
  • In my own implementation, I used Reflection to call functions by name, instead of manually adding an if-statement for each command.
  • Speaking of if-statements, I didn’t cover them here. The basic idea is that an if checks the condition in its arguments and, when the condition is false, skips the line that follows. One line is enough room to fit a goto, which lets the script execute different blocks of code.
  • Another useful feature, especially with if and goto, is end, to end execution immediately.
  • Regex is much more useful than just using String.Split(). I went with the latter for simplicity, but using regex can make it much more robust. (For example, the syntax breaks if the script uses two spaces in a row)
  • Mathematical functions in my language take the form add var amount, rather than the typical var += amount, so that they fit the instruction args... format. Again, this is very similar to how assembly works.
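
To tie the loop, if, and end notes together, here is a small sketch of how they might manipulate the line index. It is an illustration under assumptions rather than the original interpreter: the ControlFlowSketch class and Read helper are hypothetical, only the == comparison is handled, and the skip-on-false behaviour of if is inferred from the mountain generator script above.

```csharp
using System;
using System.Collections.Generic;

public static class ControlFlowSketch
{
    // Runs a script that understands set, minus, loop, endloop, if and end.
    // Returns the variables so the result can be inspected.
    public static Dictionary<string, int> Run(string script)
    {
        string[] lines = script.Split('\n');
        var vars = new Dictionary<string, int>();
        int loopStart = -1;     // line of the most recent loop instruction
        int loopsRemaining = 0; // how many passes are still to run

        int lineIndex = 0;
        while (lineIndex < lines.Length)
        {
            string[] parts = lines[lineIndex].Trim().Split(' ');
            string instruction = parts[0];

            if (instruction == "set")
            {
                vars[parts[1]] = Read(parts[2], vars);
            }
            else if (instruction == "minus")
            {
                vars[parts[1]] = Read(parts[1], vars) - Read(parts[2], vars);
            }
            else if (instruction == "loop")
            {
                loopStart = lineIndex;
                loopsRemaining = Read(parts[1], vars);
            }
            else if (instruction == "endloop")
            {
                loopsRemaining -= 1;
                if (loopsRemaining > 0)
                {
                    lineIndex = loopStart; // the += 1 below lands on the loop body
                }
            }
            else if (instruction == "if")
            {
                // Assumed form: if <left> == <right>. When the comparison
                // fails, skip the line that follows.
                bool holds = parts[2] == "==" && Read(parts[1], vars) == Read(parts[3], vars);
                if (!holds)
                {
                    lineIndex += 1;
                }
            }
            else if (instruction == "end")
            {
                break; // stop executing immediately
            }

            lineIndex += 1;
        }
        return vars;
    }

    // Reads a literal integer, or else looks the text up as a variable name.
    private static int Read(string value, Dictionary<string, int> vars)
    {
        if (int.TryParse(value, out int number))
        {
            return number;
        }
        return vars[value];
    }
}
```

Because only one loop position is stored, loops cannot be nested in this sketch; supporting that would need a stack of loop positions and counts. The body of a loop also always runs at least once, since the count is only checked at endloop.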

Also, thanks to the people who helped me out with both the scripting language and writing up this post: Anotak, Joshua Skelton, Simon Howard