Writing My First Interpreter (in Rust)
This project began with one goal in mind: "make a language with only functions." I don't want loops, I don't want arrays, I don't want lists, I don't want structs. We can do all of that with just a clever use of functions.
This language was conceived when I realized you can probably make “arrays” have O(1) insertion and read even with this restriction. And I was just DYING to write that optimization.
I am not there yet; it will probably require a JIT to work well. But before we can start on that we need an actual working language. We will also look at a few of the low-hanging optimizations first.
Last article I talked about how I wrote the parser. It took a lot of effort and two fairly heavy dependencies. This time was fairly different: I only used the standard library. There were a few times I considered pulling in some data structures crate, but I never found something that was worth learning a new dependency for.
On the one hand this is much more familiar territory for me as a C dev. But on the other hand I am very new to Rust and this is a big project, so there are bound to be some fuck-ups along the way. Unsafe is an area of Rust I am especially unfamiliar with. Unfortunately for me, there was an optimization where that's the only solution I could come up with.
The AST
So the AST we got our parser to make has one major issue… it does not differentiate between an actual value and a recipe for a value. So the statements:
a=1;
a=b;
a=b(1);
are all represented in about the same way. This may not seem like the biggest deal, and in fact with this sort of expression it isn't. But for more complex cases this is definitely a problem…
something like:
a = complex_func(
    other_func(a, 10)(2),
    match x {
        :err => {return 6;},
        _ => {x}
    }
);
is shown in the AST as
Assign(a,FuncCall(...))
even though we have like three different places where this thing can exit before it even gets to calling "complex_func" or assigning the result to a. We can get an exception when calling "other_func". Or maybe x does not exist?
Maybe the match statement on x returns early? Any of these would mean we never actually do all of the other steps, yet we would still be executing valid code. So the first key distinction we have to make is between Terminal values, i.e. ACTUAL values we can return, print, etc., and Lazy values, which are things we still need to evaluate.
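To make the distinction concrete, here is a rough sketch of what that split could look like (these type names are invented for illustration, they are not my actual IR):

```rust
// Sketch only: invented names, not the project's real IR types.
#[derive(Debug, Clone, PartialEq)]
enum Terminal {
    Int(i64), // an actual value: safe to print, return, or store
}

#[derive(Debug, Clone)]
enum Lazy {
    // Nothing left to evaluate.
    Done(Terminal),
    // A recipe: evaluating it may throw or return early
    // before it ever produces a Terminal.
    Call { func: Box<Lazy>, args: Vec<Lazy> },
}

fn main() {
    // `a = b(1);` is a recipe, not a value: evaluating it can fail.
    let rhs = Lazy::Call {
        func: Box::new(Lazy::Done(Terminal::Int(0))),
        args: vec![Lazy::Done(Terminal::Int(1))],
    };
    assert!(matches!(rhs, Lazy::Call { .. }));
}
```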
This means we probably need to lower this AST into some representation which keeps track of that. We need an IR.
The IR
So when building the IR I had a few major goals for it:
- support all basic features of the language.
- look as similar as we can to the AST.
- don't leak memory.
Goal 2 may seem a bit odd. The main idea is that the IR would be much simpler to work with at first if we don't need to do massive work for translation. This sort of interpreter is called a tree-walk, and it's known for being simple.
So it's a good stepping stone for getting a version of the language we can play around with.
Memory Leaks
Now you might think we won't have any leaks because we use Rust. But I am using a reference counter, so yeah, this bitch can leak as soon as we introduce a cycle. Using weak references can somewhat solve this, but that's just pushing the problem back into a different form of manual management.
One advantage of weak over free is that instead of invoking undefined behavior when you misuse it, it gives you a known error you can report for debugging. Still, both of these approaches are unacceptable for a managed-memory language.
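To see what I mean, here is the classic Rc cycle leak. Note that it needs RefCell, i.e. interior mutation, to close the cycle at all:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A node that may point at another node. RefCell lets us close
// the cycle after both Rcs already exist.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    // Close the cycle: a -> b -> a.
    *a.next.borrow_mut() = Some(b.clone());

    // Each node now keeps the other alive. When `a` and `b` go
    // out of scope, both counts drop to 1, never 0: both
    // allocations leak.
    assert_eq!(Rc::strong_count(&a), 2);
    assert_eq!(Rc::strong_count(&b), 2);
}
```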
There is one VERY important observation we can make about my language: cycles are impossible. This is because we do not allow true mutation, so a value may only refer to things that existed before it.
Here is an example:
a = 5;            # no references to this
b = fn() -> {a};  # b holds a1 now
a = fn() -> {b};  # a2 is created; b could not possibly reference it
There is one glaring exception to this: if a function is defined in the global scope, then everything can always point to it. Which means these functions need a different allocation scheme.
Reference counting these makes little sense anyway, because the global scope itself would always hold a reference, so it's not like we ever free them early. We may as well leak them immediately and save ourselves the trouble. Oh wait!
Unsafe
This is where I found myself having to use unsafe. I am still not 100% sure that what I want can't be achieved by simply playing with lifetimes, but after over two days of trying that, compared to two hours for the unsafe solution, I think I should let it go.
Basically the premise is incredibly simple: we construct the global scope. Then we “leak” it so Rust isn't on our case about how long it lives. Then we free it manually when we are done with running our program.
The reason this requires unsafe is that the reference-counted objects live "forever", so to avoid a use-after-free we need to ensure the source code lives "forever" too.
It was incredibly straightforward to write this and put safe abstractions around it. After running a sanitizer twice I got the aliasing rules right and had no further issues.
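The shape of the trick looks roughly like this (names invented for illustration; the real code wraps this in safe abstractions):

```rust
// Hypothetical stand-in for the interpreter's global state.
struct GlobalScope {
    source: String, // must outlive every object that borrows from it
}

fn main() {
    // "Leak": turn the Box into a raw pointer so the borrow
    // checker stops tracking how long it lives.
    let globals: *mut GlobalScope = Box::into_raw(Box::new(GlobalScope {
        source: String::from("def main(system) { }"),
    }));

    // ... run the program against `globals` ...
    // SAFETY: the pointer stays valid until we free it below.
    let ok = unsafe { (*globals).source.starts_with("def") };
    assert!(ok);

    // Free it manually once the run is done.
    // SAFETY: no references into the scope remain, and the
    // pointer came from Box::into_raw.
    unsafe { drop(Box::from_raw(globals)) };
}
```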
Originally, when learning Rust, I was really excited to learn how to write unsafe code. And in many ways I still am. But this felt… bad. I felt like I could have found a safe solution.
Because I was told to leave my unsafe C ways behind, I went back to look for a safe solution two more times. Both times I did not see a safe solution I was happy with. They all fundamentally forced my hand into leaking the source code of the program, or going back to weak pointers, which as explained before is not going to cut it.
Profiling
So at this point I had a working language with most of the basic features. I wanted to see if I could improve the performance a bit. There were a few big spots where I just did the easy thing instead of the performant thing:
- I used a tree-walk.
- I used a hash table for looking up variables; they each have a unique id, so at least there's no need to hash a string.
- I used three machine words to represent function pointers.
- I knew I had at least one useless copy somewhere.
So I looked into how to profile Rust and found this tool called flamegraph. It did not really work for me since I had HEAVY recursion, but it did point me in the right direction: internally it was using perf, so I started using perf directly.
perf report was REALLY good, like the best experience I have ever had reading assembly dumps. It does this neat thing where it shows you the specific instructions that are slow right next to their source code.
I looked at the code a bit and a lot of the performance cost seemed to be concentrated on loading things from memory.
The first obvious fix was to remove a clone and use a reference instead. The way I did global functions was to just clone and return the function's code whenever someone asked for it.
What I did instead was store the function's memory in the same layout it is used in when called, then return a reference. Instant win. Not a huge win on its own, but a noticeable one. How much it mattered depended on how much you used recursion, and with heavy recursion the fix was a BIG difference, because it turns an O(n) cost per function call into O(1).
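In sketch form, the difference is something like this (FuncCode and Globals are hypothetical stand-ins for the real IR types):

```rust
use std::rc::Rc;

// Hypothetical stand-ins for the real interpreter types.
#[derive(Clone, PartialEq, Debug)]
struct FuncCode {
    body: Vec<u8>, // stand-in for the function's IR
}

struct Globals {
    funcs: Vec<Rc<FuncCode>>,
}

impl Globals {
    // Before: deep-clone the whole body on every lookup,
    // O(body size) per call.
    fn get_cloned(&self, id: usize) -> FuncCode {
        (*self.funcs[id]).clone()
    }

    // After: store it the way calls consume it and hand back
    // a reference, O(1) per call.
    fn get_ref(&self, id: usize) -> &Rc<FuncCode> {
        &self.funcs[id]
    }
}

fn main() {
    let g = Globals {
        funcs: vec![Rc::new(FuncCode { body: vec![1, 2, 3] })],
    };
    // Same content either way...
    assert_eq!(g.get_cloned(0).body, g.get_ref(0).body);
    // ...but the reference path never copies the allocation.
    assert!(Rc::ptr_eq(g.get_ref(0), g.get_ref(0)));
}
```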
Adding Features
At this point I had been writing in the language for a bit and I noticed that writing loops is a bit annoying. Now I could have just caved and added loops, but instead I went with recursive lambda functions. This is achieved with the self keyword, and it looks like this:
def loopy_main(system) {
    print = system(:println);
    0 |> fn(n) -> {
        match n < 2 {
            true => {
                print('iteration number: ' + n);
                self(n+1)
            },
            false => {}
        }
    }();
}
Well, the lack of syntax highlighting is not doing me any favors here, but I think the basic premise of calling self is kind of nice.
This required a parent pointer, which is tricky to do in Rust because of the ownership model. This is not the first time a parent pointer came up; when optimizing the memory layout we also had to set one up.
Originally I tried using unsafe and asked online. This got me basically nowhere; it got to the point where the answer was "Rust can't do it" and I considered implementing this part in C.
As far as I can tell, in Rust with the default HashMap there is no way to safely write a parent pointer into that map. EVEN using raw pointers won't cut it, or at least I have yet to see a solution that works.
The core issue is using a mutable pointer to mutate the object (I know, crazy idea, how dumb of me), which, coming from C and C++, is the only way. But in Rust, constant really means "shared reference"; it just happens that you are not allowed to write through shared references by default.
There are, however, escape hatches in the standard library for doing just that. Cell, for example, lets you set its inner object even through a shared reference.
So the solution I found was using Cell for the parent pointer field. On every read from our map we just set the parent pointer correctly. It's a bit clunky and it does come with a performance overhead, but perf showed that the overhead was virtually nonexistent, so we can really just ignore it.
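A minimal sketch of why Cell unblocks this (invented names, and using a scope id instead of an actual pointer to keep the example safe):

```rust
use std::cell::Cell;

// Sketch with invented names: the parent field sits in a Cell,
// so it can be rewritten even through a shared reference.
struct Scope {
    id: u32,
    parent: Cell<u32>, // id of the enclosing scope, 0 = none
}

// Takes &Scope, not &mut Scope. Cell::set works through a shared
// reference, which is exactly what a map lookup hands us.
fn fix_parent(child: &Scope, parent: &Scope) {
    child.parent.set(parent.id);
}

fn main() {
    let global = Scope { id: 1, parent: Cell::new(0) };
    let inner = Scope { id: 2, parent: Cell::new(0) };
    // No &mut anywhere, yet the parent pointer gets updated.
    fix_parent(&inner, &global);
    assert_eq!(inner.parent.get(), 1);
}
```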
Tree-Walks are Bad
As I mentioned earlier, tree-walks are not very performant. Looking in perf, a big chunk of the core bottlenecks were single, seemingly random instructions that fetched memory.
This is probably because the CPU had to wait for the memory it was reading/writing. Trees live in a lot of random memory locations, so they have just terrible memory locality.
There was also another incredibly frustrating issue. Because native stack sizes differ between machines, my Windows machine would crash with a stack overflow on code that I could run on Linux without issue.
So now I needed to implement a way to catch recursion going too deep. This was kind of tricky to do; I ended up going the Python route of just counting scopes.
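A minimal sketch of that scope-counting guard (invented names; the real limit and error type will differ):

```rust
// Sketch with invented names: count live scopes and bail out
// with a reportable error before the native stack overflows.
struct Interp {
    depth: usize,
    max_depth: usize,
}

impl Interp {
    fn enter_scope(&mut self) -> Result<(), String> {
        self.depth += 1;
        if self.depth > self.max_depth {
            return Err("maximum recursion depth exceeded".to_string());
        }
        Ok(())
    }

    fn exit_scope(&mut self) {
        self.depth -= 1;
    }
}

fn main() {
    let mut interp = Interp { depth: 0, max_depth: 3 };
    assert!(interp.enter_scope().is_ok());
    assert!(interp.enter_scope().is_ok());
    assert!(interp.enter_scope().is_ok());
    // The fourth nested scope trips the guard instead of crashing.
    assert!(interp.enter_scope().is_err());
}
```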
Still, that overflow issue on Windows is egregiously bad. It caps loop sizes at 300 iterations on a GOOD day, which won't really work for a scripting language that's big on predictability.
Which is a big sign we need to implement tail call optimization for simple recursive functions. It would also be very nice if the stack size did not depend on the machine. There are multiple ways to achieve this that I am looking into, but that is for next time.
Conclusion
We made a basic interpreter in Rust. We even looked into optimizing some of the code.
This project was a very good test for the language, and I have to say it did very well. Compared to C, optimizations did feel a bit harder to get right. And I did actually miss C macros in a few places (I know, I have weird tastes).
I think once I get the hang of unsafe Rust, optimizations will come more naturally, which will let me write code that performs just as well as C if I put the effort in.
Debugging in Rust feels a lot nicer than it does in C. The combination of the built-in debug prints with most code just naturally being safe meant I never had to take out GDB, which for a project of this size is impressive.
Making my language run for the first time felt nothing short of magical. There was a period of around two weeks where I was really unsure if my IR design was going to work.
That moment where a feature goes from "in development" to "in maintenance" always feels amazing. And from the little bit of maintenance I have already done on this codebase, this feels like it's going to be a very fun project to work on.
I will keep you posted on how that goes.
Neva.