Adding a Type System to My Programming Language
This is the second part of my kasic series, in which I build an interpreter that replicates a command line. Most of the top ten languages used today have some kind of type system, either explicitly or implicitly. If you don’t know what a type system is, or just want some more details about them, check out this video. But the TL;DR is that using types allows you to make assumptions about what kind of data you are working with, without having to check every possibility.
Currently in kasic types are never declared, which means commands must infer types at runtime when they execute. We will be implementing a basic type system that has three different types: bool, string and number. A bool is for any true or false value, a string is for any text and a number is any double-precision floating-point number. Figuring out where to start is tough, but I started with command inputs and returns.
Just adding type constraints to commands doesn’t mean our type system is done. It just means we can’t pipe the output of a command that returns a string into a command that accepts a number. This type checking is done in the parser, along with all other type checking. You can see the output from a successful pipe and a failed pipe below. This is our type system in action.
You can see from the error message that the type add returns is not what to_upper expects, while mult does expect it. This usage of types is useful but very shallow. What if I replace one of the arguments for the add command with “hello”? Well, our add command tries to convert the text to a number and our program panics. Something to note is the error message itself: I haven’t discussed how these are constructed, but we have a lot to cover so they are not a priority for now.
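To make the pipe check concrete, here is a minimal sketch of the idea in Python (kasic itself is written in C#, so this is not the real parser code). The CommandSignature class, the SIGNATURES table and the wording of the error message are all assumptions made up for illustration. Only the command names and the rule "the return type on the left must match the accepted type on the right" come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandSignature:
    """Hypothetical description of what a command accepts and returns."""
    name: str
    accepts: str   # "number", "string" or "bool"
    returns: str


# Assumed signatures, chosen to match the add / mult / to_upper example.
SIGNATURES = {
    "add": CommandSignature("add", "number", "number"),
    "mult": CommandSignature("mult", "number", "number"),
    "to_upper": CommandSignature("to_upper", "string", "string"),
}


def check_pipeline(names):
    """Return one error message per incompatible pipe (empty list if none)."""
    errors = []
    # Walk over each neighbouring pair: left pipes its output into right.
    for left, right in zip(names, names[1:]):
        producer = SIGNATURES[left]
        consumer = SIGNATURES[right]
        if producer.returns != consumer.accepts:
            errors.append(
                f"{left} returns {producer.returns} "
                f"but {right} expects {consumer.accepts}"
            )
    return errors
```

With these assumed signatures, check_pipeline(["add", "mult"]) reports no errors, while check_pipeline(["add", "to_upper"]) reports a single mismatch. Note that nothing here looks at the argument values yet, which is exactly the gap described above.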
So let’s try to go deeper and make sure the values we pass to commands can actually be converted to their expected types, even if the command before says they should be. Getting this part of the parser working was the biggest challenge in the language so far, but this is what’s needed to make the runtime type safe.
I think the best way to show how the parser keeps our runtime type safe is to actually look at some of the code for the add command. This Run method is called when we want to execute the command.
As you can see, the add command doesn’t do that much, but there is a lot of work done behind the scenes to get this method to where it is.
The first thing you might notice is the ArgObject, which is given to this command by the parser. The parser populates this object with the arguments, and as it does so it validates that these arguments are of the type this command expects. Then add calls AsNumbers(), which returns the arguments inside as numbers, which for the add command they are because the parser makes sure of it.
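As a rough illustration of that division of labour, here is a Python sketch (again, not the actual C# code from kasic). The ArgObject class below is a hypothetical stand-in that only borrows the name, as_numbers mirrors AsNumbers(), and run_add is an assumed simplification of the add command's Run method. The point is that validation happens once, when the object is built, so the command itself can convert without checking.

```python
class ArgObject:
    """Hypothetical stand-in: validates raw arguments when it is built."""

    def __init__(self, raw_args, expected_type):
        if expected_type == "number":
            for raw in raw_args:
                try:
                    float(raw)
                except ValueError:
                    # In kasic this would surface as a parser error.
                    raise TypeError(f"'{raw}' is not a number")
        self._raw = list(raw_args)

    def as_numbers(self):
        # Safe for number commands: every argument was validated above.
        return [float(raw) for raw in self._raw]


def run_add(args):
    """Assumed shape of add: sum the already-validated arguments."""
    return sum(args.as_numbers())
```

For example, run_add(ArgObject(["1", "2.5"], "number")) adds the two values, while building ArgObject(["1", "hello"], "number") raises before the command ever runs.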
After the command is done executing, it needs to return something so the runtime can pipe the result to the next command. This is done with the ReturnObject. Because the ReturnObject takes the total value as is, we can guarantee type safety with the C# type system. We pass the command to it as well, since we need to check that the command’s return type and the type of the value returned are the same. This allows the command to do the work it is expected to do and not worry about whether the values it receives are of the right type.
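The return side can be sketched the same way. In kasic the C# type system does part of this work at compile time. Python has no equivalent, so the sketch below swaps in a runtime check instead, and that difference is deliberate. The Command class, the PYTHON_TYPES table and the error text are hypothetical names used only for this example.

```python
class Command:
    """Hypothetical command description with a declared return type."""

    def __init__(self, name, return_type):
        self.name = name
        self.return_type = return_type


# Assumed mapping from kasic type names to Python types.
PYTHON_TYPES = {"number": float, "string": str, "bool": bool}


class ReturnObject:
    """Hypothetical stand-in: refuses values that don't match the command."""

    def __init__(self, command, value):
        expected = PYTHON_TYPES[command.return_type]
        # Strict check: the value must be exactly the declared type.
        if type(value) is not expected:
            raise TypeError(
                f"{command.name} must return {command.return_type}, "
                f"got {type(value).__name__}"
            )
        self.value = value
```

Here ReturnObject(Command("add", "number"), 3.0) is accepted, while passing the string "3" for the same command raises a TypeError.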
Something that I haven’t mentioned yet, but which is very important in a type system, is converting between types, also called typecasting. Typecasting is a must, but doing this in the parser would add unneeded complexity for us, so why not throw some new commands at it to fix the problem? Below is a demonstration of typecasting working in kasic.
This shows a single value being cast between the three types in kasic. It is not too difficult to figure out what is going on here: the string command takes in a value and, if it can, converts it to a string, and so on. But this raises a very important question: what type do the num, string and bool commands expect?
Well, this is where I must introduce a new type in kasic: the any type. This type doesn’t have such an obvious description, but all we need to worry about is that fundamentally it is just the raw string the user provides. Any is not used in any commands besides these specific ones, so the idea is to use these commands as type declarations. If you pass “Hello” into num, it tries to convert this to a number, and if it can’t, it simply fails.
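Here is a small Python sketch of how cast commands over the any type could behave, treating any as nothing more than the raw string. The function names follow the kasic commands (bool_ has a trailing underscore only to avoid shadowing Python's built-in bool), but the bodies are assumptions. In particular, accepting only "true" and "false" for bools is my guess for this sketch, not something kasic is known to do.

```python
def num(raw: str) -> float:
    """Cast the raw user string to a number, or fail."""
    try:
        return float(raw)
    except ValueError:
        raise ValueError(f"cannot convert '{raw}' to number")


def string(raw: str) -> str:
    """The raw value is already text, so this cast always succeeds."""
    return raw


def bool_(raw: str) -> bool:
    """Cast the raw user string to a bool (assumed accepted spellings)."""
    lowered = raw.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"cannot convert '{raw}' to bool")
```

So num("10") succeeds and gives back a number, while num("Hello") fails with an error, matching the behaviour described above.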
This, in short, is the new kasic type system. It may not be as fully featured as others, but it is robust enough to give the programmer some feedback on where they went wrong.
Links:
Kasic: https://github.com/jackdelahunt/Kasic
Github: https://github.com/jackdelahunt