I Made My Own Programming Language

It sucks, but I love it

Published in

HumanAI

8 min read · Aug 14, 2024

In my years of learning software engineering, I’ve had the opportunity to work in a number of languages: C#, the best language ever; Python, the warm teddy bear of languages; Fortran, the language of dinosaurs; Haskell, the most mathy language ever; HTML and CSS, not sure if they’re actually languages; and JavaScript, the only language I hate. Each of these languages has its own unique features: some that I love, and some that I hate. I decided, though, that I needed to add another language to my eclectic little language group. The language I set out to learn was Rust.

I started learning Rust the same way I learn any other language. I found a tutorial that walked me through a small project to learn the syntax and the idioms, and then I made a couple more small projects without a tutorial. It was at that point that inspiration struck. I had finished all my major projects and needed a new one. I have always heard that one of the most useful projects anyone can do to become a better software engineer is a custom programming language. So, naturally, I decided to create my own programming language.

This brings me to lesson #1: just do it. A sixteen-year-old kid is the last person anyone would expect to take on a project like this, and I was in way over my head, but it was totally worth it. Just because a project seems too hard doesn’t mean you shouldn’t do it.

With that spirit in mind, I started my research. Usually, when starting a larger project like this, it’s a good idea to build an outline of how the program will work and how you will approach each problem, recursively breaking them down into smaller problems. So, how did I go about making this outline? Well, I didn’t. Instead, I spent less than ten minutes reading an article about how to make a language and jumped in. Guess what: I learned lesson #2. Do the research. I know it’s tempting just to jump in and start programming, but you need to build the outline.

So there I went, jumping right in, creating my own programming language, TermsLang.

I knew from my highly extensive and exhaustive research that the first step to building a programming language is a Lexer. A lexer just takes in a block of text and converts it into a list of tokens. What is interesting about the lexer? That’s easy; it defines the keywords and symbols used in the language. Here is a basic overview of some of the more unique lexer mappings.

"Hello" 'To The' `World`      <- Strings
# Hello world                 <- Comments
~                             <- Line terminating character (not a semicolon)
$                             <- New object operator
^                             <- Exponent operator

updt                          <- Update the value of a variable
cll                           <- Call a function
loop                          <- Create a loop

For the most part, these seem acceptable. The ones that stand out are “~”, “$”, and “^”. In the case of the dollar sign in place of a “new” keyword and a tilde instead of a semicolon at the end of lines, I just felt like it. The caret is the exponent operator instead of bitwise XOR because that's the way it should be, fight me. Admittedly, there are no bitwise operators in TermsLang, so the character was vacant. Don’t actually fight me. The three keywords I listed are also a bit alarming, but I will explain them in a minute.
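To make the "text in, tokens out" idea concrete, here is a minimal sketch of a lexer covering the mappings listed above. It is not TermsLang's actual lexer: the `Token` variants, the `KEYWORDS` list, the `lex` function, and the choice to treat “@” as part of an identifier are all assumptions made for illustration.

```rust
// Illustrative sketch only: `Token`, `KEYWORDS`, and `lex` are hypothetical
// names, not taken from the real TermsLang source.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Str(String),     // "..." '...' `...`
    Int(i64),        // whole numbers only in this sketch
    Ident(String),   // names, including @-prefixed ones such as @this
    Keyword(String), // updt, cll, loop
    Terminator,      // ~
    New,             // $
    Exponent,        // ^
    Symbol(char),    // anything else, one character at a time
}

const KEYWORDS: [&str; 3] = ["updt", "cll", "loop"];

pub fn lex(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '#' {
            // Comment: discard everything up to the end of the line.
            while chars.next_if(|&ch| ch != '\n').is_some() {}
        } else if c == '"' || c == '\'' || c == '`' {
            chars.next(); // consume the opening quote
            let mut text = String::new();
            loop {
                match chars.next() {
                    Some(ch) if ch == c => break, // matching closing quote
                    Some(ch) => text.push(ch),
                    None => return Err("unterminated string".to_string()),
                }
            }
            tokens.push(Token::Str(text));
        } else if c.is_ascii_digit() {
            let mut digits = String::new();
            while let Some(d) = chars.next_if(|ch| ch.is_ascii_digit()) {
                digits.push(d);
            }
            let value = digits.parse::<i64>().map_err(|e| e.to_string())?;
            tokens.push(Token::Int(value));
        } else if c.is_alphabetic() || c == '_' || c == '@' {
            let mut word = String::new();
            while let Some(w) =
                chars.next_if(|&ch| ch.is_alphanumeric() || ch == '_' || ch == '@')
            {
                word.push(w);
            }
            if KEYWORDS.iter().any(|&k| k == word.as_str()) {
                tokens.push(Token::Keyword(word));
            } else {
                tokens.push(Token::Ident(word));
            }
        } else {
            chars.next();
            tokens.push(match c {
                '~' => Token::Terminator,
                '$' => Token::New,
                '^' => Token::Exponent,
                other => Token::Symbol(other),
            });
        }
    }
    Ok(tokens)
}
```

Calling `lex("updt x ^ 2 ~ # note")` on this sketch yields a keyword, an identifier, the exponent token, an integer, and a terminator; the comment is dropped.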

After creating my Lexer I started on a parser. A parser takes the list of tokens from the Lexer and organizes them into the grammar of the programming language. This is where the rest of the syntax for the language is defined. Let’s take a look at an example.

"
My FizzBuzz Program
Created Tue Aug 13, 2024
"

struct FizzBuzz {
    let int iters ~

    func null @new: int iters {
        updt @this.iters = iters ~
    }

    func null run: str _3Word, str _5Word {
        loop i: i < @this.iters {
            let str result = "" ~

            if i % 15 == 0 {
                updt result += _3Word ~
                updt result += _5Word ~
            } else if i % 3 == 0 {
                updt result += _3Word ~
            } else if i % 5 == 0 {
                updt result += _5Word ~
            } else {
                updt result = i.@str.() ~
            }

            println result ~
        }
    }

    func str @str: {
        return "FizzBuzz Object: % Iterations" % @this.iters.@str.() ~
    }
}

func int @main: str[] args {
    let FizzBuzz fizzBuzz = $(100) FizzBuzz ~
    cll fizzBuzz.run.("Fizz", "Buzz") ~
    
    println "-" * 35 ~
    println fizzBuzz.@str.() ~
    return 0 ~
}

Ok, let’s talk about what’s going on here. For the most part, it reads like a normal language, but it does have some quirks. Let's start with the “@” symbol. Basically, the “@” marks anything that has a special behavior. Type conversion functions, the “@main” function, and the “@this” variable all have special behaviors tied to them. In addition, the “@readln” function and a number of others are also prefixed with an “@” symbol.

What else, though? Well, you may have noticed the “updt” and “cll” keywords. When I was creating the parser, I realized that it was easier to parse the terms inside of a function if they always start with a specific keyword that tells the parser what to expect on that line. This is actually why the language is called TermsLang: each line is a single term prefixed by a keyword identifying the type of term. Therefore, I created the “updt” keyword for lines that update a variable, and the “cll” keyword for lines that evaluate an expression and then drop the result. The latter is typically used when you call a function.

You may have noticed that at the top of the file there is a string with some information. To be honest, I had a bug where the first token was always ignored when parsing. Instead of fixing the bug, I decided to require the first token to be a string, called a prelude string.

Lastly, the absolute worst thing about TermsLang: dots before parentheses. Yes, you must put a dot before the parentheses when calling a function. That means “myFunc()” is invalid and “myFunc.()” is valid. I am so sorry to anyone who now tries to program in this language. I did this when creating the parser because it was just easier to parse this way. When I got to building the interpreter, though, it was an absolute nightmare, and now it’s an absolute nightmare to program with. Once again, sorry.

One of these features on its own wouldn’t make a language all that painful to use, but all of them together… Did I mention I was sorry?
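The appeal of keyword-led terms is easy to see in code: the parser can decide what it is looking at from the first token alone. Below is a rough sketch of that dispatch, not the real TermsLang parser. `TermKind`, `Term`, and `parse_terms` are hypothetical names, tokens are plain strings for brevity, and block terms such as “if” and “loop” are left out.

```rust
// Illustrative sketch only: these types are assumptions, not TermsLang's own.
#[derive(Debug, PartialEq)]
pub enum TermKind {
    Let,
    Update,
    Call,
    Println,
    Return,
}

#[derive(Debug, PartialEq)]
pub struct Term {
    pub kind: TermKind,
    pub body: Vec<String>, // raw tokens between the keyword and the `~`
}

pub fn parse_terms(tokens: &[&str]) -> Result<Vec<Term>, String> {
    let mut terms = Vec::new();
    let mut rest = tokens.iter();

    // Every term starts with a keyword, so a single match decides its kind.
    while let Some(&keyword) = rest.next() {
        let kind = match keyword {
            "let" => TermKind::Let,
            "updt" => TermKind::Update,
            "cll" => TermKind::Call,
            "println" => TermKind::Println,
            "return" => TermKind::Return,
            other => {
                return Err(format!("expected a term keyword, found `{}`", other));
            }
        };

        // Collect everything up to the `~` terminator as the term's body.
        let mut body = Vec::new();
        let mut terminated = false;
        for &token in rest.by_ref() {
            if token == "~" {
                terminated = true;
                break;
            }
            body.push(token.to_string());
        }
        if !terminated {
            return Err(format!("term `{}` is missing its `~` terminator", keyword));
        }

        terms.push(Term { kind, body });
    }
    Ok(terms)
}
```

In this sketch, a line like `x = 1 ~` is rejected immediately because it does not open with a keyword, which is exactly the shortcut that makes “updt” and “cll” convenient for the parser and annoying for the programmer.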

The next part of a typical programming language would be taking the parser's output and turning it into an Abstract Syntax Tree. This is where you would do a lot of type-checking and validation. I deleted it. Yup, I made one, then I deleted the entire module. Guess what: lesson #3. Just because it's a convention doesn't mean it's a requirement. Sometimes the different thing works better for your specific project.

Ok, now that we skipped that step, let's do the last part of my language: the interpreter. I originally wanted to compile TermsLang but decided against it after I tried to install LLVM on a MacBook from 2012 and then on a newer Windows machine, failing both times until I pulled up a Kali Linux virtual machine. Unfortunately, I like working from my MacBook and not my desktop, so interpreting is better. Creating the interpreter was really fun when it started. That is, until I ended up writing this.

fn resolve_root_sub(
    &mut self,
    root_def: StructDef,
    object: &Object,
    id: &String,
    parent: u32,
    vr: &VariableRegistry,
) -> Result<u32, RuntimeError> {
    match root_def {
        StructDef::User { .. } => todo!(),
        StructDef::Root { _type, methods, .. } => {
            if methods.contains(id) {
                match &object.sub {
                    Some(sub) => match &sub.kind {
                        ObjectType::Call(call) => {
                            let mut data_object = self.objects[&parent].data.clone();
                            let args = {
                                let mut args = Vec::new();
                                for arg in &call.args {
                                    args.push(interpret_operand_expression(&arg, self, vr)?)
                                }
                                args
                            };
                            let result = match data_object.call_method(self, id, args)? {
                                ExitMethod::ImplicitNullReturn => todo!(),
                                ExitMethod::ExplicitReturn(id) => Ok(id),
                                ExitMethod::LoopContinue => todo!(),
                                ExitMethod::LoopBreak => todo!(),
                            }?;

                            match &sub.sub {
                                Some(sub) => {
                                    self.resolve_sub_object(result, Some(parent), &*sub, vr)
                                }
                                None => Ok(result),
                            }
                        }
                        _ => Err(RuntimeError(
                            format!(
                                "{}{}",
                                "Root type functions must be called; alias creation,",
                                " indexing, and object peeking is not allowed."
                            ),
                            FileLocation::None,
                        )),
                    },
                    None => Err(RuntimeError(
                        format!(
                            "{}{}",
                            "Root type functions must be called; alias creation,",
                            " indexing, and object peeking is not allowed."
                        ),
                        FileLocation::None,
                    )),
                }
            } else {
                Err(RuntimeError(
                    format!("{} no field or method found on struct", id),
                    FileLocation::None,
                ))
            }
        }
    }
}

If that looks like bad code, don’t worry, it is. The interpreter is quite simply some of the worst code I have ever written. That's where I’m going to leave that. Other than the code under its hood, the interpreter commits two other crimes.

First, type annotations. Every variable and argument in TermsLang needs to be type annotated. Don’t get me wrong, I love type annotations. Type annotations can make or break a language (C# vs. JS). What’s the problem then? Well, there's no enforcement. Not only did I drop the Abstract Syntax Tree stage, but I also decided not to do any type checking later. You can pass anything to anything. That means you can pass a string to an integer argument with no problem. As long as you don’t try to access a field that doesn’t exist, no error will be thrown.

The second crime is speed. The language is ungodly slow. I’m not entirely sure why, but I have a guess. In the language, every value is stored in an enum variant and lives in a hash map, where an integer key acts as its pointer. This system allows me to create a garbage collector, but it’s not at all efficient. Despite this, I have a working interpreter. This means I have a completed custom programming language.
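For readers who want to picture that storage scheme, here is a small sketch of values held as enum variants in a hash map keyed by integer IDs, with a simple collector on top. It is a guess at the general shape, not the interpreter's real code: `Value`, `ObjectStore`, `alloc`, `get`, and `collect` are hypothetical names, and the mark-and-sweep strategy is my assumption, since the actual collection strategy is not described here.

```rust
use std::collections::{HashMap, HashSet};

// Illustrative sketch only: these names are assumptions, not TermsLang's own.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int(i64),
    Str(String),
    // A struct instance maps field names to the IDs of other stored values.
    Struct(HashMap<String, u32>),
}

#[derive(Default)]
pub struct ObjectStore {
    objects: HashMap<u32, Value>,
    next_id: u32,
}

impl ObjectStore {
    /// Store a value and hand back the integer ID that "points" to it.
    pub fn alloc(&mut self, value: Value) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.objects.insert(id, value);
        id
    }

    /// Every read goes through a hash lookup, one plausible source of slowness.
    pub fn get(&self, id: u32) -> Option<&Value> {
        self.objects.get(&id)
    }

    /// Mark-and-sweep: keep whatever is reachable from `roots`, drop the rest.
    /// Returns how many values were freed.
    pub fn collect(&mut self, roots: &[u32]) -> usize {
        let mut live = HashSet::new();
        let mut stack = roots.to_vec();
        while let Some(id) = stack.pop() {
            if !live.insert(id) {
                continue; // already visited
            }
            if let Some(Value::Struct(fields)) = self.objects.get(&id) {
                stack.extend(fields.values().copied());
            }
        }
        let before = self.objects.len();
        self.objects.retain(|id, _| live.contains(id));
        before - self.objects.len()
    }
}
```

The design makes garbage collection straightforward because the store owns every value and IDs are trivially copyable, but each field access costs a hash lookup instead of a pointer dereference.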

After spending so much time explaining everything that is wrong with TermsLang, you may be wondering why on earth I still love this language. It’s simple, actually. It’s my own language. I learned an astounding amount by making TermsLang. I love the fact that a sixteen-year-old made it. I love the fact that I thought there was no way I’d be able to create a programming language. That leads me to the last lesson, #4: a good project is a project that teaches you. I can’t speak for anyone actually employed as a software engineer (I’m just a junior in high school who just turned seventeen), but for people trying to learn, there is no better way than fun projects, large and small. So, is TermsLang practical? Absolutely not. It sucks, but I love this language, if only because I created it.

If you for some reason read all of this, thank you. I hope you enjoyed it. Finally, of course, I cannot write all of this without linking the GitHub repo.


Written by Owen Dechow