What does it mean to write a programming language?

Yasmine Hartung
Published in There She Codes
3 min read · Jul 9, 2019

When you start learning a new programming language, you often start off by learning a little of the history behind it, like how JavaScript was written in 10 days or how Ruby was written by Yukihiro Matsumoto to be a simple, truly object-oriented language.

But what does that even mean, to write a programming language?

Well, honestly, basically anyone can write a programming language and post it to GitHub. It can be a very educational process, though also a very long one.

So how do you start writing a programming language?

Interpreted vs Compiled

With an interpreted language, an interpreter steps through the code as it’s called, figuring out what to do as it goes.

With a compiled language, a compiler figures out what has to be done at the beginning, changes it all into machine language that runs faster, and saves the result for later, whenever it may be needed.

So how do you decide which kind of language you want to write?

Interpreted languages are usually easier to work with and more flexible, but compiled languages usually have better performance. This is obviously a very high-level overview of the issue, but when you come to write a language, one will probably make more sense than the other for your specific uses.
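To make the difference a bit more concrete, here is a rough sketch in Python. The toy "program" format and the function names are made up for illustration, and the "compiled" version produces Python source as a stand-in for real machine code.

```python
# A toy "program": a list of (operation, operand) steps applied to a running total.
# This format is hypothetical and exists only for this example.
PROGRAM = [("add", 2), ("mul", 10), ("sub", 5)]

def interpret(program, start):
    """Interpreter style: look at each step and decide what to do while running."""
    total = start
    for op, operand in program:
        if op == "add":
            total += operand
        elif op == "mul":
            total *= operand
        elif op == "sub":
            total -= operand
        else:
            raise ValueError(f"unknown operation: {op}")
    return total

def compile_program(program):
    """Compiler style: translate the whole program up front, run the result later."""
    symbols = {"add": "+", "mul": "*", "sub": "-"}
    lines = ["def run(total):"]
    for op, operand in program:
        lines.append(f"    total = total {symbols[op]} {operand!r}")
    lines.append("    return total")
    namespace = {}
    # Generated Python source stands in for machine code here.
    exec("\n".join(lines), namespace)
    return namespace["run"]

run = compile_program(PROGRAM)   # translation happens once
print(interpret(PROGRAM, 1))     # 25
print(run(1))                    # 25
```

Both versions give the same answer. The interpreter re-reads the steps on every run, while the compiled function has already turned them into straight-line code.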

Implementation Language

The second step to writing a programming language is choosing which other programming language you will be writing it in.

I know, it’s weird, but every programming language is written in another language. Somehow, the message has to get down to binary.

For example, Ruby’s reference implementation is written in C. JavaScript can have many implementation languages, depending on the engine running it, but it is usually implemented in C or C++. Some engines, like Mozilla’s Rhino, are even written in Java. That being said, you should try to decide on a single implementation language in order to optimize performance.

When thinking about which language will implement your language, you should think about whether your language is compiled or interpreted. If your language is interpreted, a compiled implementation language is usually preferable, as it doesn’t compound the performance issues that come with interpreted languages.

If your language will be compiled, an interpreted implementation language should be good enough.

Lexer

Next, you have to build (or outsource) your lexer. What’s a lexer? It basically chops up each line of code into tokens. What’s an operator? What’s an identifier? What’s a number? It’s responsible for giving each piece of code its “role”. It writes all this information out as a list, ready for the next step of interpreting the code.
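Here is a minimal sketch of a lexer in Python, built on regular expressions. The token kinds (NUMBER, IDENTIFIER, OPERATOR) are assumptions for a toy language, not a standard set, and a real lexer would also track things like line numbers.

```python
import re

# Hypothetical token kinds for a toy language; a real one would have many more.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=()]"),
    ("SKIP",       r"\s+"),
    ("MISMATCH",   r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def lex(line):
    """Chop one line of code into a list of (role, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace carries no meaning in this toy language
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens

print(lex("total = price * 2"))
# [('IDENTIFIER', 'total'), ('OPERATOR', '='), ('IDENTIFIER', 'price'),
#  ('OPERATOR', '*'), ('NUMBER', '2')]
```

Each token carries its role next to its text, which is the list the next step consumes.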

Parser

After the lexer, you have to build or outsource your parser. The parser's job is to give structure to the list of tokens. It does things such as take into account parentheses and order of operations.

It arranges the tokens into an Abstract Syntax Tree (AST), which is a tree of nodes. Evaluation starts at the bottom of the tree and works its way up.
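As a rough illustration, here is a small recursive-descent parser in Python. Recursive descent is just one of several parsing techniques. This sketch assumes the tokens arrive as plain strings, supports only whole numbers, the four arithmetic operators, and parentheses, and represents the tree as nested tuples; all of that is made up for the example.

```python
def parse(tokens):
    """Turn a flat list of string tokens into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():  # lowest precedence: + and -
        node = term()
        while peek() in ("+", "-"):
            node = (advance(), node, term())
        return node

    def term():  # higher precedence: * and /
        node = factor()
        while peek() in ("*", "/"):
            node = (advance(), node, factor())
        return node

    def factor():  # whole numbers and parenthesised groups
        token = advance()
        if token == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

def evaluate(node):
    """Evaluate the leaves first, then combine results on the way up."""
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a / b

tree = parse(["1", "+", "2", "*", "3"])
print(tree)            # ('+', 1, ('*', 2, 3))
print(evaluate(tree))  # 7
```

Because `term` is called from inside `expression`, multiplication ends up deeper in the tree than addition, so it gets evaluated first. Parentheses work the same way: `parse(["(", "1", "+", "2", ")", "*", "3"])` puts the addition at the bottom instead.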

From the Abstract Syntax Tree, you have to build something for the code to actually run on. This requires things like context and certain nuances that might be specific to your language. This is where the actual language implementation comes in. Does your language have global variables? Does it enforce which data types are accepted as arguments for functions?
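Here is one way those decisions could show up in code, as a sketch only. The node shapes (`"num"`, `"str"`, `"var"`, `"assign"`, `"add"`) are hypothetical, the context is a plain dictionary of global variables, and the rule that both sides of an addition must have the same type is an example of a choice a language designer might make, not a requirement.

```python
def evaluate(node, env):
    """Evaluate a tuple-shaped syntax tree node against a dictionary of globals."""
    kind = node[0]
    if kind in ("num", "str"):
        return node[1]
    if kind == "var":
        name = node[1]
        if name not in env:
            raise NameError(f"undefined variable {name!r}")
        return env[name]
    if kind == "assign":
        _, name, value_node = node
        env[name] = evaluate(value_node, env)  # every variable is global here
        return env[name]
    if kind == "add":
        left = evaluate(node[1], env)
        right = evaluate(node[2], env)
        # Language decision for this sketch: no mixing of types in an addition.
        if type(left) is not type(right):
            raise TypeError(
                f"cannot add {type(left).__name__} and {type(right).__name__}"
            )
        return left + right
    raise ValueError(f"unknown node kind: {kind!r}")

global_env = {}
evaluate(("assign", "x", ("num", 2)), global_env)
print(evaluate(("add", ("var", "x"), ("num", 3)), global_env))  # 5
```

A different language could make the opposite choices, for example by giving each function its own dictionary of variables or by converting numbers to strings automatically.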

Once you write this out, in a form sometimes referred to as an intermediate representation, all you have left is to build your compiler or interpreter and you're done!
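Intermediate representations come in many shapes. One common shape is a flat list of instructions for an imaginary stack machine, which is what this Python sketch uses. The instruction names (PUSH, ADD, SUB, MUL) and the nested-tuple tree format, such as `("+", 1, ("*", 2, 3))`, are assumptions made for the example.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}

def to_ir(node, code=None):
    """Flatten a nested-tuple syntax tree into a list of stack instructions."""
    if code is None:
        code = []
    if isinstance(node, int):
        code.append(("PUSH", node))
    else:
        op, left, right = node
        to_ir(left, code)    # operands are emitted first...
        to_ir(right, code)
        code.append((OPCODES[op], None))  # ...then the operation that uses them
    return code

def run(code):
    """A tiny interpreter for the instruction list."""
    stack = []
    for instruction, arg in code:
        if instruction == "PUSH":
            stack.append(arg)
            continue
        right, left = stack.pop(), stack.pop()
        if instruction == "ADD":
            stack.append(left + right)
        elif instruction == "SUB":
            stack.append(left - right)
        elif instruction == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack.pop()

ir = to_ir(("+", 1, ("*", 2, 3)))
print(ir)
# [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('MUL', None), ('ADD', None)]
print(run(ir))  # 7
```

From a list like this, an interpreter can execute the instructions directly, as `run` does, while a compiler would translate each one into machine code instead.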

jk, that's a lot. But now at least you know the path. Go forth, my son, and create programming languages.
