TypeScript Is Written in… TypeScript? A Look Into How Languages Are Written in Their Own Language

Joseph A. Boyle
Published in The Startup
6 min read · Aug 10, 2020
Photo by Matthijs van Schuppen on Unsplash

It may come as a surprise to learn that C is largely written in C, that Java is written in Java, and that TypeScript is written in… TypeScript! To many, this sounds like a technological version of the "which came first, the chicken or the egg?" question. It certainly seems like a contradiction: if C is written in C, doesn't C have to already exist?

Why do we write new languages?

Before we can explore how this conundrum is resolved, we must first ask why new languages are created at all. Surely, among all of the existing languages, one has to be viable for the problems we're trying to solve, so why bother creating a new one? Largely, new languages fall into one of several categories:

New Paradigms: Some languages simply seek to explore new programming paradigms, offering a "new way of thinking". Procedural languages like C differ greatly from object-oriented languages like Java or functional languages like Lisp. New ways of thinking about programs yield new levels of expressibility that may simply not exist in other paradigms: imagine trying to implement an asynchronous map-reduce in C, or objects in Lisp. As always, there are generally tools better suited to certain jobs, and sometimes making a new tool is the best solution.

New Domains: Similar to exploring new paradigms, some languages simply aim to target new domains. As an example, HTML, Verilog, Matlab, and other languages were created simply to solve a very narrow problem and work in a very specific domain. These languages solve novel problems without branching into the territory of a general purpose language.

Glaring Improvements: It is quite hard to fix fundamental issues in many languages. Sometimes communities are closed to change, and sometimes a language carries so much baggage, and has so many legacy applications depending on it, that making changes would be impractical. Languages like Rust attempt to solve fundamental issues like C's lack of memory safety, Python 3 attempts to solve structural problems in its predecessor Python 2, and Kotlin attempts to remove boilerplate and add safety to Java.

So How And Why Do We Write Language X In X?

In general, we write new languages because we find them to be better than what currently exists, whether that means better syntax, better expressibility, or added safety. You may be wondering why we've taken the time to consider this motivation. Put simply, we create new languages because they contain features or ideas that we are just dying to use, so why would we develop our new language in another, inferior one?

The initial version of a language is written in something else. For C it was B, and for TypeScript it was JavaScript. The original source language doesn't matter much, so long as you can land on some initial implementation of your language. This initial implementation generally isn't complete; rather, it's enough to get you started working with your new language. Popular choices for an initial implementation include C, Scheme, or even a compiler generator like Yacc/Bison. It's not an easy feat, but eventually we emerge with Language V1, written in Some-Other-Language.
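To make this concrete, here's a toy sketch of what a "version 1" compiler might look like, written in the host language rather than the new one. The mini expression language and every name here are invented for illustration; real first compilers (like TypeScript's original JavaScript implementation) are vastly larger, but the shape is the same: tokenize, parse, emit.

```typescript
// A toy "version 1" compiler: it translates a tiny arithmetic language
// into JavaScript source. Everything here is invented for illustration.

type Token = { kind: "num" | "plus" | "star" | "lparen" | "rparen"; text: string };

function tokenize(src: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (/\s/.test(ch)) { i++; continue; }
    if (/\d/.test(ch)) {
      let j = i;
      while (j < src.length && /\d/.test(src[j])) j++;
      tokens.push({ kind: "num", text: src.slice(i, j) });
      i = j;
      continue;
    }
    const kinds: Record<string, Token["kind"]> = { "+": "plus", "*": "star", "(": "lparen", ")": "rparen" };
    if (!(ch in kinds)) throw new Error(`unexpected character: ${ch}`);
    tokens.push({ kind: kinds[ch], text: ch });
    i++;
  }
  return tokens;
}

// Recursive-descent parser that emits JavaScript directly, respecting
// precedence: expr -> term ('+' term)*, term -> factor ('*' factor)*.
function compile(src: string): string {
  const tokens = tokenize(src);
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  function factor(): string {
    const t = next();
    if (t.kind === "num") return t.text;
    if (t.kind === "lparen") {
      const inner = expr();
      next(); // consume ')'
      return `(${inner})`;
    }
    throw new Error(`unexpected token: ${t.text}`);
  }
  function term(): string {
    let left = factor();
    while (peek()?.kind === "star") { next(); left = `${left} * ${factor()}`; }
    return left;
  }
  function expr(): string {
    let left = term();
    while (peek()?.kind === "plus") { next(); left = `${left} + ${term()}`; }
    return left;
  }
  return expr();
}

console.log(compile("1 + 2 * (3 + 4)")); // emits the JavaScript "1 + 2 * (3 + 4)"
```

The emitted text is itself a valid program in the target language, which is all a first compiler really needs to achieve.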

Next is the interesting part. We want to test how good our initial implementation is, to see how useful the language is, to find bugs, and so on. What's a really nice way to test this compiler we've built? We could try building an application in it, implement a web server, or tackle any number of other common programming challenges. Instead, we're going to test our language and implementation by writing our language in our language, using the version 1 compiler. This is called bootstrapping your compiler. Akin to the old expression "pulling yourself up by your bootstraps", we now use our compiler to build a new compiler, written in our language instead of Some-Other-Language, producing Language V2. The first pass is going to be a bit bumpy, but the important part is that we complete it. From then on, we write each new language feature in our own language, and each time we generate a new compiler.
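The whole dance fits in a few lines if we shrink the "language" down to a toy. Suppose our new language is just JavaScript with `+` spelled `PLUS`; then the compiler is a single rewrite, small enough to write in the language itself, and we can watch the stages unfold. Everything here is invented for illustration.

```typescript
// A toy bootstrap, end to end. Our "language" is JavaScript where `+`
// is spelled `PLUS`; compiling means rewriting `PLUS` back to `+`.

// Stage 0: the compiler, written in the host language (plain TypeScript).
function stage0(source: string): string {
  return source.split("PLUS").join("+");
}

// The same compiler, rewritten in our new language. It spells the marker
// as "PL" PLUS "US" so the rewrite doesn't eat its own string literal.
const selfHostedSource = `(src) => src.split("PL" PLUS "US").join("+")`;

// Stage 1: compile the self-hosted compiler with stage 0, then load it.
const stage1Source = stage0(selfHostedSource);
const stage1: (src: string) => string = eval(stage1Source);

// Stage 2: the self-hosted compiler compiles its own source. If the
// bootstrap is sound, stage 1 and stage 2 should be byte-for-byte equal.
const stage2Source = stage1(selfHostedSource);

console.log(stage1Source === stage2Source); // prints true
```

That final comparison mirrors the fixed-point check real projects perform: when the compiler compiled by its predecessor and the compiler compiled by itself produce identical output, the bootstrap has converged.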

Back to the "chicken or the egg" analogy, the answer is quite simple: the egg came first. We started with a baby compiler written in some other language, and built up enough structure to let us use that compiler to build a new compiler that could still read our language. Bootstrapping a compiler is seen as the ultimate test of whether a language is viable at all, for if you don't even want to write your language in the language you've created, it's probably not worth using (or isn't defined well enough to be used). If we hadn't started with that original compiler written in another language, it would of course be absurd to try to write our language in our language, as there would be no way to compile it to machine code (or some other target).

When looking at languages' source code online, it's unsurprising that you'll often see that they are written entirely in their own language. In a way, it's akin to how our spoken languages are structured: if you look in a dictionary, the definition of every word is just a collection of other words in that language. We are perpetually in a cycle of using our language to define our language, and as such it's hard to imagine the state of the language at its first implementation. It's no different with compilers, save for the technical specifics.

Another upside of bootstrapping your compiler is that you get native calls essentially for free. If your language's original compiler was written in, say, C, you have some flexibility when it comes to implementing features like IO. If your language defines a simple IO interface, the original compiler can generate the same machine instructions C does, without getting into the nitty-gritty of how to actually implement the system calls or data management. Once you've bootstrapped, you can use those calls through the simple IO interfaces your language provides; on the first pass you had to deal with the more complex C layer, which is itself far more convenient than writing your own machine-code subroutines.
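Here's that idea in miniature, using a JavaScript-hosted toy in place of C and machine code; every name is invented for illustration. If our language has a built-in `show`, the compiler can simply lower it onto the host's own IO, and our language never has to implement output from scratch.

```typescript
// A sketch of "free" native calls: rather than implementing output
// ourselves (system calls, buffering, encodings), lower our language's
// IO builtin straight onto the host runtime's IO.

function lowerBuiltins(source: string): string {
  // Map our toy language's `show(...)` onto the host's console.log(...).
  return source.split("show(").join("console.log(");
}

// A program in our toy language...
const program = `show("hello from NewLang")`;

// ...becomes a host program that reuses the host's IO machinery.
console.log(lowerBuiltins(program)); // console.log("hello from NewLang")
```

Once the compiler is self-hosted, its own source can call `show` too, and the same lowering carries the bootstrap's IO along for free.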

What Does This Mean For Me?

Clearly, the idea of bootstrapping a language's compiler isn't as wild as it might have seemed at the outset of this article: we are quite literally just rewriting our compiler after the initial version. What does this mean for you, the reader, then? Ultimately, not much. If you're interested in building a language, you may now have a better appreciation for the steps ahead of you before your language reaches full maturity.

If you’re not planning on building a language, though, you can take solace in the idea that the things you’re building and working on could become much bigger than themselves — they can be used to create great tools that one day are used as stepping stones for even greater tools. The languages that you’re writing in aren’t just some abstract implementation detail, but rather living, breathing, constantly updating specifications that can be used to recreate themselves, or even create something better.

If you take nothing else away from this exploration into bootstrapping, I hope it's that the technology we work on stands on the shoulders of giants. To understand the history of our tooling, and the design choices that led to its development, requires a deeper understanding of the technology all the way down, for it is that technology which spawned it in the first place, only to be rewritten and abstracted away.


Senior Software Engineer at Dandy. Rutgers University Computer Science alumnus. Lover of low-level systems, vintage computers, compilers, and 3D printing.