The continual evolution of Airtable’s codebase: Migrating a million lines of code to TypeScript
By Caleb Meredith and Andrew Wang
Airtable’s codebase follows an ethos of continual evolution. We’re unafraid of ripping out old code when it no longer meets our needs or introducing new technologies and patterns to reflect the current state of the art. Today, we’re going to provide an overview of some of the step-change moments in the history of our codebase, and dive deep into the largest of these changes: our migration to TypeScript. We’re also open-sourcing some of the tools we used to pull off this migration successfully.
Let there be light
This story starts at the very beginning, with the genesis commit of Airtable’s monorepo:
commit 681429d64232f44c6af1a1d838b91fe39d52edb0
Author: Howie <howie@Howie-Lius-MacBook-Air.local>
Date: Sun Apr 22 15:55:06 2012 -0700

    beginning state
In 2012, the idea of server-side JavaScript was still relatively new. 2012 was the same year the iPhone 5 was released, and Psy’s Gangnam Style became the first video on the internet to hit 1 billion views. Our founders were making a gamble on both JavaScript and the JavaScript ecosystem with Airtable’s initial tech stack.
While some of these original choices have aged well (Node.js, Express, and JavaScript itself), others have not (Backbone, Underscore.js, EJS, and jQuery). This is where we return to our ethos of continual evolution. Through continued refactoring efforts over the last nine years, we’ve ensured our codebase is written in a consistent, relatively modern dialect of JavaScript, even as we’ve grown to over a hundred engineers and over a million lines of code.
A core sample of our codebase: Expanded Record
As an example, Airtable has an Expanded Record feature that was introduced in Howie’s genesis commit from 2012. Since then the core functionality has stayed mostly the same, but the texture of the code has changed significantly. Here are the first 50 lines of the Expanded Record feature (known as Detail View internally) at several points over the last nine years:
These snippets provide a view into some of the major moments of Airtable’s codebase:
- 2012: First commit to the repository
- 2015: More people start working on the codebase, shared conventions established
- 2016: Browserify introduced, explicit CommonJS imports added
- 2018: Backbone style class converted to ES6 style class
- 2019: Custom component framework converted to React component
- 2019: Migrated from Flow to TypeScript, CommonJS imports replaced with ES6 imports
- 2021: Replaced createReactClass and mixins with class React components
The history of Expanded Record demonstrates our willingness to perform large-scale refactors that touch even the oldest parts of the codebase. We reject the idea of unmaintainable code as an inevitability. Instead, we see code quality as a responsibility borne by all of engineering. As an example, the above changes to Expanded Record were performed by different engineers from different teams, over the course of years.
In the remainder of this blog post, we’ll cover one of the best examples of our ethos of continual evolution: our migration to TypeScript. This is a story of how a single, highly motivated engineer was able to execute a refactor that touched nearly every file in our million+ line codebase, and how engineering as a whole has since taken up the mantle and continued to ship additional TypeScript-related enhancements.
Betting on the wrong horse
Airtable’s codebase started out as vanilla JavaScript. Much ink has been spilled on the benefits of static typing; suffice it to say that we agree. When we investigated incremental typing for JavaScript in 2016, it was a two-horse race between Flow and TypeScript. We went with Flow, because at the time it had better support for React.
By 2019, it was clear that we had bet on the wrong horse. TypeScript’s development velocity had far outstripped Flow, and TypeScript had demonstrable advantages in terms of features, IDE support, and community resources. One of us (Caleb), who had previously worked on Flow at Facebook, took on the project of converting our codebase to TypeScript.
Guiding principles
At this point, Airtable’s codebase weighed in at over a million lines of JavaScript. Given the scale, there was a lot of complexity to unravel. We oriented the migration effort around three guiding principles:
- Don’t break the product. Always choose to preserve the semantics of existing code, to avoid introducing customer-facing issues.
- Don’t reduce type safety. Every individual change in the migration must preserve or increase type safety compared to Flow. There may still be lots of unsafe code, but no change should reduce the type safety we previously had.
- Keep it simple. Migrating requires a lot of changes. Each individual change should be simple, and able to be reasoned about in the context of a single file. We can always follow up with smaller migrations for more complex transforms.
A big bang migration
Most TypeScript migrations are done incrementally: type-by-type and file-by-file. Since much of our codebase was already typed in Flow, we took a different approach: migrating the entire codebase to TypeScript at once, as a single big bang migration.
The first step was writing a codemod to perform purely mechanical transformations. There were existing codemods for converting Flow to TypeScript (example 1, example 2), but we wrote our own codemod with some additional features to meet our specific needs:
- Existing codemods didn’t change the module syntax. We first needed to convert CommonJS module syntax (require() and module.exports) to ES Module syntax (import and export).
- Some existing codemods misinterpreted Flow features. For example, they’d transform Flow’s cast expression (x: T) (which is covariant) to TypeScript’s cast expression x as T (which is bivariant), which is unsafe! Instead, our codemod uses a custom utility function, cast<T>(x), implemented as function cast<T>(x: T): T { return x }.
- We also wanted special handling for some Airtable-specific idioms. For example, we commonly used types like {[key: UserId]: string}, even though TypeScript doesn’t support custom key types in index signatures. So we transformed these types to Record<UserId, string> instead of {[key: string]: string}.
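To make the cast distinction concrete, here is a minimal sketch of the cast helper described above next to TypeScript’s as operator. The Animal and Dog types, the UserId alias, and the variable names are hypothetical illustrations, not Airtable code.

```typescript
// The helper from the post: the argument must be assignable to T,
// so it only permits conversions the type checker can prove safe.
function cast<T>(x: T): T {
  return x;
}

// Hypothetical types for illustration only.
type Animal = { name: string };
type Dog = { name: string; bark(): string };

const rex: Dog = { name: "Rex", bark: () => "Woof" };
const animal: Animal = { name: "Generic animal" };

// Upcast: Dog is assignable to Animal, so this compiles.
const upcast = cast<Animal>(rex);

// Downcast: `as` accepts it even though `animal` has no bark() at runtime.
const unsafeDog = animal as Dog;

// The same downcast through cast<T> is rejected by the type checker.
// @ts-expect-error: Animal is missing `bark`, so it is not assignable to Dog.
const rejected = cast<Dog>(animal);

// Hypothetical alias showing the Record<UserId, string> form the codemod emitted.
type UserId = string;
const displayNames: Record<UserId, string> = { usr1: "Ada" };
```

At runtime cast is an identity function, so swapping it in for a Flow cast leaves the compiled behavior unchanged, which fits the “Don’t break the product” principle.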
One of the most technically interesting (and unique) features of our codemod is how it handles unannotated function parameters. For example, consider the following function, which doesn’t specify a type for the parameter x:
function f(x) {
  return x * 2;
}
In this case, Flow is able to infer that x is a number based on context. TypeScript will not: it treats x as an implicit any, and in strict mode (which enables noImplicitAny) it reports a compile-time error.
Some of Airtable’s code leveraged this capability of Flow. Because one of our guiding principles was “Don’t reduce type safety”, annotating all these function parameters with any was unacceptable. So, the codemod executes flow type-at-pos to use Flow’s inferred type for every unannotated function parameter. As it turns out, most of the time Flow was inferring any anyway!
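The splicing half of that step can be sketched as follows. This is a simplified illustration rather than the open-sourced codemod itself: the InferredParam shape and the annotateParams function are hypothetical, and the inferred type is hard-coded where a real tool would ask Flow for it.

```typescript
// Hypothetical record of one unannotated parameter and the type inferred for it.
interface InferredParam {
  end: number; // offset just past the parameter name in the source text
  type: string; // the type to write, e.g. "number" or "any"
}

// Inserts ": <type>" after each parameter name.
function annotateParams(source: string, params: InferredParam[]): string {
  // Work from the last offset to the first so that earlier offsets stay valid
  // while text is being inserted.
  const sorted = params.slice().sort((a, b) => b.end - a.end);
  let output = source;
  for (const param of sorted) {
    output = output.slice(0, param.end) + ": " + param.type + output.slice(param.end);
  }
  return output;
}

const before = "function f(x) {\n  return x * 2;\n}";
// Assumption: in practice the type would come from Flow's answer for this
// position; here it is supplied by hand.
const xEnd = before.indexOf("x") + 1;
const after = annotateParams(before, [{ end: xEnd, type: "number" }]);
// after is "function f(x: number) {\n  return x * 2;\n}"
```

Because each edit is a local text insertion, the result can be reasoned about in the context of a single file, in line with the “Keep it simple” principle.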
As part of this blog post, we’re also happy to announce that we are open-sourcing our codemod! You can find it at github.com/Airtable/typescript-migration-codemod. If you’re interested in more deep technical details, we’ve also included an edited version of our internal documentation for all the changes. We hope that it is a useful reference for anyone doing a TypeScript migration, particularly if coming from Flow.
Rolling up our sleeves
The codemod performed the bulk of the necessary changes, modifying 3,300 files. However, this still left all of the changes that could not be handled automatically. At this point, running tsc showed over 15,000 TypeScript errors spread across 1,600 files, all of which required some degree of manual intervention.
Fortunately, we were able to rely on Caleb, our local type system expert. For about a week, he would come into work, sit down, and fix TypeScript errors. It was boring and tedious, but better than littering the codebase with // @ts-ignore comments. This is why our guiding principles mattered. We refused to regress on type safety for existing Flow code (remember: Don’t reduce type safety), but more aggressive refactors to improve type safety could also be dangerous (remember: Don’t break the product). Since not breaking the product was paramount, adding a // @ts-ignore was sometimes the best solution.
All of this work was done on a separate branch. Once we had a branch that passed the typechecker and automated tests, it came time to land the changes on our main development branch.
Since it’s impossible to manually review a change that touches more than 1,600 files (and more than 48,000 modified lines), we instead used a combination of techniques:
- Documented the 14 automated transformations and 17 classes of manual transformations and asked engineers across the company to review that document.
- Assigned chunks of up to 10 files to be code reviewed by experts in the affected areas.
- Code reviewed the diff of compiled JavaScript bundles before and after the change. We didn’t change our compilation stack (Webpack and Babel) so there were only a few trivial changes to the compiled bundles.
On October 30, 2019 we locked our main branch, reran the automated codemods, and merged the TypeScript branch. We’ve been on TypeScript ever since.
Now that the dust has settled
Because one of the migration principles was “Keep it simple,” many ideas for additional improvement were deferred to future work. At the time, Caleb wrote a document with about twenty of these ideas. We’re proud that over the last two years, more than half of those ideas have been independently implemented by engineers across the company. Sharing just a few examples:
- In the code snippets we shared above, you can see how in early 2021 we finally converted our createReactClass components to React ES6 classes. This was a difficult undertaking because of our prevalent use of createReactClass mixins. It was done by a member of the developer efficiency team after they found that the custom types for createReactClass were significantly slowing down TypeScript build times.
- A member of our Automations team wrote a helper method to generate a TypeScript type given a schema definition for our internal object schema validation framework. Previously, we would maintain both a schema and a TypeScript type for an object, which could lead to inconsistencies.
- A member of our Enterprise team converted all file extensions to .tsx. As a team, we reasoned that .ts and .tsx represented two dialects of the TypeScript language and decided we only wanted to write code in one dialect.
- One of our founders enhanced our MySQL database access layer to return typed query results. They also upgraded all our // @ts-ignore comments to // @ts-expect-error after TypeScript 3.9 was released.
- A member of our Ecosystems team is using a codemod to prepare the codebase for enabling --noUncheckedIndexedAccess, a new feature in TypeScript 4.1.
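As one illustration of the schema-derived types idea from the list above, a mapped type can compute an object type from a schema value so that the two cannot drift apart. This is a minimal sketch under assumptions: the FieldKind vocabulary, the TypeOf helper, and the validate and describeUser functions are hypothetical and do not reflect Airtable’s internal validation framework.

```typescript
// Hypothetical schema vocabulary: each field is tagged with a primitive kind.
type FieldKind = "string" | "number" | "boolean";
type Schema = Record<string, FieldKind>;

// Maps each kind tag to the TypeScript type it stands for.
type KindToType = { string: string; number: number; boolean: boolean };

// Derives an object type from a schema, so the schema is the single source of truth.
type TypeOf<S extends Schema> = { [K in keyof S]: KindToType[S[K]] };

// `as const` keeps the literal kind tags instead of widening them to string.
const userSchema = { id: "string", age: "number" } as const;
type User = TypeOf<typeof userSchema>; // { readonly id: string; readonly age: number }

// Runtime check driven by the same schema; narrows `value` on success.
function validate<S extends Schema>(schema: S, value: unknown): value is TypeOf<S> {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const kinds: Schema = schema;
  const record = value as Record<string, unknown>;
  return Object.keys(kinds).every((key) => typeof record[key] === kinds[key]);
}

// After validation, the fields are typed without a hand-written User interface.
function describeUser(value: unknown): string {
  if (!validate(userSchema, value)) {
    return "invalid user";
  }
  const user: User = value;
  return user.id + " is " + user.age;
}
```

The design choice here is that only the schema is written by hand: adding a field to userSchema changes both the runtime check and the User type at once.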
Conclusion
Low-code/no-code application development platforms are a secular trend, and we believe that Airtable will continue to innovate in this space in the decades to come. This means that most of Airtable’s engineers are yet to be hired, and most of Airtable’s code is yet to be written, by many orders of magnitude. These beliefs are why we take code quality seriously and continually evolve Airtable’s codebase.
The TypeScript migration was one of the largest refactors in the history of our codebase, and it will certainly not be the last. Engineers at Airtable are empowered (and encouraged!) to pursue projects like this, and we view the long-term health of our codebase as a responsibility shouldered by all of engineering.
If you join Airtable, we encourage you to look at the git history for the Expanded Record component we shared above. While some legacy references to jQuery still remain, working with that component, and the rest of our codebase, is much better today than if we didn’t continually choose progress.