From Slow to Swift: A Case Study on Enhancing TypeScript Transpilation Efficiency

Akshay Jain
5 min read · May 27, 2023

Introduction

We had a TypeScript & Node.js project that developers actively committed to, adding new features.

All was going great until, out of the blue, we observed that the transpilation process using tsc was consuming a lot of RAM (exceeding 2 GB at times!).

Since our EC2 server was low on memory, the build failed there with the following error:

<--- Last few GCs --->

[1454203:0x687ce90] 42670 ms: Scavenge (reduce) 970.6 (988.1) -> 969.7 (988.1) MB, 2.4 / 0.0 ms (average mu = 0.117, current mu = 0.076) allocation failure
[1454203:0x687ce90] 42718 ms: Scavenge (reduce) 970.6 (988.1) -> 969.8 (988.4) MB, 14.6 / 0.0 ms (average mu = 0.117, current mu = 0.076) allocation failure


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0xb0a860 node::Abort() [node]
2: 0xa1c193 node::FatalError(char const*, char const*) [node]
3: 0xcf9a6e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xcf9de7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
......
Aborted (core dumped)

This problem can be worked around by increasing the maximum heap space a Node.js process may use:

NODE_OPTIONS='--max-old-space-size=<some_decent_memory_in_mb>' tsc --project ./

where the value is in megabytes, e.g. 2048.

But once that "some decent memory" exceeded the RAM available on the machine, it was no longer possible to transpile the code at all.

As a workaround, we used to scp the build from our local machine to the server, but we realized the problem had to be fixed properly.

In this blog, I’ll talk about how the problem was debugged, fixed, and the lessons learned.

Project Structure

The project in question is a Node.js application written in TypeScript, with the following directory structure.

Project/
├── src/
│   └── All TypeScript files (*.ts)
├── node_modules/
├── dist/
│   └── All JavaScript files after build (*.js, *.js.map)
├── package.json
└── tsconfig.json

To build the project we used the tsc compiler, using the following command at the root of the project:

tsc --project ./

The above command reads tsconfig.json at the specified path, type-checks and transpiles the *.ts files under the src/ directory (reading any *.d.ts files they reference), and writes the generated JavaScript files to the dist/ directory.
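For reference, a minimal tsconfig.json consistent with this layout might look like the following. The exact option values are an assumption for illustration; the project's real config is not shown in this post.

```json
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "rootDir": "./src",
    "outDir": "./dist",
    "sourceMap": true,
    "strict": true
  },
  "include": ["src/**/*"]
}
```

The sourceMap option accounts for the *.js.map files in dist/, and rootDir/outDir map src/ to dist/ as described above.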

It was this build step that consumed excessive time and memory, even though the project was not that big!

Looking into Metrics

To check how many files tsc actually transpiled, the following command was used:

tsc -p ./ --extendedDiagnostics

It generated the following output (only the relevant parts are shown here):

Project
Files: 1751
Lines of Library: 27421
Lines of Definitions: 2545352
Lines of TypeScript: 7889
Lines of JavaScript: 0
Memory used: 1693571K
Assignability cache size: 15230
I/O Read time: 0.52s
Parse time: 17.36s
Program time: 18.73s
Bind time: 5.44s
Check time: 3.42s
Total time: 28.32s

Seeing the report, I knew immediately something was off.

Under no circumstances should this project be working through 1700+ files: the source code written by developers contains ~150 files, and we don't depend on that many external modules.

Before I get to the next part, let's understand how tsc handles module imports such as:

//A Typical Import Statement
import {a} from "something"

The module something resides inside the node_modules directory. During transpilation, if the imported module has associated TypeScript type definitions (usually provided through @types packages or *.d.ts files), the TypeScript compiler uses them to provide type information and ensure type safety.

This implies the compiler may read every *.d.ts file the imported package pulls in, including some that are not required at all, and this was the source of our problems.

Pinpointing the problem

Since we knew the problem was too many files being read, the next step was to list those files and find the culprit.

So we ran

tsc -p ./ --listFiles > compiled-files.txt

The command tsc -p ./ --listFiles lists every file being read; we redirected the output into compiled-files.txt so we could search and run operations on it.

The compiled-files.txt looked as follows:

/<path_to_project>/src/index.ts
/<path_to_project>/src/sub_directory1/file1.ts
/<path_to_project>/node_modules/<package_1>/<sub_directory0>/index.d.ts
/<path_to_project>/node_modules/<package_2>/<sub_directory>/index.d.ts
/<path_to_project>/node_modules/<package_1>/<sub_directory1>/index.d.ts
/<path_to_project>/node_modules/<package_1>/<sub_directory2>/index.d.ts
/<path_to_project>/node_modules/<package_1>/<sub_directory>/index.d.ts
.... So on

We observed that a lot of files were being pulled in from node_modules. A quick grep -c "<path_prefix>" compiled-files.txt helped us count how many files came from the src directory and how many from node_modules.
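The counting step can be reproduced on a tiny sample of --listFiles output. The paths below are illustrative stand-ins, not the real project's:

```shell
# Create a small sample of tsc --listFiles output (illustrative paths).
cat > compiled-files.txt <<'EOF'
/proj/src/index.ts
/proj/src/api/users.ts
/proj/node_modules/pkg-a/index.d.ts
/proj/node_modules/pkg-b/lib/index.d.ts
/proj/node_modules/pkg-b/lib/types.d.ts
EOF

# grep -c prints the number of matching lines.
grep -c "/proj/src/" compiled-files.txt           # files from our own source
grep -c "/proj/node_modules/" compiled-files.txt  # files pulled in from dependencies
```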

The statistics looked as follows

src ts files: 125
node_modules: 1626

Now we were pretty sure that some package we imported in our source code contained a lot of *.d.ts files or type declarations that were not required. Our next task was to identify the culprit.

To count the number of files read per package, we ran the following command on the compiled-files.txt file we had created:

awk -F '/' '{print $8}' compiled-files.txt | sort | uniq -c | sort -nr

The awk command splits each line in the file on the / character, and {print $8} prints the 8th field.

Note: the 8th field is simply where the package name falls in my paths. Keep in mind that the leading / makes awk's first field empty; for example, if your path is /home/code/project/node_modules/package_name, the package name is in the 6th field.

The first sort groups identical package names together, uniq -c counts how many times each one appears, and the final sort -nr orders those counts numerically in descending order.
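Here is the whole pipeline worked through on a small sample, using the shorter /home/code/project prefix from the note above (so the package name is the 6th field); the paths and package contents are illustrative:

```shell
# Sample of --listFiles output under /home/code/project (illustrative).
cat > compiled-files.txt <<'EOF'
/home/code/project/node_modules/googleapis/build/src/apis/sheets/index.d.ts
/home/code/project/node_modules/googleapis/build/src/apis/drive/index.d.ts
/home/code/project/node_modules/aws-sdk/clients/s3.d.ts
EOF

# The leading "/" makes $1 empty, so the package name lands in field 6.
awk -F '/' '{print $6}' compiled-files.txt | sort | uniq -c | sort -nr
```

For this sample the pipeline prints googleapis with a count of 2 at the top, followed by aws-sdk with a count of 1.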

The output was as follows. Drumroll…

677 googleapis
396 aws-sdk
273 @slack
75 @types
57 bullmq
.... and a bunch of other libraries

That was our eureka moment: we immediately realized that our code supports Google OAuth, and just for OAuth we imported the entire googleapis package, as follows:

//Just to perform OAuth, all of googleapis were imported <face_palm_emoji>
import {google} from 'googleapis'

We rewrote the OAuth part, this time using google-auth-library, and ran the same diagnostics again.
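The rewritten flow needs only google-auth-library's OAuth2Client. A minimal sketch of that approach, not our exact code; the environment variable names, redirect URI, and scope below are placeholder assumptions:

```typescript
// OAuth via google-auth-library alone, instead of the whole googleapis package.
// Client credentials, redirect URI, and scope are placeholders.
import { OAuth2Client } from "google-auth-library";

const client = new OAuth2Client(
  process.env.GOOGLE_CLIENT_ID,
  process.env.GOOGLE_CLIENT_SECRET,
  "https://example.com/oauth/callback"
);

// Build the consent-screen URL to redirect the user to.
const authUrl = client.generateAuthUrl({
  access_type: "offline",
  scope: ["https://www.googleapis.com/auth/userinfo.email"],
});

// After the user is redirected back with ?code=..., exchange it for tokens.
async function handleCallback(code: string) {
  const { tokens } = await client.getToken(code);
  client.setCredentials(tokens);
  return tokens;
}
```

Since google-auth-library ships only the auth client and its typings, tsc no longer has to read the type definitions for every Google API surface.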

After refactoring away the blanket googleapis import, the diagnostics looked like this:

Files: 1063
Lines of Library: 27421
Lines of Definitions: 856361
Lines of TypeScript: 7887
Memory used: 874829K
Assignability cache size: 15231
Parse time: 7.16s
Program time: 7.93s
Bind time: 2.67s
Check time: 3.68s
....
Total time: 14.92s

Just by refactoring out googleapis, we immediately saw the build time cut in half!

So Learnings?

Well, as everyone always says, be careful about what you are importing. This is a prime example of what happens when you are not aware of what's inside the package you are importing.

googleapis is a huge package that contains submodules for Sheets, Drive, and much more. The same goes for aws-sdk.

The npm docs suggest importing submodules to reduce startup time (link to the relevant documentation).

For complex projects with huge, bundled dependencies, it is imperative that developers import only the modules that are necessary, narrowing the scope as much as possible.

Otherwise, the project can be plagued by problems like the ones above.

To explore computer science concepts that we use in our everyday lives, be sure to check out my Medium page.

Additionally, I share my exciting work in building developer tools, SaaS applications, and writing insightful blogs on Twitter and LinkedIn. Connect with me on these platforms to stay updated and join the engaging discussions.

Happy coding :)


Akshay Jain

Software engineer, fascinated by backend technologies. Goes by i-rebel-aj on most social platforms