Monorepo Insights: Nx, Turborepo, and PNPM (3/4)

Héla Ben Khalfallah
Published in ekino-france
11 min read · Jul 19, 2024

Exploring the strengths and weaknesses of today’s top monorepo solutions

Monorepo Mosaic: Harmony in Code (Image licenced to the author)

Introduction

This article is part of a series comparing the features, performance, and suitability of Nx, PNPM, and Turborepo for our projects.

Having laid the groundwork in monorepo management and build systems, and thoroughly investigated Nx, we now turn our attention to uncovering the capabilities of Turborepo. ✨

As a reminder, our goal is to select a champion who will streamline our development workflow and improve our codebase management.

May the best monorepo win! Here’s the challenge we’ll conquer together:
· Turborepo under the microscope
Packages and Task Graph
Local caching
Remote caching
Turborepo vs. Nx
· Hands-on Turborepo
Turborepo Locally in Action
Remote Caching in Action
· Technical verdict
Our insights
Real-World Insights: Turborepo in the Wild
· Conclusion

Curious about what’s next? Come along and let’s discover it together! 🚀

Turborepo under the microscope

Packages and Task Graph

🔳 Turborepo uses both a Package Graph and a Task Graph to efficiently manage and execute tasks within a monorepo.

🔳 The Package Graph automatically maps out the relationships between internal packages by analyzing their package.json dependencies.

https://vercel.com/blog/finishing-turborepos-migration-from-go-to-rust

🔳 When referencing internal packages in package.json, you use the package manager's workspace syntax, which is similar to how you reference external dependencies from NPM. Here’s an example:

// pnpm: ./apps/web/package.json
{
  "dependencies": {
    "@repo/ui": "workspace:*"
  }
}

🔳 The Package Graph is the foundation of the Task Graph, which defines how individual tasks depend on each other.

🔳 The Task Graph is a directed acyclic graph (DAG), where nodes represent tasks and edges represent dependencies. This ensures correct execution order and enables parallel task execution.

https://ogzhanolguncu.com/blog/monorepo-with-turborepo/
https://www.maxpou.fr/blog/turborepo/

🔳 Task dependencies are explicitly defined in turbo.json. Example:

// turbo.json
{
  "tasks": {
    "build": {
      "dependsOn": ["^build"] // Run the build task of every workspace this one depends on first
    }
  }
}
https://turbo.build/repo/docs/core-concepts/package-and-task-graph#package-graph

🔳 For large projects, Turborepo intelligently calculates the intricate project graph in the background using a daemon process, significantly reducing startup overhead.

🔳 Tasks are ordered for execution using topological sorting, ensuring dependencies are met.

Topological dependencies specify that your package’s dependencies should execute their tasks before your package executes its own task. — https://turbo.build/repo/docs/messages/missing-root-task-in-turbo-json#why-this-error-occurred
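To make this ordering concrete, here is a minimal TypeScript sketch of how a "^build" dependency could be expanded into task-level edges and then sorted into waves that can run in parallel. It is not Turborepo's implementation: PackageGraph, buildTaskGraph, executionWaves, and the sample workspace are hypothetical names chosen for illustration.

```typescript
// Hypothetical types and names: none of them come from Turborepo's source.
type PackageGraph = Record<string, string[]>; // package -> internal dependencies

// Expand `"dependsOn": ["^build"]` into task-level edges:
// "web#build" depends on "<dep>#build" for every internal dependency of web.
function buildTaskGraph(packages: PackageGraph, task: string): Map<string, string[]> {
  const graph = new Map<string, string[]>();
  for (const [pkg, deps] of Object.entries(packages)) {
    graph.set(`${pkg}#${task}`, deps.map((dep) => `${dep}#${task}`));
  }
  return graph;
}

// Kahn's algorithm, grouped into "waves": every task in a wave has all of its
// dependencies satisfied by earlier waves, so a wave can run in parallel.
function executionWaves(graph: Map<string, string[]>): string[][] {
  const remaining = new Map<string, Set<string>>();
  for (const [task, deps] of graph) remaining.set(task, new Set(deps));

  const waves: string[][] = [];
  while (remaining.size > 0) {
    const ready = [...remaining.keys()]
      .filter((task) => remaining.get(task)!.size === 0)
      .sort();
    if (ready.length === 0) throw new Error("cycle detected in task graph");
    for (const task of ready) remaining.delete(task);
    for (const deps of remaining.values()) {
      for (const task of ready) deps.delete(task);
    }
    waves.push(ready);
  }
  return waves;
}

// Sample workspace (assumed layout, loosely based on the create-turbo starter).
const packages: PackageGraph = {
  "web": ["@repo/ui"],
  "docs": ["@repo/ui"],
  "@repo/ui": ["@repo/typescript-config"],
  "@repo/typescript-config": [],
};

const waves = executionWaves(buildTaskGraph(packages, "build"));
console.log(waves);
```

In this sample, the shared configuration package builds first, then the UI library, and finally docs and web end up in the same wave because neither depends on the other.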

🔳 Turborepo employs Tarjan’s strongly connected components algorithm to detect circular dependencies in its graphs, plus a separate check for nodes that depend on themselves, so that invalid configurations fail early:

// https://github.com/vercel/turbo/blob/main/crates/turborepo-graph-utils/src/lib.rs#L29
// https://github.com/vercel/turbo/pull/5566/files

pub fn validate_graph<G: Display>(graph: &Graph<G, ()>) -> Result<(), Error> {
    // This is equivalent to AcyclicGraph.Cycles from Go's dag library
    let cycles_lines = petgraph::algo::tarjan_scc(&graph)
        .into_iter()
        .filter(|cycle| cycle.len() > 1)
        .map(|cycle| {
            let workspaces = cycle.into_iter().map(|id| graph.node_weight(id).unwrap());
            format!("\t{}", workspaces.format(", "))
        })
        .join("\n");

    // If any cycles were detected, return an error with the cycle details.
    if !cycles_lines.is_empty() {
        return Err(Error::CyclicDependencies(cycles_lines));
    }

    for edge in graph.edge_references() {
        if edge.source() == edge.target() {
            let node = graph
                .node_weight(edge.source())
                .expect("edge pointed to missing node");
            return Err(Error::SelfDependency(node.to_string()));
        }
    }

    Ok(())
}
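For readers less familiar with Rust, the same idea can be sketched in TypeScript. This is an illustrative re-implementation, not a port of Turborepo's code: stronglyConnectedComponents and validateGraph are hypothetical names, the graph is a plain adjacency map, and unlike the Rust function this version collects every problem instead of returning on the first one.

```typescript
// Tarjan's strongly connected components over an adjacency list.
function stronglyConnectedComponents(graph: Map<string, string[]>): string[][] {
  let counter = 0;
  const index = new Map<string, number>();
  const lowlink = new Map<string, number>();
  const stack: string[] = [];
  const onStack = new Set<string>();
  const components: string[][] = [];

  const visit = (node: string): void => {
    index.set(node, counter);
    lowlink.set(node, counter);
    counter++;
    stack.push(node);
    onStack.add(node);

    for (const next of graph.get(node) ?? []) {
      if (!index.has(next)) {
        visit(next);
        lowlink.set(node, Math.min(lowlink.get(node)!, lowlink.get(next)!));
      } else if (onStack.has(next)) {
        lowlink.set(node, Math.min(lowlink.get(node)!, index.get(next)!));
      }
    }

    // `node` is the root of a component: pop the stack down to it.
    if (lowlink.get(node) === index.get(node)) {
      const component: string[] = [];
      let member: string;
      do {
        member = stack.pop()!;
        onStack.delete(member);
        component.push(member);
      } while (member !== node);
      components.push(component);
    }
  };

  for (const node of graph.keys()) {
    if (!index.has(node)) visit(node);
  }
  return components;
}

// Same two checks as the Rust snippet: a component with more than one node is
// a cycle, and a node that lists itself as a dependency is a self-dependency.
function validateGraph(graph: Map<string, string[]>): string[] {
  const errors: string[] = [];
  for (const component of stronglyConnectedComponents(graph)) {
    if (component.length > 1) {
      errors.push(`cycle: ${[...component].sort().join(", ")}`);
    }
  }
  for (const [node, deps] of graph) {
    if (deps.includes(node)) errors.push(`self-dependency: ${node}`);
  }
  return errors;
}

// Made-up graph: a -> b -> c -> a is a cycle, d depends on itself.
const cyclic = new Map<string, string[]>([
  ["a", ["b"]],
  ["b", ["c"]],
  ["c", ["a"]],
  ["d", ["d"]],
]);
console.log(validateGraph(cyclic));
```

The separate self-dependency loop is needed because a single node always forms a component of size one, whether or not it points at itself.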

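🔳 Turborepo uses a depth-first search (DFS) over the Package Graph to collect everything reachable from a set of packages: their dependencies or, by walking the reversed graph, their dependents. Knowing what a change affects avoids unnecessary rebuilds and speeds up development cycles: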

// https://github.com/vercel/turbo/blob/main/crates/turborepo-repository/src/package_graph/mod.rs#L371

match direction {
    petgraph::Direction::Outgoing => depth_first_search(&self.graph, indices, visitor),
    petgraph::Direction::Incoming => {
        depth_first_search(Reversed(&self.graph), indices, visitor)
    }
};
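The two directions can be illustrated with a small TypeScript sketch. It assumes a plain adjacency map rather than petgraph, and reverseGraph, reachableFrom, and the sample workspace are hypothetical names used only for this example.

```typescript
type Graph = Map<string, string[]>; // package -> internal packages it depends on

// Flip every edge so that we can walk from a package to its dependents.
function reverseGraph(graph: Graph): Graph {
  const reversed: Graph = new Map();
  for (const node of graph.keys()) reversed.set(node, []);
  for (const [node, deps] of graph) {
    for (const dep of deps) {
      if (!reversed.has(dep)) reversed.set(dep, []);
      reversed.get(dep)!.push(node);
    }
  }
  return reversed;
}

// Iterative depth-first search collecting everything reachable from `starts`
// (the starting nodes themselves are included).
function reachableFrom(graph: Graph, starts: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...starts];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (seen.has(node)) continue;
    seen.add(node);
    for (const next of graph.get(node) ?? []) stack.push(next);
  }
  return seen;
}

// Made-up workspace for illustration.
const workspace: Graph = new Map<string, string[]>([
  ["web", ["@repo/ui"]],
  ["docs", ["@repo/ui"]],
  ["@repo/ui", []],
  ["api", []],
]);

// "Outgoing": what does `web` need? "Incoming": who is affected by `@repo/ui`?
const neededByWeb = [...reachableFrom(workspace, ["web"])].sort();
const affectedByUi = [...reachableFrom(reverseGraph(workspace), ["@repo/ui"])].sort();
console.log(neededByWeb, affectedByUi);
```

Here a change in @repo/ui reaches docs and web but not api, which is exactly the kind of pruning that keeps unaffected packages out of a run.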

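🔳 In watch mode, Turborepo uses the Floyd-Warshall algorithm to compute the distances between tasks on the reversed Task Graph. This lets it keep only the tasks that depend on a changed package and re-run just that portion of the graph, at an O(V^3) cost that the source code itself flags as a potential bottleneck: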

// https://github.com/vercel/turbo/blob/main/crates/turborepo-lib/src/engine/mod.rs#L160
impl Engine<Built> {
    /// Creates an instance of `Engine` that only contains tasks that depend on
    /// tasks from a given package. This is useful for watch mode, where we
    /// need to re-run only a portion of the task graph.
    pub fn create_engine_for_subgraph(
        &self,
        changed_packages: &HashSet<PackageName>,
    ) -> Engine<Built> {
        let entrypoint_indices: Vec<_> = changed_packages
            .iter()
            .flat_map(|pkg| self.package_tasks.get(pkg))
            .flatten()
            .collect();

        // We reverse the graph because we want the *dependents* of entrypoint tasks
        let mut reversed_graph = self.task_graph.clone();
        reversed_graph.reverse();

        // This is `O(V^3)`, so in theory a bottleneck. Running dijkstra's
        // algorithm for each entrypoint task could potentially be faster.
        let node_distances = petgraph::algo::floyd_warshall::floyd_warshall(&reversed_graph, |_| 1)
            .expect("no negative cycles");

        let new_graph = self.task_graph.filter_map(
            |node_idx, node| {
                if let TaskNode::Task(task) = &self.task_graph[node_idx] {
                    // We only want to include tasks that are not persistent
                    let def = self
                        .task_definitions
                        .get(task)
                        .expect("task should have definition");

                    if def.persistent {
                        return None;
                    }
                }

                ...
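Since the excerpt above is truncated, the following TypeScript sketch only illustrates the general idea under an assumption: that a task is kept when it sits at a finite distance from an entrypoint task on the reversed graph. The names unitDistances, taskIds, and tasksToRerun are hypothetical and do not come from Turborepo.

```typescript
// All-pairs shortest path lengths with unit edge weights (Floyd-Warshall).
function unitDistances(
  nodes: string[],
  edges: Array<[string, string]>,
): Map<string, Map<string, number>> {
  const dist = new Map<string, Map<string, number>>();
  for (const a of nodes) {
    const row = new Map<string, number>();
    for (const b of nodes) row.set(b, a === b ? 0 : Infinity);
    dist.set(a, row);
  }
  for (const [from, to] of edges) {
    if (from !== to) dist.get(from)!.set(to, 1);
  }

  // Classic triple loop: allow paths that go through intermediate node k.
  for (const k of nodes) {
    for (const i of nodes) {
      for (const j of nodes) {
        const viaK = dist.get(i)!.get(k)! + dist.get(k)!.get(j)!;
        if (viaK < dist.get(i)!.get(j)!) dist.get(i)!.set(j, viaK);
      }
    }
  }
  return dist;
}

// Made-up task graph for illustration.
const taskIds = ["ui#build", "web#build", "docs#build", "api#build"];

// Task graph edges point from a task to the task it depends on.
const dependsOnEdges: Array<[string, string]> = [
  ["web#build", "ui#build"],
  ["docs#build", "ui#build"],
];

// Reverse the edges so that distances run from a task to its dependents.
const reversedEdges = dependsOnEdges.map(([from, to]): [string, string] => [to, from]);
const distances = unitDistances(taskIds, reversedEdges);

// Assumption: keep every task reachable from a task of a changed package.
const entrypoints = ["ui#build"];
const tasksToRerun = taskIds.filter((task) =>
  entrypoints.some((entry) => distances.get(entry)!.get(task)! < Infinity),
);
console.log(tasksToRerun);
```

As the comment in the Rust source suggests, a per-entrypoint traversal would give the same reachability answer for less work; the all-pairs version is simply the more direct thing to write.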

💡 Pro Tip: To visualize the Task Graph, Turborepo offers the --graph option, optionally followed by an output file name, to gain insight into the monorepo’s build structure:

turbo run build --graph
turbo run build test lint --graph=my-graph.svg

Let’s move on to the caching mechanisms in depth. 🌟

Local caching

🔳 Turborepo hashes task inputs (including dependencies, environment variables, and source code) and stores the build outputs. This means if we haven’t changed anything, the task completes almost instantly!

https://vercel.com/blog/finishing-turborepos-migration-from-go-to-rust#hashing-tasks-for-the-run-command

🔳 Turborepo caches results locally in the .turbo/cache directory.

🔳 Turborepo performs caching on two levels:

  • Global: Detects changes that affect the entire repository.
  • Task: Identifies changes specific to each task (incorporates the global hash for context).
https://turbo.build/repo/docs/crafting-your-repository/caching

💡 Turborepo caches by default.
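The two hashing levels can be pictured with a short TypeScript sketch. Everything in it is an assumption made for illustration: the function names, the choice of inputs, and the FNV-1a hash, which is a tiny non-cryptographic stand-in used to keep the example dependency-free. Turborepo's real hashing is not reproduced here.

```typescript
// FNV-1a (32-bit): a stand-in hash for the example, NOT what Turborepo uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Serialize inputs with sorted keys so the hash ignores insertion order.
function stableStringify(inputs: Record<string, string>): string {
  return JSON.stringify(Object.keys(inputs).sort().map((key) => [key, inputs[key]]));
}

// Global level: inputs that affect the entire repository.
function globalHash(globalInputs: Record<string, string>): string {
  return fnv1a(stableStringify(globalInputs));
}

// Task level: task-specific inputs, with the global hash folded in for context.
function taskHash(globalKey: string, taskId: string, taskInputs: Record<string, string>): string {
  return fnv1a(JSON.stringify([globalKey, taskId, stableStringify(taskInputs)]));
}

// Hypothetical inputs: file names mapped to their contents, plus one env var.
const globalInputs = { "pnpm-lock.yaml": "lockfile-v1", "turbo.json": "config-v1" };
const webInputs = { "src/index.ts": "export const a = 1;", "env:API_URL": "https://example.test" };

const before = taskHash(globalHash(globalInputs), "web#build", webInputs);
const unchanged = taskHash(globalHash(globalInputs), "web#build", { ...webInputs });
const afterEdit = taskHash(globalHash(globalInputs), "web#build", {
  ...webInputs,
  "src/index.ts": "export const a = 2;",
});
const afterLockfileChange = taskHash(
  globalHash({ ...globalInputs, "pnpm-lock.yaml": "lockfile-v2" }),
  "web#build",
  webInputs,
);
console.log({ before, unchanged, afterEdit, afterLockfileChange });
```

Identical inputs give the same key, so the task can be restored from the cache, while editing a source file changes only that task's key and touching a global input changes the key of every task.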

Remote caching

🔳 A remote and shared cache server (e.g., Vercel’s Remote Caching or a custom solution) is designated as a central repository for storing build outputs:

https://turbo.build/repo/docs/core-concepts/remote-caching#a-single-shared-cache

🔳 A unique hash representing all the task’s inputs (code, dependencies, environment, etc.) is computed by Turborepo before running a task. This hash is then compared against those already stored in the remote cache.

To use these hashes for your cache, we store outputs of your task in a tar file, a special file type that compresses an entire folder into one file. We store that file in both the local filesystem and in Vercel Remote Cache, indexed by this task hash. — https://vercel.com/blog/finishing-turborepos-migration-from-go-to-rust#hashing-tasks-for-the-run-command

🔳 If a match is found, the cached artifact is downloaded in milliseconds, eliminating the need for task re-execution and saving valuable time.

🔳 If no match is found, the task is executed normally, and its output is then uploaded to the remote cache for future use by others.

When it comes time to run a task, we first produce this task hash, and check if it’s in either the filesystem or the Remote Cache. If it is, we restore the task’s outputs in milliseconds. Otherwise, we run the task and store the outputs in the cache for next time. — https://vercel.com/blog/finishing-turborepos-migration-from-go-to-rust#hashing-tasks-for-the-run-command
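The lookup order described in the quote can be sketched in TypeScript as follows. This is a simplified model, not Turborepo's code: CacheStore, InMemoryStore, and runWithCache are hypothetical names, artifacts are plain strings instead of tar files, and copying a remote hit into the local store is an assumption of this sketch.

```typescript
interface CacheStore {
  get(hash: string): Promise<string | undefined>;
  put(hash: string, artifact: string): Promise<void>;
}

// In-memory stand-in for both the local filesystem cache and the remote cache.
class InMemoryStore implements CacheStore {
  private readonly entries = new Map<string, string>();
  async get(hash: string): Promise<string | undefined> {
    return this.entries.get(hash);
  }
  async put(hash: string, artifact: string): Promise<void> {
    this.entries.set(hash, artifact);
  }
}

type RunResult = { artifact: string; source: "local" | "remote" | "executed" };

async function runWithCache(
  hash: string,
  execute: () => Promise<string>,
  local: CacheStore,
  remote: CacheStore,
): Promise<RunResult> {
  const fromLocal = await local.get(hash);
  if (fromLocal !== undefined) return { artifact: fromLocal, source: "local" };

  const fromRemote = await remote.get(hash);
  if (fromRemote !== undefined) {
    await local.put(hash, fromRemote); // assumption: keep a local copy for next time
    return { artifact: fromRemote, source: "remote" };
  }

  // Cache miss everywhere: run the task, then store its output in both caches.
  const artifact = await execute();
  await local.put(hash, artifact);
  await remote.put(hash, artifact);
  return { artifact, source: "executed" };
}

// Two machines with their own local cache and one shared remote cache.
async function demo(): Promise<string[]> {
  const remote = new InMemoryStore();
  const machineA = new InMemoryStore();
  const machineB = new InMemoryStore();
  const build = async () => "dist.tar";

  const sources: string[] = [];
  sources.push((await runWithCache("abc123", build, machineA, remote)).source);
  sources.push((await runWithCache("abc123", build, machineA, remote)).source);
  sources.push((await runWithCache("abc123", build, machineB, remote)).source);
  sources.push((await runWithCache("abc123", build, machineB, remote)).source);
  return sources;
}

demo().then((sources) => console.log(sources));
```

The first machine pays for the build once; the second machine never executes the task and only downloads the artifact, which is where the time savings of a shared cache come from.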

Let’s examine the differences between Nx and Turborepo in detail to enhance our understanding. 🌟

Turborepo vs. Nx

🔳 Based on the insights of monorepo.tools and what we have seen so far, we can draw up a comparison of Nx and Turborepo:

Turborepo vs. Nx (Image by the author)

🔳 Nx:

  • Comprehensive monorepo management
  • Strict tooling consistency
  • Built-in code generation & plugins
  • More project control
  • Steeper learning curve

🔳 Turborepo:

  • Lightning-fast builds
  • Flexible tooling
  • Simpler to learn
  • Vercel integration
  • Lacks built-in code generation & plugins

📌 Nx offers a full-featured monorepo experience with more control, while Turborepo prioritizes build speed and simplicity.

📌 Choosing the optimal tool for the monorepo is a nuanced decision, not a one-size-fits-all answer. The right fit between Nx and Turborepo requires a thorough assessment of project scale and complexity, team structure and collaboration, desired level of control, performance requirements, and additional tooling needs.

Now that we have delved into the fundamentals of Turborepo, it’s time to put our knowledge into practice. 🚀

Hands-on Turborepo

Turborepo Locally in Action

✔️ turbo can be installed either globally or locally:

pnpm install turbo --global
pnpm add turbo --save-dev --ignore-workspace-root-check

✔️ We can start from scratch or from an example:

% npx create-turbo@latest

Need to install the following packages:
create-turbo@2.0.6
Ok to proceed? (y)

>>> Creating a new Turborepo with:

Application packages
- apps/docs
- apps/web

Library packages
- packages/eslint-config
- packages/typescript-config
- packages/ui

>>> Success! Your new Turborepo is ready.

To get started:
- Enable Remote Caching (recommended): pnpm dlx turbo login
- Learn more: https://turbo.build/repo/remote-cache

- Run commands with Turborepo:
- pnpm run build: Build all apps and packages
- pnpm run dev: Develop all apps and packages
- pnpm run lint: Lint all apps and packages
- Run a command twice to hit cache

✔️ The workspace anatomy is as follows:

Turborepo workspace anatomy (Image by the author)

You can see that Turborepo is added on top of a workspace manager, in our case PNPM (pnpm-workspace.yaml):

packages:
- "apps/*"
- "packages/*"

Turborepo's role is limited to orchestrating the monorepo's tasks, which are configured in turbo.json. This specialized scope leaves more flexibility in the choice of the other tools:

{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "inputs": ["$TURBO_DEFAULT$", ".env*"],
      "outputs": [".next/**", "!.next/cache/**"]
    },
    "lint": {
      "dependsOn": ["^lint"]
    },
    "dev": {
      "cache": false,
      "persistent": true
    }
  }
}

We can also see that TypeScript (tsconfig.json) and ESLint (eslint-config) configurations are generated by default.

The eslint-config-turbo package helps you find environment variables that are used in your code that are not a part of Turborepo's hashing. Environment variables used in your source code that are not accounted for in turbo.json will be highlighted in your editor and errors will show as ESLint output. — https://turbo.build/repo/docs/reference/eslint-config-turbo

✔️ To modify and run a task with Turborepo:

// main package.json
"scripts": {
  "build": "turbo build",
  "dev": "turbo dev",
  "lint": "turbo lint",
  "format": "prettier --write \"**/*.{ts,tsx,md}\""
},

Then: pnpm dev, pnpm build, pnpm lint, pnpm format, etc.

✔️ The dependency graph of the workspace can be visualized using:

turbo run build --graph=my-graph.svg 

Which creates this SVG:

Turborepo dependency graph (Image by the author)

Although this graph is not as advanced or interactive as the one in Nx, I still find Turborepo easy to use.

Let’s proceed with configuring remote caching. 📡

Remote Caching in Action

✔️ In order to connect our local Turborepo to the Remote Cache, we must authenticate the Turborepo CLI with our Vercel account:

turbo login
turbo login --sso-team=team-name

This is similar to NPM when we want to log in to publish a package. 😉

✔️ Once we have been authenticated, we must execute the link command:

turbo link

The cache artifacts will now be stored both locally and in the remote cache.

Then, run the same build again. If things are working properly, turbo should not execute tasks locally. Instead, it will download the logs and artifacts from your Remote Cache and replay them back to you. — https://turbo.build/repo/docs/core-concepts/remote-caching

✔️ It’s also possible to use another remote cache hosting provider instead of Vercel:

turbo login --api="https://my-server.example.com/api"
turbo link --api="https://my-server.example.com"
turbo run build --api="https://my-server.example.com" --token="xxxxxxxxxxxxxxxxx"

You can find the OpenAPI specification for the API here.

That’s it: Turborepo is simple to work with! It’s time for us to express our opinion on everything we’ve seen and studied about Turborepo. 📌

Technical verdict

Our insights

🔳 Turborepo excels at:

  • Optimizing build times for JavaScript and TypeScript monorepos.
  • Simplifying monorepo setup and configuration.
  • Empowering teams with efficient shared caching.
  • Integrating smoothly with Vercel’s ecosystem.

🔳 Turborepo should be considered if:

  • Build speed and developer productivity are the top priorities.
  • A more streamlined, focused toolset is preferred.
  • The project primarily uses JavaScript or TypeScript.

🔳 Turborepo might not be the optimal solution for projects that:

  • Demand a wide range of built-in features beyond the build process itself.
  • Require strong enforcement of boundaries between projects.
  • Encompass multiple programming languages within the monorepo.

For a well-rounded perspective on Turborepo, let’s explore its real-world usage in various projects and gather insights from community members who have put it to the test. 🔆

Real-World Insights: Turborepo in the Wild

Turborepo’s popularity isn’t just hype; it’s proving its mettle in the real world, rapidly gaining traction across diverse development landscapes.

🔳 Many prominent companies have integrated Turborepo into their production workflows, including:

  • Netflix.
  • Datadog.
  • Astro.
  • Egghead.
  • Dito.

The complete list can be found here.

🔳 Turborepo’s speed, ease of use and streamlined setup are frequently mentioned:

https://www.reddit.com/r/reactjs/comments/yhzf3f/comment/iugl6j3/
https://www.reddit.com/r/reactjs/comments/yhzf3f/comment/iuh76tv/
https://www.reddit.com/r/reactjs/comments/yhzf3f/comment/iuhrswn/

The feedback from the community reinforces our findings: Turborepo is a game-changer for development teams seeking faster, more efficient, and reliable builds. Though it might not address every project’s needs, its emphasis on speed and simplicity positions it as a strong contender in the monorepo landscape. ♨️

Our deep dive into Turborepo concludes here, but the monorepo adventure is just beginning! Get ready for an exciting exploration of pnpm workspaces, where we’ll dissect its strengths, weaknesses, and perfect use cases compared to Nx and Turborepo. Stay tuned for more monorepo magic! ✨

Conclusion

Turborepo is a powerful and efficient build system that is specifically tailored for JavaScript and TypeScript monorepos. Its intelligent task management, innovative caching mechanisms, and focus on raw speed have made it a game-changer for developers seeking streamlined workflows and rapid feedback loops.

Real-world adoption by industry leaders like Netflix, Datadog, and numerous open-source projects further solidifies Turborepo’s effectiveness and scalability. The vibrant community’s positive feedback underscores its intuitive setup, significant performance improvements, and overall positive developer experience.

Turborepo may not be ideal for every project, but its simplified approach and focus on build speed make it an attractive option for many, particularly those that prioritize performance and employ JavaScript or TypeScript.

The monorepo journey doesn’t end here! Join us next time as we delve into the realm of pnpm workspaces. Is Turborepo really needed if we already use pnpm?

Until then, stay curious, keep building, and embrace the ever-evolving world of modern development! ❤️

Thank you for reading my article.

Want to Connect? 
You can find me at GitHub: https://github.com/helabenkhalfallah

Hi! I'm Hela Ben Khalfallah, Senior Frontend Expert specializing in JavaScript, React, Node.js, software architecture, and FrontendOps.