Monorepo Insights: Nx, Turborepo, and PNPM (3/4)
Exploring the strengths and weaknesses of today’s top monorepo solutions
Explore the complete series through the links below:
- Monorepo Insights: Nx, Turborepo, and PNPM (1/4)
- Monorepo Insights: Nx, Turborepo, and PNPM (2/4)
- Monorepo Insights: Nx, Turborepo, and PNPM (3/4) (we are here)
- Monorepo Insights: Nx, Turborepo, and PNPM (4/4)
Introduction
This article is part of a series comparing the features, performance, and suitability of Nx, PNPM, and Turborepo for our projects.
Having laid the groundwork in monorepo management and build systems, and thoroughly investigated Nx, we now turn our attention to uncovering the capabilities of Turborepo. ✨
As a reminder, our goal is to select a champion who will streamline our development workflow and improve our codebase management.
May the best monorepo win! Here’s the challenge we’ll conquer together:
· Turborepo under the microscope
∘ Packages and Task Graph
∘ Local caching
∘ Remote caching
∘ Turborepo vs. Nx
· Hands-on Turborepo
∘ Turborepo Locally in Action
∘ Remote Caching in Action
· Technical verdict
∘ Our insights
∘ Real-World Insights: Turborepo in the Wild
· Conclusion
Curious about what’s next? Come along and let’s discover it together! 🚀
Turborepo under the microscope
Packages and Task Graph
🔳 Turborepo uses both a Package Graph and a Task Graph to efficiently manage and execute tasks within a monorepo.
🔳 The Package Graph automatically maps out the relationships between internal packages by analyzing their package.json dependencies.
🔳 When referencing internal packages in package.json, you use the workspace-specific syntax of your package manager, which is similar to how you reference external dependencies from NPM. Here’s an example:
// pnpm: ./apps/web/package.json
{
  "dependencies": {
    "@repo/ui": "workspace:*"
  }
}
🔳 The Package Graph is the foundation of the Task Graph, defining how individual tasks depend on each other.
🔳 The Task Graph is a directed acyclic graph (DAG), where nodes represent tasks and edges represent dependencies. This ensures correct execution order and enables parallel task execution.
🔳 Task dependencies are explicitly defined in turbo.json. Example:
// turbo.json
{
  "tasks": {
    "build": {
      "dependsOn": ["^build"] // Run the build task of the workspace's dependencies first
    }
  }
}
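To make the `^` prefix concrete, here is a minimal TypeScript sketch of how a `dependsOn` entry could be expanded into Task Graph edges using the Package Graph. The `PackageGraph` type and the `expandDependsOn` function are hypothetical names chosen for illustration; this is not Turborepo's internal implementation.

```typescript
// Hypothetical model: package name -> internal packages it depends on.
type PackageGraph = Record<string, string[]>;

// Expand one task's `dependsOn` list into concrete "package#task" edges.
// "^task" means: the same-named task in each direct dependency of `pkg`.
// "task" (no caret) means: another task in the same package.
function expandDependsOn(
  graph: PackageGraph,
  pkg: string,
  dependsOn: string[],
): string[] {
  const edges: string[] = [];
  for (const entry of dependsOn) {
    if (entry.charAt(0) === "^") {
      const task = entry.slice(1);
      for (const dep of graph[pkg] || []) {
        edges.push(dep + "#" + task);
      }
    } else {
      edges.push(pkg + "#" + entry);
    }
  }
  return edges;
}

// Assumed workspace layout, mirroring the create-turbo starter.
const packages: PackageGraph = {
  web: ["@repo/ui"],
  docs: ["@repo/ui"],
  "@repo/ui": [],
};

// web#build must wait for a single upstream task: @repo/ui#build
console.log(expandDependsOn(packages, "web", ["^build"]));
```

Because `@repo/ui` declares the same `^build` rule, its own dependencies are pulled in the same way, which is how the rule reaches the whole dependency chain.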
🔳 For large projects, Turborepo calculates the project graph in the background using a daemon process, significantly reducing startup overhead.
🔳 Tasks are ordered for execution using topological sorting, ensuring dependencies are met.
Topological dependencies specify that your package’s dependencies should execute their tasks before your package executes its own task. — https://turbo.build/repo/docs/messages/missing-root-task-in-turbo-json#why-this-error-occurred
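As an illustration of the ordering described above, the following TypeScript sketch runs Kahn's algorithm on a small task graph and groups the result into "waves" of tasks that could run in parallel. The `TaskGraph` type, the `executionWaves` function, and the sample tasks are assumptions made for this example, and the sketch assumes every task appears as a key of the graph. It shows the general technique, not Turborepo's scheduler.

```typescript
// Hypothetical task graph: task id -> task ids it depends on.
type TaskGraph = Record<string, string[]>;

// Kahn's algorithm, grouped into waves: every task in a wave has all of its
// dependencies in earlier waves, so tasks within one wave can run in parallel.
function executionWaves(graph: TaskGraph): string[][] {
  const remaining: Record<string, number> = {};
  const dependents: Record<string, string[]> = {};
  for (const task of Object.keys(graph)) {
    remaining[task] = graph[task].length;
    dependents[task] = dependents[task] || [];
    for (const dep of graph[task]) {
      dependents[dep] = dependents[dep] || [];
      dependents[dep].push(task);
    }
  }

  const waves: string[][] = [];
  let ready = Object.keys(graph).filter((task) => remaining[task] === 0);
  let done = 0;
  while (ready.length > 0) {
    waves.push(ready);
    done += ready.length;
    const next: string[] = [];
    for (const task of ready) {
      for (const dependent of dependents[task]) {
        remaining[dependent] -= 1;
        if (remaining[dependent] === 0) next.push(dependent);
      }
    }
    ready = next;
  }

  // Tasks left over can never become ready: they sit on a cycle.
  if (done !== Object.keys(graph).length) {
    throw new Error("cycle detected: no valid execution order");
  }
  return waves;
}

const tasks: TaskGraph = {
  "@repo/ui#build": [],
  "web#build": ["@repo/ui#build"],
  "docs#build": ["@repo/ui#build"],
};

// First wave: @repo/ui#build. Second wave: web#build and docs#build together.
console.log(executionWaves(tasks));
```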
🔳 Turborepo employs Tarjan’s algorithm to detect circular dependencies in the Task Graph, ensuring a smooth and error-free build process.
// https://github.com/vercel/turbo/blob/main/crates/turborepo-graph-utils/src/lib.rs#L29
// https://github.com/vercel/turbo/pull/5566/files
pub fn validate_graph<G: Display>(graph: &Graph<G, ()>) -> Result<(), Error> {
    // This is equivalent to AcyclicGraph.Cycles from Go's dag library
    let cycles_lines = petgraph::algo::tarjan_scc(&graph)
        .into_iter()
        .filter(|cycle| cycle.len() > 1)
        .map(|cycle| {
            let workspaces = cycle.into_iter().map(|id| graph.node_weight(id).unwrap());
            format!("\t{}", workspaces.format(", "))
        })
        .join("\n");
    // If any cycles were detected, return an error with the cycle details.
    if !cycles_lines.is_empty() {
        return Err(Error::CyclicDependencies(cycles_lines));
    }
    for edge in graph.edge_references() {
        if edge.source() == edge.target() {
            let node = graph
                .node_weight(edge.source())
                .expect("edge pointed to missing node");
            return Err(Error::SelfDependency(node.to_string()));
        }
    }
    Ok(())
}
🔳 Turborepo uses a depth-first search (DFS) over the Package Graph to identify which packages, and therefore which tasks, are affected by a change, avoiding unnecessary rebuilds and speeding up development cycles:
// https://github.com/vercel/turbo/blob/main/crates/turborepo-repository/src/package_graph/mod.rs#L371
match direction {
    petgraph::Direction::Outgoing => depth_first_search(&self.graph, indices, visitor),
    petgraph::Direction::Incoming => {
        depth_first_search(Reversed(&self.graph), indices, visitor)
    }
};
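The same idea can be sketched in a few lines of TypeScript: reverse the edges of the package graph, then run a depth-first search from the changed packages to collect everything that depends on them. The `PackageGraph` type, the function names, and the sample workspace are hypothetical; the sketch only mirrors the traversal idea shown in the Rust snippet.

```typescript
// Hypothetical package graph: package -> internal packages it depends on.
type PackageGraph = Record<string, string[]>;

// Invert the edges so we can walk from a package to its dependents.
function reverseGraph(graph: PackageGraph): PackageGraph {
  const reversed: PackageGraph = {};
  for (const pkg of Object.keys(graph)) {
    reversed[pkg] = reversed[pkg] || [];
    for (const dep of graph[pkg]) {
      reversed[dep] = reversed[dep] || [];
      reversed[dep].push(pkg);
    }
  }
  return reversed;
}

// Iterative depth-first search from the changed packages over the reversed
// graph: every package reached is affected by the change.
function affectedPackages(graph: PackageGraph, changed: string[]): string[] {
  const reversed = reverseGraph(graph);
  const visited: Record<string, boolean> = {};
  const stack = changed.slice();
  while (stack.length > 0) {
    const pkg = stack.pop() as string;
    if (visited[pkg]) continue;
    visited[pkg] = true;
    for (const dependent of reversed[pkg] || []) {
      if (!visited[dependent]) stack.push(dependent);
    }
  }
  return Object.keys(visited).sort();
}

const workspace: PackageGraph = {
  web: ["@repo/ui"],
  docs: ["@repo/ui"],
  "@repo/ui": ["@repo/typescript-config"],
  "@repo/typescript-config": [],
};

// A change in @repo/ui affects @repo/ui, docs, and web,
// but not @repo/typescript-config.
console.log(affectedPackages(workspace, ["@repo/ui"]));
```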
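🔳 In watch mode, Turborepo uses the Floyd-Warshall algorithm to compute the distances between tasks in the reversed Task Graph, so that only the tasks that depend on a changed package are re-run. As the source comment notes, this step is O(V^3) and therefore a potential bottleneck: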
// https://github.com/vercel/turbo/blob/main/crates/turborepo-lib/src/engine/mod.rs#L160
impl Engine<Built> {
    /// Creates an instance of `Engine` that only contains tasks that depend on
    /// tasks from a given package. This is useful for watch mode, where we
    /// need to re-run only a portion of the task graph.
    pub fn create_engine_for_subgraph(
        &self,
        changed_packages: &HashSet<PackageName>,
    ) -> Engine<Built> {
        let entrypoint_indices: Vec<_> = changed_packages
            .iter()
            .flat_map(|pkg| self.package_tasks.get(pkg))
            .flatten()
            .collect();
        // We reverse the graph because we want the *dependents* of entrypoint tasks
        let mut reversed_graph = self.task_graph.clone();
        reversed_graph.reverse();
        // This is `O(V^3)`, so in theory a bottleneck. Running dijkstra's
        // algorithm for each entrypoint task could potentially be faster.
        let node_distances = petgraph::algo::floyd_warshall::floyd_warshall(&reversed_graph, |_| 1)
            .expect("no negative cycles");
        let new_graph = self.task_graph.filter_map(
            |node_idx, node| {
                if let TaskNode::Task(task) = &self.task_graph[node_idx] {
                    // We only want to include tasks that are not persistent
                    let def = self
                        .task_definitions
                        .get(task)
                        .expect("task should have definition");
                    if def.persistent {
                        return None;
                    }
                }
                ...
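Since the snippet above is truncated, here is a small, self-contained TypeScript sketch of one plausible way such distances can be used: compute all-pairs path lengths with unit edge weights, then keep only the tasks reachable from the entrypoint tasks. The function name, the data shapes, and the filtering step are assumptions for illustration and are not taken from the elided Turborepo code. The sketch assumes every edge target is listed in `nodes`.

```typescript
type Distances = Record<string, Record<string, number>>;

// Floyd-Warshall with every edge weighing 1.
// `edges[a]` lists the nodes that `a` points to; unreachable pairs stay Infinity.
function floydWarshall(nodes: string[], edges: Record<string, string[]>): Distances {
  const dist: Distances = {};
  for (const a of nodes) {
    dist[a] = {};
    for (const b of nodes) dist[a][b] = a === b ? 0 : Infinity;
    for (const b of edges[a] || []) {
      if (a !== b) dist[a][b] = 1;
    }
  }
  // Allow each node k in turn to act as an intermediate stop.
  for (const k of nodes) {
    for (const i of nodes) {
      for (const j of nodes) {
        const viaK = dist[i][k] + dist[k][j];
        if (viaK < dist[i][j]) dist[i][j] = viaK;
      }
    }
  }
  return dist;
}

const taskNodes = ["@repo/ui#build", "web#build", "docs#build", "web#lint"];

// Reversed edges: from a task to the tasks that depend on it.
const dependentsOf: Record<string, string[]> = {
  "@repo/ui#build": ["web#build", "docs#build"],
};

const distances = floydWarshall(taskNodes, dependentsOf);
const entrypoints = ["@repo/ui#build"]; // tasks of the changed package

// Keep a task if at least one entrypoint can reach it.
const toRerun = taskNodes.filter((task) =>
  entrypoints.some((entry) => distances[entry][task] < Infinity),
);

// web#lint is not reachable from @repo/ui#build, so it is left out.
console.log(toRerun);
```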
💡 Pro Tip: To visualize the Task Graph, Turborepo offers the --graph <file type> option to gain insights into the monorepo’s build structure:
turbo run build --graph
turbo run build test lint --graph=my-graph.svg
Let’s move on to an in-depth look at the caching mechanisms. 🌟
Local caching
🔳 Turborepo hashes task inputs (including dependencies, environment variables, and source code) and stores the build outputs. This means if we haven’t changed anything, the task completes almost instantly!
🔳 Turborepo caches results locally in the .turbo/cache directory.
🔳 Turborepo performs caching on two levels:
- Global: Detects changes that affect the entire repository.
- Task: Identifies changes specific to each task (incorporates the global hash for context).
💡 Turborepo caches by default.
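The two hashing levels can be pictured with a simplified TypeScript sketch. Everything here is an assumption made for illustration: the `GlobalInputs` and `TaskInputs` shapes, the choice of which inputs go where, and the `toyHash` function, a small FNV-1a hash that stands in for a real content hash so the example stays dependency-free. It is not how Turborepo computes its hashes.

```typescript
// Toy 32-bit FNV-1a hash: a stand-in for a real content hash.
function toyHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, modulo 2^32.
    hash =
      (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical inputs for the two hashing levels.
interface GlobalInputs {
  lockfile: string; // lockfile contents
  globalEnv: Record<string, string>;
}
interface TaskInputs {
  taskId: string; // e.g. "web#build"
  files: Record<string, string>; // path -> file contents
  env: Record<string, string>;
}

// Serialize a record with sorted keys so key order never changes the hash.
function stable(record: Record<string, string>): string {
  return Object.keys(record)
    .sort()
    .map((key) => key + "=" + record[key])
    .join("\n");
}

function globalHash(inputs: GlobalInputs): string {
  return toyHash(inputs.lockfile + "\n" + stable(inputs.globalEnv));
}

// The task hash folds in the global hash, so a repository-wide change
// invalidates every task.
function taskHash(repoHash: string, inputs: TaskInputs): string {
  return toyHash(
    [repoHash, inputs.taskId, stable(inputs.files), stable(inputs.env)].join("\n"),
  );
}

const repoHash = globalHash({ lockfile: "lockfileVersion: 9", globalEnv: {} });
const original: TaskInputs = {
  taskId: "web#build",
  files: { "src/index.ts": "export {}" },
  env: {},
};
const edited: TaskInputs = {
  taskId: "web#build",
  files: { "src/index.ts": "export const x = 1" },
  env: {},
};

console.log(taskHash(repoHash, original) === taskHash(repoHash, original)); // same inputs: cache hit
console.log(taskHash(repoHash, original) === taskHash(repoHash, edited)); // edited file: cache miss
```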
Remote caching
🔳 A remote and shared cache server (e.g., Vercel’s Remote Caching or a custom solution) is designated as a central repository for storing build outputs.
🔳 A unique hash representing all the task’s inputs (code, dependencies, environment, etc.) is computed by Turborepo before running a task. This hash is then compared against those already stored in the remote cache.
To use these hashes for your cache, we store outputs of your task in a tar file, a special file type that compresses an entire folder into one file. We store that file in both the local filesystem and in Vercel Remote Cache, indexed by this task hash. — https://vercel.com/blog/finishing-turborepos-migration-from-go-to-rust#hashing-tasks-for-the-run-command
🔳 If a match is found, the cached artifact is downloaded in milliseconds, eliminating the need for task re-execution and saving valuable time.
🔳 If no match is found, the task is executed normally, and its output is then uploaded to the remote cache for future use by others.
When it comes time to run a task, we first produce this task hash, and check if it’s in either the filesystem or the Remote Cache. If it is, we restore the task’s outputs in milliseconds. Otherwise, we run the task and store the outputs in the cache for next time. — https://vercel.com/blog/finishing-turborepos-migration-from-go-to-rust#hashing-tasks-for-the-run-command
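The lookup flow described above can be summarized in a short TypeScript sketch. The `ArtifactCache` interface, the in-memory caches, and the `runWithCache` function are hypothetical stand-ins for the filesystem cache and the remote cache server. The real flow is asynchronous and works on tar archives, and copying a remote hit into the local cache is an assumption of this sketch.

```typescript
// Hypothetical cache interface; in-memory objects stand in for the
// filesystem cache and the remote cache server.
interface ArtifactCache {
  get(hash: string): string | undefined;
  put(hash: string, artifact: string): void;
}

function memoryCache(): ArtifactCache {
  const store: Record<string, string> = {};
  return {
    get: (hash: string) => store[hash],
    put: (hash: string, artifact: string) => {
      store[hash] = artifact;
    },
  };
}

interface Outcome {
  source: "local" | "remote" | "executed";
  artifact: string;
}

// Check the local cache, then the remote cache, and only then run the task.
function runWithCache(
  hash: string,
  local: ArtifactCache,
  remote: ArtifactCache,
  execute: () => string,
): Outcome {
  const fromLocal = local.get(hash);
  if (fromLocal !== undefined) return { source: "local", artifact: fromLocal };

  const fromRemote = remote.get(hash);
  if (fromRemote !== undefined) {
    local.put(hash, fromRemote); // assumption: keep a local copy of remote hits
    return { source: "remote", artifact: fromRemote };
  }

  // Cache miss everywhere: run the task and store its output in both caches.
  const artifact = execute();
  local.put(hash, artifact);
  remote.put(hash, artifact);
  return { source: "executed", artifact: artifact };
}

// Two machines with their own local cache share one remote cache.
const sharedRemote = memoryCache();
const ciMachine = memoryCache();
const laptop = memoryCache();
const build = () => "web-build-outputs.tar";

console.log(runWithCache("abc123", ciMachine, sharedRemote, build).source); // "executed"
console.log(runWithCache("abc123", laptop, sharedRemote, build).source); // "remote"
console.log(runWithCache("abc123", laptop, sharedRemote, build).source); // "local"
```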
Let’s examine the differences between Nx and Turborepo in detail to enhance our understanding. 🌟
Turborepo vs. Nx
🔳 Based on the insights of monorepo.tools and what we have seen so far, we can draw up a comparison of Nx and Turborepo:
🔳 Nx:
- Comprehensive monorepo management
- Strict tooling consistency
- Built-in code generation & plugins
- More project control
- Steeper learning curve
🔳 Turborepo:
- Lightning-fast builds
- Flexible tooling
- Simpler to learn
- Vercel integration
- Lacks built-in code generation & plugins
📌 Nx offers a full-featured monorepo experience with more control, while Turborepo prioritizes build speed and simplicity.
📌 Choosing the optimal tool for the monorepo is a nuanced decision, not a one-size-fits-all answer. The right fit between Nx and Turborepo requires a thorough assessment of project scale and complexity, team structure and collaboration, desired level of control, performance requirements, and additional tooling needs.
Now that we have delved into the fundamentals of Turborepo, it’s time to put our knowledge into practice. 🚀
Hands-on Turborepo
Turborepo Locally in Action
✔️ turbo can be installed either globally or locally:
pnpm install turbo --global
pnpm add turbo --save-dev --ignore-workspace-root-check
✔️ We can start from scratch or from an example:
% npx create-turbo@latest
Need to install the following packages:
create-turbo@2.0.6
Ok to proceed? (y)
>>> Creating a new Turborepo with:
Application packages
- apps/docs
- apps/web
Library packages
- packages/eslint-config
- packages/typescript-config
- packages/ui
>>> Success! Your new Turborepo is ready.
To get started:
- Enable Remote Caching (recommended): pnpm dlx turbo login
- Learn more: https://turbo.build/repo/remote-cache
- Run commands with Turborepo:
- pnpm run build: Build all apps and packages
- pnpm run dev: Develop all apps and packages
- pnpm run lint: Lint all apps and packages
- Run a command twice to hit cache
✔️ The workspace anatomy is as follows:
You can see that Turborepo is added on top of a workspace manager, in our case PNPM (pnpm-workspace.yaml):
packages:
- "apps/*"
- "packages/*"
Turborepo’s purpose is to manage the monorepo’s tasks. This narrower, more specialized role gives us more flexibility (turbo.json):
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "inputs": ["$TURBO_DEFAULT$", ".env*"],
      "outputs": [".next/**", "!.next/cache/**"]
    },
    "lint": {
      "dependsOn": ["^lint"]
    },
    "dev": {
      "cache": false,
      "persistent": true
    }
  }
}
We can also see that TypeScript (tsconfig.json) and ESLint (eslint-config) configurations are generated by default.
The eslint-config-turbo package helps you find environment variables that are used in your code that are not a part of Turborepo's hashing. Environment variables used in your source code that are not accounted for in turbo.json will be highlighted in your editor and errors will show as ESLint output. — https://turbo.build/repo/docs/reference/eslint-config-turbo
✔️ To modify and run a task with Turborepo:
// main package.json
"scripts": {
  "build": "turbo build",
  "dev": "turbo dev",
  "lint": "turbo lint",
  "format": "prettier --write \"**/*.{ts,tsx,md}\""
},
Then: pnpm dev, pnpm build, pnpm lint, pnpm format, etc.
✔️ The dependency graph of the workspace can be visualized using:
turbo run build --graph=my-graph.svg
This command creates the following SVG:
Although the visualization is not as advanced or interactive as Nx’s, I still find Turborepo easy to use.
Let’s proceed with configuring remote caching. 📡
Remote Caching in Action
✔️ To connect our local Turborepo to the Remote Cache, we must authenticate the Turborepo CLI with our Vercel account:
turbo login
turbo login --sso-team=team-name
This is similar to NPM when we want to log in to publish a package. 😉
✔️ Once we have been authenticated, we must execute the link command:
turbo link
The cache artifacts will now be stored both locally and in the remote cache.
Then, run the same build again. If things are working properly, turbo should not execute tasks locally. Instead, it will download the logs and artifacts from your Remote Cache and replay them back to you. — https://turbo.build/repo/docs/core-concepts/remote-caching
✔️ It’s also possible to use another remote cache hosting provider instead of Vercel:
turbo login --api="https://my-server.example.com/api"
turbo link --api="https://my-server.example.com"
turbo run build --api="https://my-server.example.com" --token="xxxxxxxxxxxxxxxxx"
You can find the OpenAPI specification for the API here.
That’s it: Turborepo is simple to handle! It’s time for us to express our opinion on everything we’ve seen and studied about Turborepo. 📌
Technical verdict
Our insights
🔳 Turborepo excels in:
- Optimizing build times for JavaScript and TypeScript monorepos.
- Simplifying monorepo setup and configuration.
- Empowering teams with efficient shared caching.
- Integrating smoothly with Vercel’s ecosystem.
🔳 Turborepo should be considered if:
- Build speed and developer productivity are the top priorities.
- A more streamlined, focused toolset is preferred.
- The project primarily uses JavaScript or TypeScript.
🔳 Turborepo might not be the optimal solution for projects that:
- Demand a wide range of built-in features beyond the build process itself.
- Require strong enforcement of boundaries between projects.
- Encompass multiple programming languages within the monorepo.
For a well-rounded perspective on Turborepo, let’s explore its real-world usage in various projects and gather insights from the community members who have put it to the test. 🔆
Real-World Insights: Turborepo in the Wild
Turborepo’s popularity isn’t just hype; it’s proving its mettle in the real world, rapidly gaining traction across diverse development landscapes.
🔳 Many prominent companies have integrated Turborepo into their production workflows, including:
- Netflix.
- Datadog.
- Astro.
- Egghead.
- Dito.
The complete list can be found here.
🔳 Turborepo’s speed, ease of use and streamlined setup are frequently mentioned:
🔳 Lastly, here are some comparative studies:
The feedback from the community reinforces our findings: Turborepo is a game-changer for development teams seeking faster, more efficient, and reliable builds. Though it might not address every project’s needs, its emphasis on speed and simplicity positions it as a strong contender in the monorepo landscape. ♨️
Our deep dive into Turborepo concludes here, but the monorepo adventure is just beginning! Get ready for an exciting exploration of pnpm workspaces, where we’ll dissect their strengths, weaknesses, and ideal use cases compared to Nx and Turborepo. Stay tuned for more monorepo magic! ✨
Conclusion
Turborepo is a powerful and efficient build system that is specifically tailored for JavaScript and TypeScript monorepos. Its intelligent task management, innovative caching mechanisms, and focus on raw speed have made it a game-changer for developers seeking streamlined workflows and rapid feedback loops.
Real-world adoption by industry leaders like Netflix, Datadog, and numerous open-source projects further solidifies Turborepo’s effectiveness and scalability. The vibrant community’s positive feedback underscores its intuitive setup, significant performance improvements, and overall positive developer experience.
Turborepo may not be ideal for every project, but its simplified approach and focus on build speed make it an attractive option for many, particularly those that prioritize performance and employ JavaScript or TypeScript.
The monorepo journey doesn’t end here! Join us next time as we delve into the realm of pnpm workspaces. Is Turborepo really needed if we already use pnpm?
Until then, stay curious, keep building, and embrace the ever-evolving world of modern development! ❤️
Thank you for reading my article.
Want to Connect?
You can find me at GitHub: https://github.com/helabenkhalfallah