Using Yarn to Package Large Node Projects with RPM

Nick Nathan
unified-engineering
Nov 4, 2019

The Unified Platform, or “UP” for short, is an analytics and reporting tool that gives Unified’s customers better insight into their social media spend and campaign performance. At a high level, the UP is a Node application which serves one of several React apps. Each React app contains a set of distinct features such as custom reporting, spend analysis, or audience segmentation. Each React app has its own repo, and all are loaded into the Node application as npm package dependencies. A more in-depth overview of the Unified Platform’s architecture can be found in a previous post. Every week, after each React app has been updated with the previous week’s new features and fixes, the team releases a new version of the UP via our Jenkins deployment pipeline.

The Problem

One afternoon, while the team was preparing to release the latest version of the UP, the phase in our Jenkins pipeline responsible for packaging the Node application started failing unexpectedly. Our release pipeline consists primarily of three phases:

  1. Build - in this step each React app is browserified, uglified and then copied into a shared location for distribution to a client along with all other static assets.
  2. Package - in this step the entire Node project, including the Node server and all the child React apps, is bundled into an rpm package and uploaded to a yum repo.
  3. Deploy - in this step a puppet command is issued which tells a group of AWS EC2 instances to fetch the latest version of the Node project’s rpm from the yum repo, install it and then start the new Node service.

The second step, the package job, was failing inexplicably with an error like:

Requires(rpmlib): rpmlib(PayloadFilesHavePrefix) <= 4.0-1 rpmlib(CompressedFileNames) <= 3.0.4-1
error: Unable to create immutable header region.

What … ?

After a bit of googling, the team found an interesting post by the BBC engineering team discussing a similar problem with packaging large Node projects using rpm. The post pointed us to an unresolved Red Hat bug ticket which suggested that rpm would fail to package an app if the Node project contained too many files. The BBC team solved the problem by splitting their Node app into multiple rpm packages and running multiple installations on their servers to stay under the file count limit. For Unified, however, this solution was not feasible. Not only would our DevOps team have to reimplement large pieces of our deployment pipeline to support a variable number of rpm packages, but it would also do nothing to address the root cause, which was the size of the application.

After a bit of testing, the team found that the file count limit for rpm seemed to sit somewhere around 150K files. Our Node application, however, was upwards of 165K files, most of which lived in the node_modules folder. In order to release any changes, the team had to get the file count under the 150K threshold. We were able to reduce the file count partially by removing a few unused dependencies and features, but this was not a long-term solution. Inevitably the application would grow in size and complexity and new dependencies would be added. Eventually there would be no fat left to cut and we would be unable to release. Another solution was needed.
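For readers who want to check their own project against a similar ceiling, a minimal sketch of the measurement might look like the following. The function names are hypothetical and not part of our pipeline, and the sketch counts regular files only, so it is an approximation of what rpm would actually package.

```shell
#!/bin/sh
# Count regular files under a directory (default: current directory).
# Note: file names containing newlines would be over-counted by wc -l.
count_files() {
    find "${1:-.}" -type f | wc -l | tr -d '[:space:]'
}

# Print the total file count for a project and, if present,
# the share that lives in its node_modules folder.
report_file_counts() {
    project_dir="${1:-.}"
    echo "total: $(count_files "$project_dir")"
    if [ -d "$project_dir/node_modules" ]; then
        echo "node_modules: $(count_files "$project_dir/node_modules")"
    fi
}
```

Calling `report_file_counts .` from the project root prints both numbers, which makes it easy to see how much of the total comes from dependencies.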

The Solution

After further examination, the problem appeared to be rooted in the fact that every React app being served was loaded into the Node project complete with its own set of node_modules. A high degree of package redundancy was causing the file count to explode without providing any real benefit. The file structure after running the build phase looked something like:

└── unified-platform
    ├── apps
    │   ├── analytics-app
    │   │   └── appEntry.jsx
    │   └── reporting-app
    │       └── appEntry.jsx
    ├── dist
    │   ├── analytics-app
    │   │   ├── app.js
    │   │   ├── lib.js
    │   │   └── main.css
    │   └── reporting-app
    │       ├── app.js
    │       ├── lib.js
    │       └── main.css
    ├── index.js
    ├── node_modules
    │   └── @unified
    │       ├── analytics-app
    │       │   ├── dist
    │       │   │   ├── app.css
    │       │   │   └── app.js
    │       │   ├── node_modules
    │       │   ├── package.json
    │       │   └── yarn.lock
    │       └── reporting-app
    │           ├── dist
    │           │   ├── app.css
    │           │   └── app.js
    │           ├── node_modules
    │           ├── package.json
    │           └── yarn.lock
    ├── package.json
    ├── scripts
    ├── server
    └── yarn.lock

The first thing we noticed was that the Unified-specific npm packages, each representing a React app, had their own sets of node_modules nested under the Node project’s node_modules. It turns out that none of the React-app-specific dependencies were being used at runtime, because the actual app logic was prebuilt and saved to the dist/ folder of the React app and then copied into the dist/ folder of the Node app. The logic of our build scripts for each app is something like:

# Build the wrapper components used to mount the React app in the Node app
browserify ./apps/reporting-app/appEntry.jsx > ./dist/reporting-app/app-unminified.js
# Minify the bundled wrapper
uglifyjs ./dist/reporting-app/app-unminified.js > ./dist/reporting-app/app.js
# Copy the prebuilt React app into the Node app's dist/ folder
cp ./node_modules/@unified/reporting-app/dist/lib.js ./dist/reporting-app/lib.js

This was important because it meant that we could actually remove all of the Unified-specific package dependencies from the node_modules folder after the app was built but before it was packaged as an rpm. The solution, therefore, was simply to add the following command between the build and package steps of our release pipeline:

yarn remove @unified/reporting-app @unified/analytics-app

And voila! The total file count dropped by almost 40% with no impact on our application. Simply by uninstalling our custom npm modules before packaging, we removed a large number of files that were never used at runtime. There was no need to update our Jenkins release pipeline to handle multiple rpm packages and no need to deprecate or remove any functionality. Despite being a frustrating problem to troubleshoot, it proved to be a great opportunity to take a deep dive into rpm, yarn and our own release pipeline.
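A pruning step like this could be wrapped with a couple of checks so that the pipeline fails early, with a readable message, rather than at the rpm stage. The sketch below is not what our Jenkins job runs: the function names and the MAX_FILES variable are hypothetical, and the 150000 default is only the approximate ceiling we observed.

```shell
#!/bin/sh
# Hypothetical limit, based on the roughly 150K-file ceiling we observed.
MAX_FILES="${MAX_FILES:-150000}"

# Print "<file count> <package dir>" for each package under a scope
# directory such as node_modules/@unified, to estimate the saving
# before the packages are removed.
scoped_package_counts() {
    scope_dir="$1"
    for pkg in "$scope_dir"/*/; do
        [ -d "$pkg" ] || continue
        count=$(find "$pkg" -type f | wc -l | tr -d '[:space:]')
        echo "$count ${pkg%/}"
    done
}

# Succeed only if the directory holds no more regular files than the limit.
within_file_budget() {
    dir="$1"
    limit="${2:-$MAX_FILES}"
    count=$(find "$dir" -type f | wc -l | tr -d '[:space:]')
    echo "$dir: $count files (limit $limit)"
    [ "$count" -le "$limit" ]
}
```

In a pipeline script, one might call `scoped_package_counts ./node_modules/@unified` before the `yarn remove` command and `within_file_budget . || exit 1` after it, so that the package phase only starts when the project is under the limit.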

If you’re a developer looking to join a team working on interesting problems at the intersection of technology and marketing, be sure to check out Unified at https://unified.com/about/careers-and-culture.
