Yury Michurin
Published in Wix Engineering
4 min read · Jul 10, 2022

Fake CPU count for NodeJS in a container with this one trick!

Ever run a node (or really any) application inside a container on a large host with many CPUs and wanted that application to “think” it has far fewer CPUs?

Wix Beer Sheva Guild Day

That’s exactly the problem we discussed at our recent weekly Guild Day at Wix Beer Sheva.

For example, “faking” the CPU count can be useful if you’re running a CI-style workload on a large host, with each container running the build and tests for a different project, and all of them using Jest.

Maybe that’s your project? :) (Image by Juanjo Jaramillo)

Jest requests CPU information from node’s os module, as can be seen here. To simplify, it’s just:

console.log(require('os').cpus().length);

On my MacBook, that outputs 4.

If we save this file as cpu-count.js, we can use this node trick to change the reported CPU count without changing its code:

node -r './hack.js' cpu-count.js

In hack.js, we can simply override the function:

require('os').cpus = () => [1,2];

However, that’s not that great. The trick works because of the require cache, and that cache exists only inside this process. If that node process spawns another node process, the child gets its own require cache, and we’re back to the original CPU count.
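To see that limitation in action, here is a minimal sketch. It applies the same override as hack.js inside one process and then asks a freshly spawned node process for its own count. The variable names are only illustrative.

```javascript
// Sketch: the override lives in this process only; a child process starts fresh.
const os = require('os');
const { execFileSync } = require('child_process');

// The real count, read before we tamper with anything.
const realCount = os.cpus().length;

// Same idea as hack.js: replace the function on the cached module object.
os.cpus = () => [1, 2];
const parentCount = os.cpus().length;

// Spawn a new node process and let it print its own CPU count.
const childCount = Number(
  execFileSync(process.execPath, ['-p', "require('os').cpus().length"], {
    encoding: 'utf8',
  }).trim()
);

console.log({ realCount, parentCount, childCount });
```

The parent reports 2, while the child reports whatever the machine really has.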

We need to look deeper

Into the codes! (Image by Kevin Horvat)

So, how does node actually know how many CPUs we have on our system?
Let’s find out!

Strace to the rescue! From its man page:

strace is a useful diagnostic, instructional, and debugging tool.
System administrators, diagnosticians and trouble-shooters will
find it invaluable for solving problems with programs for which
the source is not readily available since they do not need to be
recompiled in order to trace them.

So, to see how node figures out how many CPUs the system has, let’s run:

strace node cpu-count.js

In the output, we can spot two interesting calls:

openat(AT_FDCWD, "/proc/stat", O_RDONLY|O_CLOEXEC) = 17
...
openat(AT_FDCWD, "/proc/cpuinfo", O_RDONLY|O_CLOEXEC) = 18

First of all, what is this /proc thingy? From the man page:

The proc filesystem is a pseudo-filesystem which provides an
interface to kernel data structures. It is commonly mounted at
/proc. Typically, it is mounted automatically by the system…

Let’s search libuv’s source code for those two files, to confirm they are really used for the CPU information. Indeed, /proc/stat is read here and used for the CPU count, and /proc/cpuinfo is read here for additional information about the CPUs.

So, if we fake those two files, we can make node, or any other application that looks at /proc for CPU information, think we have whatever CPUs we want.
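To make the idea concrete, here is a rough sketch of counting CPUs from the contents of /proc/stat. It is not libuv’s actual code, only an illustration of the file layout: one aggregate “cpu” line followed by one “cpuN” line per CPU. The function name and the sample data are made up for this example.

```javascript
// Hypothetical helper: count the per-CPU lines ("cpu0", "cpu1", ...) in /proc/stat text.
function countCpusFromProcStat(statText) {
  return statText
    .split('\n')
    // The aggregate line starts with "cpu " (no digit), so it is not counted.
    .filter((line) => /^cpu\d+\s/.test(line)).length;
}

// Made-up sample in the shape of /proc/stat on a two-CPU machine.
const sample = [
  'cpu  100 0 50 1000 0 0 0 0 0 0',
  'cpu0 50 0 25 500 0 0 0 0 0 0',
  'cpu1 50 0 25 500 0 0 0 0 0 0',
  'intr 12345',
  'ctxt 67890',
].join('\n');

console.log(countCpusFromProcStat(sample)); // 2
```

Fewer “cpuN” lines in the file means fewer CPUs reported, which is exactly what we are going to exploit.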

It is written

Files go here (Image by Fabien Barral)

The simplest method is to bind-mount a file on top of the one in /proc. Here’s an example:

root@934cb48e6226:/# cp /proc/stat fake-stat # copy the file
root@934cb48e6226:/# vim fake-stat # edit it to remove CPU lines
root@934cb48e6226:/# mount --bind ./fake-stat /proc/stat # mount on top
root@934cb48e6226:/# node cpu-count.js
1
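If you’d rather not edit the file by hand in vim, the same edit can be scripted. Below is a small sketch with a hypothetical helper, limitCpuLines, that keeps the aggregate “cpu” line and only the first few “cpuN” lines. It assumes the usual /proc/stat layout and does not recompute the aggregate numbers.

```javascript
// Hypothetical helper: drop every "cpuN" line whose N is >= keep, leave all other lines alone.
function limitCpuLines(statText, keep) {
  return statText
    .split('\n')
    .filter((line) => {
      const match = /^cpu(\d+)\s/.exec(line);
      // Lines that are not per-CPU lines (aggregate "cpu", "intr", ...) always stay.
      return match === null || Number(match[1]) < keep;
    })
    .join('\n');
}

// Made-up sample in the shape of /proc/stat on a four-CPU machine.
const original = [
  'cpu  400 0 200 4000 0 0 0 0 0 0',
  'cpu0 100 0 50 1000 0 0 0 0 0 0',
  'cpu1 100 0 50 1000 0 0 0 0 0 0',
  'cpu2 100 0 50 1000 0 0 0 0 0 0',
  'cpu3 100 0 50 1000 0 0 0 0 0 0',
  'intr 12345',
].join('\n');

const faked = limitCpuLines(original, 1);
console.log(faked);
```

In the container, you would read /proc/stat with fs.readFileSync, pass the text through the helper, and write the result to fake-stat before mounting it.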

However, not so fast… For the above to work, the container must be launched with `--cap-add=sys_admin`, e.g.:

docker run --cap-add=sys_admin -it ubuntu

Which is a bad idea™ in a non-test environment.

With this one simple trick…

So we can’t really change those two files on the /proc filesystem, but maybe we can change where the node process looks instead? 🤔

LD_PRELOAD to the rescue! From the man page:

A list of additional, user-specified, ELF shared objects
to be loaded before all others. This feature can be used
to selectively override functions in other shared objects.

In simple words, it’s an environment variable that points to a shared object (a .so file on Linux, roughly a “dll” on Windows) whose functions can override those in other libraries, such as libc.

What we can do is create a C library that “catches” the open function call. If the target is /proc/cpuinfo or /proc/stat, we’ll open the “fake” file instead.

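#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Pointer to the next open64() in the lookup chain (normally libc's). */
typedef int (*orig_open_f_type)(const char *pathname, int flags, ...);
static orig_open_f_type orig_open64 = NULL;

int open64(const char *pathname, int flags, ...) {
    /* open64() takes a third "mode" argument when the file may be created. */
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    /* Resolve lazily too, in case we are called before the constructor ran. */
    if (orig_open64 == NULL) {
        orig_open64 = (orig_open_f_type)dlsym(RTLD_NEXT, "open64");
    }
    /* Exact match: strncmp() limited to strlen(pathname) would also match "/proc" or "". */
    if (0 == strcmp(pathname, "/proc/cpuinfo")) {
        return orig_open64("/fake-cpuinfo", flags, mode);
    }
    if (0 == strcmp(pathname, "/proc/stat")) {
        return orig_open64("/fake-stat", flags, mode);
    }
    return orig_open64(pathname, flags, mode);
}

__attribute__((constructor)) void init(void) {
    /* Log to stderr so the program's own stdout stays untouched. */
    fprintf(stderr, "Intercepting open() calls\n");
    if (orig_open64 == NULL) {
        orig_open64 = (orig_open_f_type)dlsym(RTLD_NEXT, "open64");
    }
}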

To compile that, we’ll use:

gcc -shared -fPIC intercept.c -o intercept.so -ldl

To run:

LD_PRELOAD=./intercept.so node cpu-count.js
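Since LD_PRELOAD is an environment variable, child processes inherit it by default, unless the parent passes them a custom environment. The sketch below shows that inheritance from node using a stand-in variable, DEMO_PRELOAD, which is made up for this example so the snippet runs without intercept.so being present.

```javascript
// Sketch: environment variables set in the parent are visible to spawned children by default.
const { execFileSync } = require('child_process');

// DEMO_PRELOAD is a made-up stand-in for LD_PRELOAD, so nothing is actually preloaded here.
process.env.DEMO_PRELOAD = './intercept.so';

// The child is started without an explicit "env" option, so it gets a copy of process.env.
const seenByChild = execFileSync(
  process.execPath,
  ['-p', 'process.env.DEMO_PRELOAD'],
  { encoding: 'utf8' }
).trim();

console.log(seenByChild); // ./intercept.so
```

This is the property the -r trick was missing: every node process spawned further down the tree gets the same variable.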

The same trick can also be used to fool other binaries about their environment. For example, we can use the code below to turn back time:

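/* Minimal local definitions so the snippet compiles without system headers;
   on 64-bit Linux both timeval fields are signed longs. */
struct timeval {
    long tv_sec;  /* seconds */
    long tv_usec; /* microseconds */
};
struct timezone {
    int tz_minuteswest; /* minutes west of Greenwich */
    int tz_dsttime;     /* type of DST correction */
};
/* Always report the Unix epoch (1970-01-01 00:00:00 UTC). */
int gettimeofday(struct timeval *tv, struct timezone *tz) {
    (void)tz; /* unused */
    if (tv) {
        tv->tv_sec = 0;
        tv->tv_usec = 0;
    }
    return 0;
}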

Or lie to binaries about the effective user (try it with the whoami command):

#include <sys/types.h>
/* Pretend the effective user is root (uid 0). */
uid_t geteuid(void) {
    return 0;
}

Epilogue

We’ve seen a few solutions to the CPU count problem in a container, in the context of a node application.
Every solution has its tradeoffs and its uses. As always, I advise choosing the one best suited to your unique use case.
