How To Secure Your Nodejs on Premise Application?

Most of the users won’t try to retro-engineer your application, but what if someone will? What can we do to be confident on our deliveries?

Baptiste Olivier

Published in

Kili Technology

5 min readSep 21, 2021

Why would we need to deploy on premise?

Nowadays, data privacy can prevent a deal from being closed. At Kili Technology, we let the user choose between 3 options:

use Kili cloud with data stored on Kili server
use Kili cloud with data stored on client servers
use a specific Kili instance deployed on premise in user infrastructure, with data stored on client servers

What does on premise deployment change in the way we build our application?

Deploying on user infrastructure means that we need to adapt to it. Some users may want to run Kili on a specific port whereas some other users want to authenticate using their SSO. We need some environment variables to specify this behavior to the backend.

On top of that, the user infrastructure may be different. We may deploy on a dedicated computer, or on a dedicated cluster. Our deployment may be on a private network without any internet connection, which means that everything must be present on the user side.

Based on those constraints, we chose to build docker images for each service, and use either Kubernetes or docker-compose to handle environment variable setup.

Example project

For the following parts, we will use a basic project, containing an express server with a secret value hardcoded, and another one passed with environment variables.

This project listens for requests at / and sends a HelloWorld if it passes validation, or a 403 else.

Basic build flow

Let’s take a simplified example of a CI job that runs a build process of a Nodejs service:

Npm run build is doing a basic typescript to javascript transpilation work. The entry point script adds some configurations and launch the application.
We already added environment variables to our code to be able to deploy on our cloud to multiple targets (production, QA, staging).

This build is enough for a cloud installation, but what happens if we put this into the hand of a user? What information can be extracted from this build?

What can be retro-engineered

When building a docker image, each instruction creates a new layer. (More information here).
When we push a docker image, only the latest layer is pushed, so intermediate layers are not directly available once this image is pulled from a different computer.

However, some instructions may allow access to files present only on some intermediate layers.
We need to change that to erase a part of our intermediate layers, using multi-stage build:

FROM node:14-slim as base# Copy source code 
WORKDIR /app
ADD . /app/
RUN npm install
RUN npm run buildFROM node:14-slimWORKDIR /app
COPY --from=base /app/dist /app/dist
COPY --from=base /app/node_modules /app/node_modules
COPY --from=base /app/package.json /app/  
ENTRYPOINT /app/entrypoint.sh start

By doing that, we ensure that only specified folders copied from the base stage are present in the final image delivered.

Clients will only have access to transpiled code and environment variables of delivered images.

Transpiled code

If we pay attention more closely to the code generated, we can find a lot of information.
The file structure is preserved, the code is understandable, and therefore a malicious user may try to modify it manually.

The /models/user.js file is almost the same :

Object.defineProperty(exports, "__esModule", { value: true });
exports.secret = exports.validUser = void 0;
exports.validUser = process.env.KILI__VALID_USER;
exports.secret = 'not-so-secret';
//# sourceMappingURL=user.js.map

We need to uglify everything more. We use uglify-folder in the npm run build2 configured : tsc --build tsconfig.json && uglify-folder dist js -eo dist.min -x .js && rm-r dist && mv dist-min dist

Transpiled code looks harder to modify, but it can still be modified and understandable. If we look at the code produced for /models/user.js, we got something similar:

Object.defineProperty(exports,"__esModule",{value:!0}),exports.secret=exports.validUser=void 0,exports.validUser=process.env.KILI__VALID_USER,exports.secret="not-so-secret";

In the end, producing javascript always exposes the built code to be modified, as nodejs will read it and interpret it.
If we want to be sure that the code can’t be modified and that secrets are safe, we need to compile our javascript into binary for Nodejs.

Pkg (https://github.com/vercel/pkg) is a Vercel tool that does exactly what we want. With a minimal configuration in package.json, we can transform our code into a binary application.

If we try the npm run build3 command, it will produce binaries files in the /bin folder based on the chosen targets.
Inspecting produced binary file reveal some string, corresponding to files declared as assets (the node_modules folder), but we can’t find any record about our secret values, or our application code.

The last build command adds an option to compress binaries using the Brotli compression algorithm. It significantly cut the built size by 2 (from ~100MB to ~50MB) but increased a lot build time.

The built images are big because we included all node_modules (~90MB). To do that better, we could do a multi-stage built, installing all deps and transpile into js first, then take the transpiled code without any dev-dependencies into a binary.

The last thing we need to verify is the environment variable we use.

Environment variable

As expressed before, we need environment variables to configure every deployment made on the user side. Environment variables could be defined while we build the docker image, or when we launch them in a docker-compose file.

This is a partial example of a docker-compose file we used to launch Kili :

version: "3"
services: 
  label-backend-v2:
    image: label-backend-v2
    restart: always
    entrypoint: ["/app/entrypoint.sh", "start"]
    environment:
      - AUTHENTICATION__OPENID_CONNECT_CLIENT_ID=
      - ENDPOINT__API_V2=https://cloud.kili-technology.com/api/label/v2
      - KILI__API_KEY=ee74454a-1bb2-9912-6145c-0caa1cc13211
      - KILI__LICENSE_PASSPHRASE=Ddof7sdkeERtt299eQsdfdffgdfqeza
      - NATS__ENDPOINT=nats://label-nats:4222
      - PORT__LABEL_BACKEND_V2=4000
    ports:
      - '5443:5443'

All environment variables are accessible for the user, and they may read them and modify them as they want. The first warning on this Dockerfile is about secret data. Secret data :

must not be stored plainly in a Dockerfile (https://docs.docker.com/compose/compose-file/compose-file-v3/#secrets)
must be uniquely generated for each customer
are accessible through a deployed container, meaning that the value can be read inside the running container

In the provided example, some of the exposed variables may be dangerous :

KILI__API_KEY, can be used by a user to do requests through the API as it was the service account
KILI__LICENSE_PASSPHRASE, can be used to cipher and decipher licenses in the database, and therefore allow users to modify their license

To store those values, we need to add them in the build process, obfuscating them in the binary as we do for the secret value we exposed.

Conclusion

This post explores some of the methods that can be used to protect a code of a service. Building it into a binary application doesn’t make our services 100% secured, as someone may interconnect malicious scripts between our services, proxying our calls to inspect or modify data on the fly.

In another blog post, we’ll see what can be done to cover those cases.

You are interested in this topic? We are a well-funded technology startup working on cutting-edge problems ranging from AI to software engineering. Come work with us by contacting angel@kili-technology.com!