Slamming Together Stacks and How it Reduced our AWS Costs by 75%

Shane Fast · Published in BACIC
7 min read · Aug 27, 2017
Squish!

Remember moving out of your parents' house and having to start paying for your own shit? We had a similar experience moving out of Cybera into AWS.

Our product was beginning to generate revenue, covering chunks of our burn, and after consuming far more than our initial allowance, we decided to do the honourable thing and offload our production nodes to AWS, freeing up resources for new projects and researchers.

It's one thing to ballpark an estimate for server costs and quite another to get the first bill. We quickly realized that our sloppy habits had to come to an end.

No more leaving nodes on for experimentation, storing an unreasonable number of images, or running larger servers just in case we got a surge of traffic.

In seeking ways to reduce costs, we were lucky to come across some sage advice which quickly made the largest dent. Simply put — combine all of our environments into a single stack and use host headers to differentiate which environment variables and database connections to use.
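The core idea can be sketched in a few lines of plain JavaScript. The host names, variable names, and settings below are made-up examples, not our real configuration:

```javascript
// Map each incoming Host header to that environment's settings.
// All hosts and values here are hypothetical examples.
const environments = {
  "app.example.com": { dbName: "example_prod_db", stripeKey: "STRIPE_KEY_PROD" },
  "staging.example.com": { dbName: "example_staging_db", stripeKey: "STRIPE_KEY_STAGING" },
};

// Look up an environment by the request's Host header.
function selectEnvironment(host) {
  const env = environments[host];
  if (!env) throw new Error("Unknown host: " + host);
  return env;
}

console.log(selectEnvironment("staging.example.com").dbName); // "example_staging_db"
```

One application serves every host; the only thing that changes per request is which entry in the map gets used.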

For context, we had stood up several servers as new environment deployments extremely quickly in order to take advantage of time-sensitive business opportunities.

We deemed it an acceptable source of technical debt to further our business prospects (how foolish, but alas). While mostly automated, some configurations still needed to be set up manually, along with updating deployment pipelines and branding customization.

The initial setup for each new environment took 2–3 hours, plus roughly two additional hours of weekly maintenance per environment (propagating infrastructure changes, manual tasks, swapping feature flags, etc.).

So what did it take to make the switch?

A metaphor I like to use when combining our environments is moving homeowners into an apartment complex.

Before the move, each homeowner had their own utilities and storage, and they could assume any visitors arriving were meant for them.

On the other hand, once moved into an apartment complex, each tenant shares the utilities and storage space, and there needs to be a mechanism for visitors to get to the correct tenant.

Sharing Utilities

These utilities included everything from node modules to third-party resources to server installations.

It really helps to have a good inventory of what utilities get affected when making these changes. To be fair, something will likely slip through the cracks, so ensure that a thorough QA process is followed as well.

For example, we use Auth0 and Stripe in each one of our deployments. We needed to consider whether or not to create new clients for each tenant or simply allow each tenant to use the same clients for these services.

By default, we decided to have each tenant use the same Auth0 client but use different Stripe clients unless requested otherwise by our customers.

Shared Storage

A concern that our new multi-tenant customers and apartment dwellers have in common is whether their things in storage are safe and won’t be intermingled with other people’s stuff.

We all know that most apartments have dedicated storage behind lock and key, available only to the tenant and the building manager/owner (I hope). It is contained in the same building, and tenants can rest assured that no one else will be sifting through their stuff.

Much in the same way, each of our customers can rest assured that their data is safe behind stringent security practices even though it's stored alongside other customers' data within the same infrastructure.

Our solution simply added a new database user with access to a new database for each new tenant (logical separation).
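As a rough sketch, each tenant's dedicated credentials get combined into their own connection string. The user, host, and database names below are placeholders for illustration, not our real values:

```javascript
// Build a MongoDB connection string for one tenant.
// Every tenant gets its own user and its own database (logical separation).
function buildConnectionString(tenant) {
  const auth = "authMechanism=SCRAM-SHA-1&authSource=admin";
  return "mongodb://" + tenant.user + ":" + tenant.pass +
         "@" + tenant.host + ":" + tenant.port + "/" + tenant.db + "?" + auth;
}

const example = buildConnectionString({
  user: "example_dot_com_user",
  pass: "s3cret",          // in practice this comes from environment variables
  host: "example.com",
  port: 27017,
  db: "example_dot_coms_db",
});
console.log(example);
```

Because the tenant's user only has rights on the tenant's database, a bug in one tenant's configuration can't read another tenant's data.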

On the application side of things, we needed to initialize connections to each database when the application starts up. This turned out to be the most challenging part of the build for us because we took the wrong approach at first.

Our first attempt was to create a new database connection whenever a request was made (DO NOT DO THIS!).

Gaius Baltar knows what I mean

With that approach you have to open (and manually close) a connection for every request, which significantly harms performance since you are initializing a network connection with each endpoint call.

The better way is to initialize each connection only once at the startup of the application and simply direct the endpoint to which connection to use based on the host header:

1. Create a new database connection file

var Env = require('../env.js'); //File Pulling in Environment Variables
var mongoose = require('mongoose');
// DB Authentication Information
var dbUser = <database username>; //example_dot_com_user
var dbPass = Env.<name of password variable>; //Keep this value protected
// DB Location Information
var dbHost = <tenant URL>; //example.com
var dbPath = <name of tenant's database>; //example_dot_coms_db
var dbPort = Env.dbPort; //usually 27017
// DB Authentication Methods
var dbAuthMech = "SCRAM-SHA-1";
var dbAuthSrc = <name of admin database>; //"admin" by default
var dbAuth = "authMechanism=" + dbAuthMech + "&authSource=" + dbAuthSrc;
// Build Connection String
var dbConnection = "mongodb://" + dbUser + ":" + dbPass + "@" + dbHost + ":" + dbPort + "/" + dbPath + "?" + dbAuth;
// Connection to DB
module.exports = mongoose.createConnection(dbConnection);

Notice that mongoose.createConnection() is used rather than mongoose.connect(). The difference matters here: createConnection() returns an independent connection object, while connect() manages a single shared default connection, so supporting multiple tenants requires the former.

2. Create a model object file

const PaperLTS = {};
const LTS_connection = require('./path/to/connection_file');
const Model1 = require('./path/to/model1');
const Model2 = require('./path/to/model2');
.
.
.
const ModelN = require('./path/to/modelN');
PaperLTS.Model1 = LTS_connection.model('model1', Model1);
PaperLTS.Model2 = LTS_connection.model('model2', Model2);
.
.
.
PaperLTS.ModelN = LTS_connection.model('modelN', ModelN);
module.exports = PaperLTS;

Note that if you add new models to your application, you will need to update each model object file as well.
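To cut down on that duplication, the per-tenant model files could be generated from one shared list of schemas. Here is a rough sketch; the registerModels helper is hypothetical, and the connection can be anything exposing Mongoose's connection.model(name, schema) interface (demonstrated below with a stub):

```javascript
// Register every schema in `schemas` on the given connection and
// return an object of compiled models, mirroring the PaperLTS file above.
function registerModels(connection, schemas) {
  const models = {};
  for (const name of Object.keys(schemas)) {
    models[name] = connection.model(name, schemas[name]);
  }
  return models;
}

// Demo with a stub standing in for a real Mongoose connection.
const stubConnection = {
  model: (name, schema) => ({ modelName: name, schema: schema }),
};
const models = registerModels(stubConnection, { Model1: {}, Model2: {} });
console.log(Object.keys(models)); // [ 'Model1', 'Model2' ]
```

With this shape, adding a new model means adding one entry to the shared schema list instead of editing every tenant's model file.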

3. Require the new model object file in app.js

const Tenant1Connection = require('./path/to/model1_object_file');
const Tenant2Connection = require('./path/to/model2_object_file');
.
.
.
const PaperLTSConnection = require('./path/to/LTS_model_object_file');

Mechanism to Direct Visitors

Now here’s where the heart of the implementation is.

Once we have created the above configurations, the application needs to be able to direct users to use the correct database and the correct services. To accomplish this, we created a tenant middleware which is called at the beginning of every route:

_tenantMW: (req, res, next) => {
  return new Promise((resolveAll, rejectAll) => {
    //1. Issue Tenant Variables
    TenantLIB._selectTenant(req.headers.host)
      //2. Attach Tenant Variables to request
      .then((tenantVariables) => resolveAll({ tenantVariables: tenantVariables }))
      .catch((err) => rejectAll({ err: err, msg: "Error - Obtaining Tenant Variables" }));
  })
  .then((ret) => {
    console.log("URL --------------> ", ret.tenantVariables.paperUrl);
    req.tenantVariables = ret.tenantVariables;
    next();
  })
  .catch((err) => {
    console.error(err);
    res.status(404);
    res.json({
      type: "ERR",
      msg: err.msg,
    });
  });
}

The function that issues the tenant variables (_selectTenant) is simply a promise-wrapped switch statement:

//Environment variables
const Env = require('../env.js'); //File Pulling in Environment Variables
const tenantLib = {
  _selectTenant: (origin) => {
    return new Promise(function(resolve, reject){
      let tenantVariables = {};
      switch(origin) {
        case "https://secure.paperlts.com/static/":
        case "https://www.secure.paperlts.com/static/":
        case "https://secure.paperlts.com":
        case "https://www.secure.paperlts.com":
          tenantVariables = {
            paperUrl: "https://secure.paperlts.com",
            paperDb: Env.<DB name>,
            paperDbUser: Env.<DB User Variable name>,
            paperDbPass: Env.<DB password variable>,
            poweredBy: "paper LTS",
            defaultLogoURL: <URL or file path to Logo>,
            defaultImagePath: "path/to/image/folder/",
            .
            .
            .
            dbConnection: "PaperLTSConnection",
          };
          break;
        case "https://example.com":
        .
        .
        .
        default:
          return reject("Origin Invalid");
      }
      resolve(tenantVariables);
    });
  },
};
module.exports = tenantLib;

We now have tenant-specific variables available in the request object for all endpoints.

The only thing left to consider is how we invoke Mongoose to take actions on specific databases.

Notice that the value of dbConnection matches the name of the connection variable required in app.js. This is intentional: it is how we match a request to its connection when calling any Mongoose method.

In our design, we centralized all of our Mongoose methods into a single library file. Each method is also wrapped in a promise.

I highly recommend this structure because it allows your higher-level methods to compose Mongoose calls. This pattern also happened to make it really easy to implement our host header separation because we only had to rewrite one file:

//Environment variables
const Env = require('../env.js');
//Aggregating Connections
const Tenant1Connection = require('./path/to/model1_object_file');
const Tenant2Connection = require('./path/to/model2_object_file');
.
.
.
const PaperLTSConnection = require('./path/to/LTS_model_object_file');
const connectionList = {
  Tenant1Connection: Tenant1Connection,
  Tenant2Connection: Tenant2Connection,
  .
  .
  .
  PaperLTSConnection: PaperLTSConnection
};
const mongoosePromises = {
  // Promise Based Find Method
  _dbFindAllPM: (database, collection, request, options) => {
    return new Promise(function (resolve, reject) {
      connectionList[database][collection].find(request, options, function (err, data) {
        if (err) {
          console.log("Error Occurred");
          reject(err);
        } else if (!data) {
          console.log("No Data Found");
          reject(data);
        } else {
          resolve(data);
        }
      });
    });
  },
  .
  .
  .
};
module.exports = mongoosePromises;

This just shows one example of a mongoose method wrapped in a promise.

The database argument is fed the req.tenantVariables.dbConnection value, which routes the call to the appropriate database.

The connection argument is fed the model to act on. Take the following example from an endpoint:

mongoosePromises._dbFindAllPM("PaperLTSConnection", "Contracts", {_status: "Completed", _creator: "Shane Fast"})

This fetches all of my completed contracts in the paper LTS database. In practice, "PaperLTSConnection" would be generalized by passing req.tenantVariables.dbConnection, so the query hits the correct database depending on where the visitor is arriving from.
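Putting the pieces together, a generalized endpoint looks something like the sketch below. The handler and the stubbed library are illustrative, not our production code; the stub merely records which connection name it was asked to use:

```javascript
// Stub standing in for the mongoosePromises library above.
const mongoosePromises = {
  _dbFindAllPM: (database, collection, request) =>
    Promise.resolve({ database: database, collection: collection, request: request }),
};

// A generalized handler: the tenant middleware has already attached
// tenantVariables, so the same code serves every tenant.
function listCompletedContracts(req) {
  return mongoosePromises._dbFindAllPM(
    req.tenantVariables.dbConnection, // e.g. "PaperLTSConnection"
    "Contracts",
    { _status: "Completed", _creator: req.query.creator }
  );
}

// Simulated request, as the middleware would have decorated it.
listCompletedContracts({
  tenantVariables: { dbConnection: "PaperLTSConnection" },
  query: { creator: "Shane Fast" },
}).then((r) => console.log(r.database)); // "PaperLTSConnection"
```

Nothing in the handler is tenant-specific; the middleware decides where the data comes from.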

The Benefits

After implementing this strategy, we only have to add two new files and edit two existing files for each new environment (plus some third-party configuration). This replaces provisioning additional servers and several manual configurations, saves hours of maintenance each week, and lets us shut down unused resources (heavily reducing our hosting costs).

The switch took only a few days of concentrated effort, and I recommend it to anyone running multiple databases or highly similar infrastructure.

Hopefully, this quick rundown gives you a general idea of how you can shrink your infrastructure using coding solutions.

I am not arguing that coding solutions are always superior to infrastructure solutions, but rather that you should use whichever is more appropriate to your team's makeup.

In our case, this solution will be easier to maintain and extend moving forward, but the trade-off might be different for other teams. Use whichever helps your team, users, wallet, and sanity the most!

If you found this valuable or entertaining, please follow the blog, and I’ll continue to post more tech goodness. Thanks for reading!
