Making the cloud work for “the little guy”

Glen Dyszynski
State of Analytics

--

Adding AWS to your analytics platform is a route chosen by organizations big and small. It’s particularly important for “the little guy” to get it right, however, as he has less time and money with which to make mistakes and retool. In this article, I share several lessons we at Slalom have learned from our experience delivering AWS projects for clients in the San Francisco Bay Area and beyond.

Slalom has helped design and build AWS environments for clients who have no knowledge of the cloud (nor a BI team, for that matter), and for clients with plenty of skill but simply no bandwidth. We’ve seen clients working in well-managed AWS environments and others operating by the skin of their teeth. Along the way, we’ve come to recognize themes across organizations that have highly functional environments, while also noting common mistakes that befall others.

Why the cloud?

Put simply, the cloud is the low-risk option for organizations that need to get it right on the first try. For small organizations with complex, high-volume, and disparate data sources, the reasons we’ve seen for choosing the cloud (and AWS in particular) are fairly consistent:

  • Need for fast turn-around
  • Desire for minimal IT involvement and simple maintenance
  • Desire to see it in action before having to go all-in
  • Need for future scalability with minimal up-front commitment

AWS delivers on all of these priorities. It’s a great choice (if done right).

Optimizing for the little guy

The cloud allows “the little guy” to operate like a big dog, without the need for in-house hardware and server maintenance. That said, we’ve seen things get out of hand in the fast-moving world of startups and small businesses. As data volumes balloon and timelines tighten, an organization can easily fall into the expensive trap of simply adding more processing power rather than taking the time to optimize its design. While this can be an issue at an organization of any size, small businesses are particularly prone to it, and they tend to have less financial capacity to absorb issues once they start to snowball. Thus, it’s critical that “the little guy” take the time to build intelligently from the start.

Here are several areas where our clients run into scaling issues:

  • No data lifecycle management -- keeping more historical data than necessary
  • Using their analytics system to house all raw data, rather than just the data needed for analysis
  • Rushing through the security model
  • Ignoring disaster recovery

Preparing for scale

While these pitfalls are common, there are plenty of methods an organization can employ to avoid them. Here are a few considerations to get started on the right foot.

Define your archiving standards

  • How much data is enough? Yes, we all want ten years of history for every country across all metrics, but there is a trade-off in cost, query time, and maintenance overhead. A client keeping three years of raw data in their Redshift instance, for example, may only be doing year-over-year trending for a handful of high-level metrics. Creating an aggregated, multi-year analysis table and archiving a year of raw data to S3 could reduce their cluster size by ~30%. Define a rolling time window after which data will be archived (e.g., stored as flat files in S3) and give yourself, your team, and your wallet a break.
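
As a rough illustration of what that archive step could look like, here is a minimal sketch using Python and boto3 with the Redshift Data API. The cluster name, table, bucket, and IAM role ARN are all hypothetical placeholders, not values from any particular implementation.

```python
# Sketch: archive raw Redshift rows older than two years to S3 as Parquet,
# then remove them from the cluster. All identifiers are example values.
import boto3

rsd = boto3.client("redshift-data")

ARCHIVE_SQL = """
    UNLOAD ('SELECT * FROM raw.events WHERE event_date < DATEADD(year, -2, CURRENT_DATE)')
    TO 's3://example-analytics-archive/raw/events/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload-example'
    FORMAT AS PARQUET;
"""

CLEANUP_SQL = "DELETE FROM raw.events WHERE event_date < DATEADD(year, -2, CURRENT_DATE);"

# Note: the Data API is asynchronous; in practice, poll describe_statement and
# run the DELETE only after the UNLOAD has finished successfully.
for sql in (ARCHIVE_SQL, CLEANUP_SQL):
    rsd.execute_statement(
        ClusterIdentifier="example-analytics-cluster",  # hypothetical cluster
        Database="analytics",
        DbUser="etl_user",
        Sql=sql,
    )
```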

Design for the consumers of your data

  • Many customers consider how to get data into the system, but far fewer plan well enough for how to get the insights out. High volumes of data alone don’t answer your questions; they just lead to long ETL processes, slow query response times, and high cost. Plan with the consumers of your data and their burning questions in mind, and decisions about what belongs in the system versus what doesn’t become much simpler.

Imagine your system with 100 users

  • Who should have access to what? Should your sales data be visible to the entire company? Should your junior supply chain analyst have write access to your customer master? Define these boundaries sooner rather than later: you’ll avoid untangling a spaghetti security model down the road and reduce the risk of PII exposure in the meantime.
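
One way to make those boundaries concrete, assuming a Redshift warehouse, is to express them as group-level grants rather than one-off user permissions. The group, schema, and table names below are hypothetical, and the statements are only a sketch of the idea.

```python
# Sketch: define access as database groups and grants, not ad hoc permissions.
# All names are example values.
import boto3

rsd = boto3.client("redshift-data")

STATEMENTS = [
    "CREATE GROUP supply_chain_analysts;",
    "GRANT USAGE ON SCHEMA analytics TO GROUP supply_chain_analysts;",
    "GRANT SELECT ON analytics.customer_master TO GROUP supply_chain_analysts;",
    # Read-only by design: write access stays with the ETL role, not analysts.
    "REVOKE INSERT, UPDATE, DELETE ON analytics.customer_master FROM GROUP supply_chain_analysts;",
]

for sql in STATEMENTS:
    rsd.execute_statement(
        ClusterIdentifier="example-analytics-cluster",
        Database="analytics",
        DbUser="admin_user",
        Sql=sql,
    )
```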

Make disaster recovery a priority

  • What happens if Steve forgets that he is in Production when he drops that one critical table? Maybe the Dev environment can be used to replace it… when was the last time it was refreshed? 3 weeks ago? 2 months? We all want to believe we’ll be careful, and that if we can just get through this one implementation, then we’ll sort out our backup strategy. Do it now.
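
A minimal starting point, assuming a Redshift cluster, is simply to lengthen automated snapshot retention and take a manual snapshot before risky changes. The identifiers and retention period below are example values.

```python
# Sketch: make backups part of the build, not an afterthought.
import boto3
from datetime import datetime, timezone

redshift = boto3.client("redshift")

# Keep automated snapshots for a week instead of the bare minimum.
redshift.modify_cluster(
    ClusterIdentifier="example-analytics-cluster",
    AutomatedSnapshotRetentionPeriod=7,
)

# Take a manual snapshot before risky changes (schema migrations, big deletes).
stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M")
redshift.create_cluster_snapshot(
    SnapshotIdentifier=f"pre-release-{stamp}",
    ClusterIdentifier="example-analytics-cluster",
)
```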

Consider resiliency

  • Can you or your customers live without a piece of your system until your DR strategy is implemented? If not, you need to plan for resiliency and engineer for failure. AWS services are easy to replicate across availability zones and regions, and load balancers can stabilize your environment under heavy load. As a small business, you can build in the same reliability and quality of experience as the billion-dollar company you’re competing with. Design your system with the assumption that failures are a normal part of operations and let the strength of AWS catch you gracefully. Strong uptime has become table stakes for most companies with an online presence.
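
One inexpensive piece of that resiliency story on Redshift, for example, is copying snapshots to a second region so a regional outage doesn’t take your only backup with it. The regions and retention period below are placeholder values.

```python
# Sketch: engineer for failure by keeping snapshot copies in a second region.
import boto3

redshift = boto3.client("redshift", region_name="us-west-2")

redshift.enable_snapshot_copy(
    ClusterIdentifier="example-analytics-cluster",
    DestinationRegion="us-east-1",
    RetentionPeriod=7,  # days to keep copied automated snapshots
)
```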

Bang for your buck

A small business may generate nearly the same volume of data as a large enterprise, but it doesn’t have the same financial resources to sustain design inefficiencies. It’s therefore important for small businesses to manage their environment carefully so that costs stay under control. Here are several things that are frequently overlooked but can greatly reduce the cost of your implementation.

  • Use reserved instances for at least a portion of your DB cluster
  • Tag your resources (see the sketch after this list):
      • AWS allows users to assign custom tags to resources (clusters, servers, etc.)
      • Tagging resources by department/function upon creation allows you to evaluate usage, track/forecast/assign costs, and address problem areas quickly (e.g., a runaway POC)
  • Project how your usage will scale and set up indicators (record counts, DB size, etc.) to get ahead of excessive growth. Hold periodic check-ins to adjust your plan.
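
To make the tagging point concrete, here is a small sketch (Python and boto3, with hypothetical names and dates) that tags a cluster by department and then pulls monthly spend grouped by that tag from Cost Explorer. Note that a tag only appears in cost reports after it has been activated as a cost allocation tag in the Billing console.

```python
# Sketch: tag resources at creation time, then break costs out by tag.
import boto3

redshift = boto3.client("redshift")
ce = boto3.client("ce")  # Cost Explorer

# Tag an existing cluster by owning department and function.
redshift.create_tags(
    ResourceName="arn:aws:redshift:us-west-2:123456789012:cluster:example-analytics-cluster",
    Tags=[
        {"Key": "Department", "Value": "Marketing"},
        {"Key": "Function", "Value": "POC"},
    ],
)

# Monthly spend grouped by department, so a runaway POC shows up quickly.
report = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-04-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "Department"}],
)
for period in report["ResultsByTime"]:
    for group in period["Groups"]:
        cost = group["Metrics"]["UnblendedCost"]["Amount"]
        print(period["TimePeriod"]["Start"], group["Keys"], cost)
```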

Force multiplication

Speaking of saving costs, we’ve often seen clients overlook the full scope of services available for managing their AWS environment. Instead, many organizations choose to build something in-house or, worse, to go without. In doing so, they place more administrative burden on their teams and often reduce the effectiveness and reliability of their platform. There is a host of managed services and 3rd party solutions built on AWS to simplify development and administration. If there isn’t a solution offered by AWS directly, check the AWS Marketplace for 3rd party vendor solutions. We recommend taking full advantage of these services to maintain a stable system and allow your engineers to focus on managing code and driving insights rather than maintaining servers.

Do more with less using managed services & API calls:

  • Messaging/mail
  • Serverless architecture (Lambda)
  • Scaling
  • Alerting (see the sketch after this list)
  • ETL (Matillion)
  • Campaign services
  • 3rd party vendors in the AWS marketplace
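
As one small example of the alerting item above, a CloudWatch alarm that publishes to an existing SNS topic can replace a home-grown monitoring script in a dozen lines. The cluster name, threshold, and topic ARN here are hypothetical.

```python
# Sketch: lean on managed monitoring instead of building your own.
# A CloudWatch alarm notifies an (assumed, pre-existing) SNS topic when the
# cluster's disk usage stays high.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="redshift-disk-over-80-percent",
    Namespace="AWS/Redshift",
    MetricName="PercentageDiskSpaceUsed",
    Dimensions=[{"Name": "ClusterIdentifier", "Value": "example-analytics-cluster"}],
    Statistic="Average",
    Period=300,               # evaluate every 5 minutes...
    EvaluationPeriods=3,      # ...for 15 minutes straight
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-west-2:123456789012:example-analytics-alerts"],
)
```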

Where consultants fit in

While we strive to empower our clients as much as possible, there are instances where a quick, focused engagement from a consultant can save enormous amounts of time and money. As we discussed in the “Force Multiplication” section, for example, choosing the right tools that already exist can save development time and improve system reliability. Paying up front to have a knowledgeable advisor evaluate your system landscape and identify the most useful AWS tools is a good way to get moving fast and avoid having to backtrack.

There are a few stages in your cloud implementation effort where consultants can add value:

  1. Getting started on the right foot (designing for scale)
  2. Course correcting once you’re moving (identifying and preventing looming problems)
  3. Retooling if you’ve gotten too big too fast (optimizing underperforming systems)

In summary

The cloud offers the opportunity for small players to stand up against incumbent forces like never before. Using the power of these new tools wisely will allow your business to offer a world-class customer experience, produce critical analytics, and scale as your footprint grows. We hope that these lessons, gained over the course of Slalom’s implementation experience, will serve your organization well.
