Your analytics platform gone rogue, Part 1: Unforeseen costs
“The costs of my data analytics platform are higher than expected!”
Introducing the “Your data analytics platform gone rogue” series
The series “Your data analytics platform gone rogue” offers guidance on designing a strategy for architecting a data analytics platform by discussing things that can go wrong while building or maintaining one. Each entry starts from a troublesome scenario that can become reality when you begin building a data analytics environment or already have one running in production. The series then dives into what you can do in the early stages of creating a data analytics platform, such as designing the architecture, creating an effective workflow, and establishing best practices, to lower the chances of such unwanted situations.
This first entry addresses the scenario “The costs of my data analytics platform are higher than expected!” I start by introducing the type of data analytics platform that I am referring to throughout the series. Then I share my thoughts on preventing unexpected platform costs. Please keep in mind this article reflects my own opinion based on my work in this area and does not necessarily reflect the opinions of my employer, Microsoft.
What is a modern data analytics platform?
Let’s start with a quick explanation of how a data analytics platform is defined in this article. In general, a data analytics platform is a solution for collecting, organizing, and making sense of data. The need for an analytics platform is present in most large or data-driven companies that have many sources generating large amounts of data. This data is often critical to keep business processes running. In addition, understanding this data can provide indispensable insights for decision-making: it forms the basis for machine learning and AI to tap into their potential for predictive analytics.
A data analytics platform aims to present a solution to these requirements. Many companies offer solutions for a complete analytics platform or building blocks for customized platforms that operate either in the cloud or on-premises. Choosing among these options depends on specific needs and particular IT environments.
This article is intended to provide context to help you build your own analytics solution with cloud services; however, some learnings might be relevant in different contexts, too. Creating a modern analytics platform with cloud technology often aligns with a strategy of building a more efficient and future-proof data environment. The advantages of implementing a data analytics platform in the cloud are manifold, such as improved cost management, automation, and scalability. These advantages, among others, are currently driving IT teams to investigate cloud analytics solutions that work for their needs.
The image below shows a commonly used architecture for building a data analytics platform with cloud technology.
For this article, it is important to understand that this architecture describes a process commonly used by data engineers to create an environment that meets a company’s data needs. Data engineers must decide on the types of sources and documents to include in the platform, how to get them into the data environment, the form in which the data will be stored, the kinds of transformations and preparations necessary to analyze the data, and how the data will be analyzed and served to external applications.
Aside from decisions related to the illustrated architecture, many additional details must be considered when designing a solution that is robust, efficient, and future-proof — and these can be easily overlooked. It is exactly this lack of consideration that often causes problems. The famous quote often ascribed to Benjamin Franklin, “By failing to prepare, you are preparing to fail” couldn’t fit this context any better.
Preventing unexpected platform costs
Many IT teams are hesitant to change their environment because they feel anxious about scenarios in which a new solution doesn’t deliver on its promises. Alongside technical capabilities, the costs of a solution play a critical role in product adoption. Usually, an IT team has some idea of the costs associated with a new analytics platform. Costs turn out higher than expected in two common situations: after an IT team has implemented a full solution but overlooked necessities in solution design, or after the organization has used the platform for some time and is confronted with unexpected costs. As a result, I group preventive measures into two categories that should both be part of upfront strategy planning: 1) Building the platform, and 2) Managing the platform.
Building the platform
When thinking about a new analytics platform, you must consider many elements, such as choosing the right products, deciding on the employees to involve, creating the platform, managing and using the platform, selecting evaluation criteria to measure the performance of the platform, and many more.
Because there are many perspectives to consider during platform development, let’s converge on what you must think about in this stage to prevent unexpected platform costs. The key is to create a complete architecture design and overview of all necessary technical resources so that you can make a reasonable estimation of the costs of the analytics platform. While it’s possible to make a reasonable estimation based on the projected amount of data to go through the platform and the cloud resources you expect to use, inevitable changes and unpredictability might cause some unease. Costs that are higher than expected can have many causes, such as unforeseen circumstances or misestimations.
At first you might think that high costs simply result from inaccurate estimation. But making upfront cost calculations is often not simple at all: calculating costs becomes more difficult as architectural complexity grows and uncertainties compound. As I’ve noted, realistic cost estimation starts with designing a complete architecture that includes resources with the required level of performance, the right storage types, and the services related to operating and maintaining the platform, such as networking, security, IAM (identity and access management), and monitoring, which often result in hidden costs. Also keep in mind that the cost of services can vary among locations, so you need to consider all geographical areas where your resources will be deployed. The key message here is to base your initial cost estimation on an architectural design that covers all the requirements you have for the platform.
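To make this concrete, here is a minimal sketch of an upfront estimate that sums per-resource costs, includes often-overlooked items such as networking and monitoring, and applies a regional price multiplier. All rates, region names, and multipliers are illustrative assumptions; real figures come from your cloud provider’s pricing calculator.

```python
# Hypothetical base monthly rates (USD) for resources in the architecture.
# The "hidden" operating costs (networking, monitoring) are listed explicitly
# so they are not forgotten in the estimate.
BASE_MONTHLY_RATES = {
    "ingestion": 250.0,
    "storage_hot_per_tb": 20.0,
    "compute": 1200.0,
    "networking": 90.0,   # often-overlooked operating cost
    "monitoring": 60.0,   # often-overlooked operating cost
}

# Hypothetical regional multipliers: the same service can cost more or less
# depending on where it is deployed.
REGION_MULTIPLIER = {"westeurope": 1.00, "brazilsouth": 1.25}

def estimate_monthly_cost(storage_tb: float, region: str) -> float:
    """Estimate the total monthly platform cost for one deployment region."""
    multiplier = REGION_MULTIPLIER[region]
    total = 0.0
    for resource, rate in BASE_MONTHLY_RATES.items():
        if resource == "storage_hot_per_tb":
            total += rate * storage_tb * multiplier
        else:
            total += rate * multiplier
    return round(total, 2)

# Example: 10 TB of hot storage in two candidate regions.
print(estimate_monthly_cost(10, "westeurope"))   # 1800.0
print(estimate_monthly_cost(10, "brazilsouth"))  # 2250.0
```

Even a toy model like this makes the two key points visible: hidden operational services add up, and the same architecture priced in a different region can yield a noticeably different bill.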
In the process of selecting products and services, having a clear idea of how you will use the analytics platform is essential. This can be difficult to forecast as you might have to make predictions related to the company’s business. Important information to obtain here concerns the amount of data that is processed, the storage of data over specific time periods, the movement of data in the platform, and the activities that will be performed to ingest, process, transform, and query data. All these variables vary with changing data analytics goals, which can be triggered by company developments that cause new analytics needs.
For example, the requirements for data storage locations can change due to geographical expansion of the company, or the amount of data stored and analyzed might increase significantly because of the acquisition of a large customer or a merger with another company. Because it can be challenging to predict such business events, you can make this process easier by planning to utilize the platform for a specific use case. A specific use case allows you to have a clear idea of why, how, where, when, and how long the analytics platform is to be used, such as, for example, processing historical data to make predictions. Otherwise, you might need to accept that your initial cost estimation is more of an educated guess than a number you can commit to for making cost decisions.
A great way to design a complete architecture with thought given to all related considerations is by working with a proof-of-concept (POC). A POC is a method to test and evaluate the solution you want to build but without incurring high costs or affecting the business. The idea is to build the analytics platform and use it for a very specific use case with a small amount of data. This can enable you to encounter unforeseen challenges or discover new practices in the process of building and using the solution. It’s a great way to minimize cost estimation miscalculations while providing evidence that the new solution meets the business requirements.
Managing the platform
Another part of the strategic undertaking to design, create, and operate the analytics platform relates to keeping costs within boundaries during its operation. This entails security, data lifecycle management, performance optimization, monitoring platform events, and dealing with unforeseen incidents. Covering all of this is too much for one article, so I elaborate on a few considerations for preventing operating costs from being higher than expected. When bringing your analytics platform to production, it is critical to have the right configurations and practices in place for platform management, such as data accessibility, data lifecycle management, resource performance monitoring, and alerts indicating risky developments.
First, an effective data governance strategy is indispensable for preventing unexpected costs. Data governance refers to making sure that there are controls in place around the data, its content, structure, use, and safety. This means that you must understand the type of data on the platform, the quality of the data, the usability of the data, who’s accessing it, who’s using it, what they’re using it for, and if the use cases are in line with expected use. While these practices are relevant for data security in general, they are also important for preventing unnecessary costs because unexpected or incorrect use of the platform might have an adverse impact. Defining responsibilities, roles, policies, and rules helps prevent unexpected use of the platform. In defining such practices, you can use different perspectives.
For example, imagine that an employee performs an expensive query on the data in the platform while not knowing that using it is restricted to specific use cases and specific people inside the organization. This is a situation calling for identity and access management (IAM), which refers to practices concerning platform accessibility and usability. Making sure that the access privileges of platform users are safe and effective is an often-overlooked part of platform management. It’s generally advisable to set roles according to the principle of least privilege, which means that users are granted access to perform only the activities necessary for their job. In addition, it’s important to keep track of access privileges and change them when a user’s purpose or task changes.
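The principle of least privilege can be sketched in a few lines. The roles, users, and action names below are hypothetical; in practice you would rely on your cloud provider’s IAM service rather than implement this yourself.

```python
# Each role grants only the actions required for that job (least privilege).
ROLE_PERMISSIONS = {
    "viewer": {"read_data"},
    "data_analyst": {"read_data", "run_query"},
    "data_engineer": {"read_data", "run_query", "write_data", "manage_pipeline"},
}

# Hypothetical user-to-role assignments; these should be reviewed whenever
# a user's purpose or task changes.
USER_ROLES = {"alice": "viewer", "bob": "data_engineer"}

def is_allowed(user: str, action: str) -> bool:
    """Return True only if the user's role explicitly grants the action."""
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())

# A viewer cannot run a potentially expensive query; an engineer can.
print(is_allowed("alice", "run_query"))  # False
print(is_allowed("bob", "run_query"))    # True
```

The design choice worth noting is the default-deny behavior: an unknown user or role grants nothing, so access must be explicitly assigned rather than explicitly revoked.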
Another aspect of data governance is data lifecycle management (DLM), which refers to putting practices and rules in place for data that enters the analytics platform. The purpose here is to ensure the safe, effective, and compliant use of the data throughout its lifecycle. This means, for example, deleting data from the platform that is not useful anymore, storing data in the right storage buckets to minimize costs, cleaning data to filter out what’s unnecessary, and categorizing data for its usage purposes. Implementing such practices prevents, for instance, a situation in which an unnecessary — and costly — amount of data is stored and processed in the platform.
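A lifecycle rule of this kind can be expressed as a simple age-based decision. The thresholds below are illustrative assumptions; most cloud storage services offer built-in lifecycle policies that implement this logic for you.

```python
# Illustrative lifecycle thresholds: move rarely used data to cheaper
# storage after 30 days, and delete data that is no longer useful after
# one year. Tune these to your compliance and business requirements.
TIER_TO_COOL_AFTER_DAYS = 30
DELETE_AFTER_DAYS = 365

def lifecycle_action(age_days: int) -> str:
    """Decide what to do with a dataset based on its age in days."""
    if age_days >= DELETE_AFTER_DAYS:
        return "delete"
    if age_days >= TIER_TO_COOL_AFTER_DAYS:
        return "tier_to_cool"
    return "keep_hot"

print(lifecycle_action(7))    # keep_hot
print(lifecycle_action(90))   # tier_to_cool
print(lifecycle_action(400))  # delete
```

Encoding the rules explicitly, instead of leaving data wherever it landed, is exactly what prevents an unnecessary and costly amount of data from sitting in expensive hot storage indefinitely.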
Last, but certainly not least, monitoring the platform and optimizing its performance are fundamental to preventing unnecessary costs. Monitoring refers to keeping track of platform health by collecting, analyzing, and acting on data that is transmitted by resources used in the analytics platform. This data consists of either metrics, which describe some aspects of the service at a particular point in time, or logs, which contain different types of data organized into records, such as events. Monitoring this information indicates how the platform is performing and allows you to proactively identify any resulting issues.
Part of monitoring is performance optimization: the actions you take to help platform resources perform well at an optimal cost. While the practices to optimize performance depend on the resources you are using, here’s a general explanation: some decisions made beforehand have cost implications, such as whether to over-provision resources to absorb peaks in demand or to increase compute size only when peaks arise. In general, it’s beneficial to tune resources based on demand, as this will probably result in a lower monthly bill; however, the right choice depends on your situation.
Next to making upfront decisions, you must also monitor any unwanted behavior in the platform, such as costs that are increasing too fast for some resources, as well as unexpected peaks in resource usage. This can be done by setting alerts for these types of behaviors. It’s important not to forget to assign clear responsibilities for who is to monitor these alerts and create a clear plan of action for when these challenges arise.
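The idea behind such an alert can be sketched as a comparison of the latest daily cost against a recent baseline. The alert factor and cost figures are illustrative assumptions; in practice, cloud providers offer budget and cost-alert features that do this for you against real billing data.

```python
# Alert when the most recent day's cost is 50% above the recent average.
# The factor is an assumed tuning parameter, not a recommended value.
ALERT_FACTOR = 1.5

def cost_alert(daily_costs: list[float]) -> bool:
    """Return True if the most recent day's cost spiked above the baseline
    computed from all preceding days."""
    *history, today = daily_costs
    baseline = sum(history) / len(history)
    return today > baseline * ALERT_FACTOR

# Stable costs: no alert fires.
print(cost_alert([100, 105, 98, 102]))  # False
# Sudden spike: alert fires, so the responsible person can investigate.
print(cost_alert([100, 105, 98, 180]))  # True
```

Whatever mechanism triggers the alert, the point from the text stands: the alert is only useful if someone is clearly responsible for acting on it and a plan of action exists for when it fires.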
Summary
If you feel anxious about putting a new IT solution in place due to cost uncertainties, hopefully this article has helped you feel more at ease by showing what you can do to minimize unexpected costs around a new cloud analytics platform. In summary, I presented two broad perspectives for the strategic planning phase.
First, in planning to build the platform, it’s important to create a complete architectural view of all the products and services you need. To achieve this, clarify platform requirements, data analytics goals, and how employees will use the platform. Performing a proof-of-concept is useful to fully understand the elements of the architecture and the usage of the platform.
Second, think about how to manage the platform when it’s put into production to help you identify best practices for preventing unexpected costs. By creating an effective data governance strategy, prioritizing identity and access management, practicing data lifecycle management, establishing platform monitoring and performance optimization practices, and creating action plans with clear responsibilities on alerts for unwanted behavior, you can prevent many unexpected scenarios of rising costs.
Marjam Bahari is on LinkedIn.