Building a Modern Data Stack for the Modern Business

Tenny Modiselle
Slalom Data & AI
Mar 9, 2022 · 9 min read
Photo by Annie Spratt on Unsplash

Over the last two years, the COVID-19 pandemic has forced many businesses to reassess decades-old business models or face closing permanently. While many shoppers turned to digital commerce to avoid exposure, others made use of new or expanded pickup and delivery options from local brick-and-mortar stores, thanks in part to the plethora of third-party delivery providers. Rapid changes in consumer buying behavior, shifting customer preferences, and the demands of managing inventory and supply chains have created both challenges and huge opportunities for retailers.

With the seismic shift to the virtual marketplace, business leaders have reinforced strategies for collecting, managing, interpreting, and leveraging data as a business asset, and for getting the most value out of that data. For many, the phrase “data is the new oil” has again become popular to highlight the importance of consumer data. Harnessing the power of data by investing in a modern data stack is the best way for organizations to drive business resiliency, gain customer insight, and deliver exceptional customer experiences in today’s world.

Retail, ecommerce, and consumer data

While omnichannel has been on the rise in the last decade, the surge in online purchasing driven by the pandemic has further emphasized the importance of ecommerce as an effective business strategy. Customer registration information, payment methods, delivery addresses, browser cookies, and even customer sentiments from online reviews and social media have provided retailers with an unprecedented treasure trove of data, and this is just the tip of the iceberg when it comes to the amount of data being generated. “Big data” has become more than just a buzzword. It’s now an essential business asset and key differentiator for market dominance, giving organizations the potential to create unified customer views and a holistic analysis of business performance.

Retail data analysts and data scientists are now focused on using the plethora of available data to understand customers and build personalized experiences for them. Even the smallest companies have jumped on the opportunity to build technology stacks that funnel customer data into a central location, transform it into actionable data, create a plan for acting on the data, and then adjust their processes accordingly. At the same time, companies want to achieve this at a low cost, with low-code or no-code tools that allow for nimble provisioning of insights and improved accessibility of data across the organization. Enter: the modern data stack.

The modern data stack

Legacy data infrastructure has existed for decades. However, the evolution of data has uncovered problems that limit the potential benefit of these platforms. The term “modern data stack” generally refers to the best-of-breed cloud technologies that are modular in capabilities, but collectively provide a unified data platform that can process massive volumes of data with ease.

The rationale for adopting the modern data stack is comparable to the pay-per-trip model of car-sharing popularized over the last decade by companies such as Zipcar.

Like buying a car, legacy or on-premises data infrastructure incurs upfront costs and long-term maintenance investments, while the pay-per-trip model of serverless cloud platforms allows businesses to adapt nimbly to changing needs over time.

The components of the modern data stack are easy to set up, with no upfront cost and a pay-as-you-go pricing model that also allows for plug and play. Each component can be optimized for ease of data collection with a no-code approach, for centralization of work in the data warehouse, and for speed to actionable insights with limited need for complex engineering. The intention of combining distinct tools is for each technology to have a single, clearly defined role in the stack and to perform it well, allowing organizations to swap parts of the stack as business needs evolve and to avoid vendor lock-in. These components are designed to accommodate growing data demands without costly, lengthy delivery windows and to promote self-service for nontechnical employees.

Below, we outline the four main components of a modern data stack: a data warehouse or lake for cloud storage, data pipelines for data integration, a transformation engine for cleaning up the raw data, and an analytics platform for visualizations and access to insights that will drive operations and AI capabilities. Together, they demonstrate how organizations can get started and what features to consider when assembling a modern data stack.

Join data leaders from Slalom, Fivetran, Snowflake, and Dataiku on March 24 for a fireside chat as they discuss the impact a cloud-optimized modern data stack can have on your business.

Data warehouses

Numerous data tools have emerged for each component of the modern data stack, each highly specialized in its portion of the data lifecycle. As companies strive to become more data-driven and customer-centric, the first hurdle is to decide where data is stored. A data warehouse or data lake has become the de facto hub for data storage. Investing in a modern data warehouse, built in the cloud and for the cloud, gives companies multiple ways to deliver insights at speed while ensuring integrity and consistency.

Cloud data platforms such as Snowflake have come to dominate the data storage component of the stack by providing customers with a zero-maintenance, minimal-administration solution that performs at scale. The platform can also receive data in any format: structured, semi-structured, and unstructured. To differentiate itself, Snowflake offers a fully serverless, shared-data architecture with decoupled compute that allows customers to pay only for what they use. Snowflake Data Marketplace also creates monetization opportunities for data vendors and data service providers.

For data consumers, Snowflake provides seamless access to readily available data sets that can improve strategic decision-making without the need for data movement. Retail data analysts and data scientists can now access published data and combine it with their internal sales data with only a few clicks and no code required. Data sets such as COVID-19 infections by location, household demographics, population density, and more provide accurate information that enables strategic choices leading to efficient operations and sustainable growth. Selecting a data warehouse that offers nearly infinite expansion, data democratization opportunities, and scalability optimized for cloud operation is the first step to building a modern data stack that will enable you to transform your business.
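
For readers who want to see what combining published data with internal sales data amounts to, here is a minimal sketch in plain Python. It is not Snowflake's interface, and every record, field name, and function name below is hypothetical; in practice the same join would typically run inside the warehouse.

```python
# Hypothetical internal sales records, keyed by postal code.
internal_sales = [
    {"postal_code": "30301", "weekly_sales": 12500.0},
    {"postal_code": "30302", "weekly_sales": 8300.0},
    {"postal_code": "30303", "weekly_sales": 4100.0},
]

# Hypothetical published data set: household counts by postal code.
published_demographics = [
    {"postal_code": "30301", "households": 5000},
    {"postal_code": "30302", "households": 4150},
]


def enrich_sales(sales, demographics):
    """Left-join sales rows to demographic rows on postal_code."""
    by_code = {row["postal_code"]: row for row in demographics}
    enriched = []
    for sale in sales:
        demo = by_code.get(sale["postal_code"])
        households = demo["households"] if demo else None
        # Sales per household is only defined when demographics exist.
        per_household = (
            round(sale["weekly_sales"] / households, 2) if households else None
        )
        enriched.append(
            {**sale, "households": households, "sales_per_household": per_household}
        )
    return enriched


enriched = enrich_sales(internal_sales, published_demographics)
for row in enriched:
    print(row)
```

The left join keeps every internal sales row even when no published record matches, so gaps in the external data set stay visible instead of silently dropping stores.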

Data pipelines

Today, the speed at which relevant data is identified and collected has become one of the leading factors in reducing engineering costs. With a fully managed data ingestion tool that automatically integrates with your chosen cloud data warehouse, the modern data stack saves substantial time and money by eliminating the need to build ingestion pipelines from scratch. What was once a tedious task of developing, configuring, and implementing data collection programs is now handled by data connectors offered under a pay-per-use pricing model.

Technologies such as Fivetran excel at providing an array of pre-built connectors to hundreds of SaaS and on-prem data sources and operational databases, extracting, loading, and transforming data with only a few clicks. Fivetran provides reliable and robust data pipelines to collect and funnel data into the data warehouse with the necessary security, privacy, and governance applied along the way, all with zero coding and on a fully managed, serverless platform. Business users can now get data in the format and at the frequency of their choice from ecommerce, SaaS, external, and internal applications and make it available in the data warehouse quickly, using predefined connectors that support real-time or batch collection.

Transformation engines

Once data has been collected and stored, it’s time to analyze and activate it. At this point, retail analysts use standard SQL queries to combine and interpret the data and to establish repeatable rules that standardize it and prepare it for insights. With the growth of more complex, semi-structured data formats, analysts rely more on the capabilities of a data warehouse platform such as Snowflake to translate the data into more readable and usable formats. Data teams work natively on data inside the warehouse for higher productivity, increased scalability, better performance, and cost savings.

This transformation layer may be a simple implementation that uses native SQL to handle data, or an elaborate set of custom processes coupled with orchestration engines. Businesses that want to experiment quickly with the data for nimble insights may have data analysts start with the former approach and, when necessary, involve data engineers as the complexity of the transformations increases.
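
As an illustration of the simpler, SQL-first approach, the sketch below uses Python's built-in sqlite3 module as a stand-in for a cloud warehouse. The events, table, and view names are hypothetical. It flattens semi-structured JSON events into a table, then applies a repeatable SQL rule that standardizes the state codes before aggregating.

```python
import json
import sqlite3

# Hypothetical semi-structured order events, one JSON document per line.
raw_events = [
    '{"order_id": 1, "customer": {"id": "c1", "state": " ga "}, "total": 40.0}',
    '{"order_id": 2, "customer": {"id": "c2", "state": "GA"}, "total": 60.0}',
    '{"order_id": 3, "customer": {"id": "c3", "state": "fl"}, "total": 25.5}',
]

# An in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_orders "
    "(order_id INTEGER, customer_id TEXT, state TEXT, total REAL)"
)

# Flatten the nested JSON into plain columns and load it as-is.
rows = []
for line in raw_events:
    event = json.loads(line)
    customer = event["customer"]
    rows.append((event["order_id"], customer["id"], customer["state"], event["total"]))
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)

# The transformation: a view that standardizes state codes and aggregates.
conn.execute(
    """
    CREATE VIEW sales_by_state AS
    SELECT UPPER(TRIM(state)) AS state,
           COUNT(*) AS orders,
           SUM(total) AS revenue
    FROM raw_orders
    GROUP BY UPPER(TRIM(state))
    """
)

sales_by_state = conn.execute(
    "SELECT state, orders, revenue FROM sales_by_state ORDER BY state"
).fetchall()
print(sales_by_state)  # [('FL', 1, 25.5), ('GA', 2, 100.0)]
```

Because the raw table is loaded untouched and the cleanup lives in a view, the standardization rule can be changed later without reloading any data.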

Analytics platforms

With the needed data available, data analysts can understand what has happened up to this point and begin to predict what may happen in the future. They can begin evaluating how individuals shop and what products they buy, and establish shopping patterns for their customers. From there, the business can begin formulating ideas on, for example, what changes to make to its product range or pricing strategy and how to create personalized shopping experiences. Over time, the growing demand for and importance of analytics platforms have brought many tools to market, including open-source options such as Python, which can deliver great results with relatively little code.
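
A first pass at establishing a shopping pattern can be quite small. The sketch below uses only the Python standard library, with hypothetical purchase records and function names, to find how many purchases each customer made and which product each customer buys most often.

```python
from collections import Counter, defaultdict

# Hypothetical purchase history as (customer_id, product) pairs.
purchases = [
    ("c1", "coffee"), ("c1", "coffee"), ("c1", "filters"),
    ("c2", "tea"), ("c2", "coffee"), ("c2", "tea"),
]


def purchase_counts(rows):
    """Count purchases of each product, grouped by customer."""
    counts = defaultdict(Counter)
    for customer_id, product in rows:
        counts[customer_id][product] += 1
    return counts


def top_product_per_customer(rows):
    """Return each customer's most frequently purchased product."""
    return {
        customer_id: products.most_common(1)[0][0]
        for customer_id, products in purchase_counts(rows).items()
    }


def total_purchases_per_customer(rows):
    """Return how many purchases each customer made overall."""
    return {
        customer_id: sum(products.values())
        for customer_id, products in purchase_counts(rows).items()
    }


top_products = top_product_per_customer(purchases)
totals = total_purchases_per_customer(purchases)
print(top_products)  # {'c1': 'coffee', 'c2': 'tea'}
print(totals)        # {'c1': 3, 'c2': 3}
```

Summaries like these are a starting point for the product range, pricing, and personalization decisions described above; real analyses would add time windows, order values, and larger data sets.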

Businesses also use artificial intelligence (AI) to provide a clearer understanding of customers and begin building the foundation for predictive analytics. With well-curated data sets and robust processes for ingesting new data, AI allows teams to identify trends in historical data, predict future customer behavior, and operationalize the resulting AI outcomes. Tools such as Dataiku allow data analysts and business users to deliver advanced analytics that leverage data and machine learning with minimal coding needed. Dataiku Online is a managed service that facilitates data preparation for machine learning, building the models, and pushing the outcomes back into SaaS tools to make the insights actionable. Capabilities such as these bring agility and drive optimization across the business, from supply chains to customer churn prediction and prevention. It’s the analytics component of the modern data stack that allows data teams to deliver advanced analytics and data-driven decision-making across teams so organizations can respond quickly to rapidly changing market conditions.

Bringing it all together

Data, tools, and technology are nothing without people. Slalom’s modern culture of data framework helps organizations empower people to uncover insights from data to inform every decision. With just a few components of the modern data stack, teams can quickly and easily build pipelines to collect data from varied sources, store it in a platform that allows for high performance and scalability, and begin analyzing and predicting customer interactions to personalize experiences for better business outcomes. These four components (a data warehouse, data pipeline, transformation engine, and analytics platform), paired with a culture of experimentation and innovation, can be implemented to find solutions to the supply chain issues caused by the shrinking inventory and choked logistics that many retailers experienced during the pandemic. Dynamic teams, with limited handholding from IT, can leverage market and customer data and consumer sentiments from social media, along with data hidden across supply chains, to drive profitable revenue growth and great customer experiences.

Conclusion

The modern data stack has become a key player in the growing customer service channels built around upsell, cross-sell, and customer retention. As customers continue to navigate the pandemic, businesses must keep up with their needs and shifting behavior. Retailers that invested in the modern data stack to advance their analytics capabilities have been able to navigate the challenges of the pandemic era with greater resiliency than their counterparts who stayed with traditional business models. These businesses were better equipped to understand changing customer needs quickly and to serve those customers on their terms, using insights from personalization to earn long-term loyalty.

The modern data stack has proven able to adapt to disruptive market moves and has become a platform for experimenting with data to discover growth opportunities. Its flexibility and robustness provide an opportunity to build programs that improve business efficiency and put data and speed at the heart of the operating model. The ability to analyze the vast variety and volume of data coming into the enterprise at scale will differentiate businesses from their competitors.

Register now for the fireside chat on March 24 to learn more about how to unleash the power of your own data.

Slalom is a global consulting firm focused on strategy, technology, and business transformation. Learn more and reach out today.
