Transforming Digital Data at Sage

Going from zero to data hero in 100 days

Photo by Angie Harms / Creative Commons

We were given the following task: make sense of digital data in Sage. Fixed budget and 100 days. Let me share what went right, what went wrong and how we managed to succeed.

Spaghetti junction

Imagine a company that owns 10,000 websites, 20+ mobile apps and 1,200+ analytics accounts, managed by a number of independent teams that usually don’t talk to each other.

10,000 live websites owned

Understand the marketing cloud

Sage, like many companies, created its own marketing cloud by picking the best-in-class solution for each problem. This looks like a good idea until you try to connect the systems together.

Sage custom marketing cloud

Individual teams had skills and understanding in one or two solutions (vertical) but lacked understanding and skills across the entire solution portfolio (horizontal). For example, media teams were good at media but not necessarily at CRM, and vice versa.

User journey data was sitting across multiple platforms with, at best, very little interconnectivity.

This challenge was amplified by the technology vendors themselves. Each provider pushes its own ecosystem, and at the same time they are not very interested in connecting with competitors’ products. This is why you can see the most successful companies switching to a complete stack from a single vendor: Adobe Marketing Cloud, Google Analytics 360 Suite and Salesforce Marketing Cloud, to name a few.

Create a strategic data layer

As with many enterprise businesses, replacing existing platforms was out of the question. The impact on short-term revenue and business as usual (BAU) would be too high. We had to find another solution.

Data layer to the rescue! Don’t change the systems; change the input and output instead. In short, unify the data independently of the platforms.

The data layer had obvious advantages for the marketing and data science teams. It allows modelling data much closer to actual customer behaviour, without impacting IT.

In marketing, instead of simplistic groups, each individual could be a member of multiple groups, interested in multiple products or services, each with a different propensity to buy. The data layer also allowed us to create models in which users move from one group to another.
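This model can be sketched in a few lines. The profile below is purely illustrative (the field and group names are hypothetical, not Sage’s actual schema): a user carries a propensity score per group, and moving between groups is a first-class operation.

```python
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    """One customer record in the data layer (illustrative schema only)."""
    user_id: str
    # A user can belong to many groups at once, each with its own propensity to buy.
    propensity: dict = field(default_factory=dict)

    def move(self, from_group: str, to_group: str) -> None:
        """Model a user moving from one group to another, carrying the score over."""
        score = self.propensity.pop(from_group, 0.0)
        self.propensity[to_group] = max(score, self.propensity.get(to_group, 0.0))

# Example: a trial prospect converts into a customer.
user = UserProfile("u-123", {"payroll_prospect": 0.7, "accounting_customer": 0.9})
user.move("payroll_prospect", "payroll_customer")
```

The key point is that membership is many-to-many with a score attached, rather than one exclusive segment per customer.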

Separate apps and data

The data layer allowed us to detach information from the applications. It also enabled a creative and agile approach to data collection and processing.

This was done by creating groups of attributes, taxonomies and triggers for data collection and processing, with minimal changes to the core platforms (for example, Salesforce). It meant minimal (or no) reliance on the internal IT team and minimum disruption to the business.
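One way to picture this approach: the taxonomy and triggers live as plain data, so they can evolve without touching the core platforms. All the names below are hypothetical illustrations, not the actual configuration.

```python
# Attribute groups and their allowed values (the taxonomy).
TAXONOMY = {
    "product_interest": ["payroll", "accounting", "hr"],
    "lifecycle_stage": ["visitor", "trial", "customer"],
}

# Triggers: which attribute groups to collect when a given event fires.
TRIGGERS = {
    "pricing_page_view": ["product_interest"],
    "trial_signup": ["product_interest", "lifecycle_stage"],
}

def attributes_for(event: str) -> list:
    """Resolve which attribute groups an event should collect (empty if unknown)."""
    return TRIGGERS.get(event, [])
```

Because collection rules are data rather than platform code, adding a new trigger is a config change, not an IT project.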

How to win adoption? Data vs. analytics vs. reporting

The success of the project depended on adoption.
Different stakeholders required different views of the data, and they also consume data in different ways. We were inspired by the HBR approach of a single source but multiple versions of the truth.

Combining existing BI tools with more specialised applications (Google stack)

Business leaders were looking for daily and weekly strategic performance indicators, like revenue, while optimisation managers required access to near real-time data.

Define KPIs

Key to success was to focus on each audience and understand what would drive impact for that group.
The first step was to map data to expected outcomes. In some cases this took the form of KPIs; in others, a set of triggers and automation.

Things to consider: complexity, depth of information, speed, required training.
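The audience mapping above can be captured in a simple lookup, pairing each stakeholder group with what they see and how fresh it must be. The KPI names and values here are hypothetical examples:

```python
# Hypothetical mapping of stakeholder audiences to the data view they need.
AUDIENCE_VIEWS = {
    "business_leaders": {
        "kpis": ["revenue", "conversion_rate"],
        "cadence": "daily/weekly",
        "latency": "batch",
    },
    "optimisation_managers": {
        "kpis": ["bounce_rate", "cost_per_acquisition"],
        "cadence": "continuous",
        "latency": "near real-time",
    },
}

def view_for(audience: str) -> dict:
    """Look up the data view for one stakeholder group."""
    return AUDIENCE_VIEWS[audience]
```

Writing it down this explicitly forces the conversation about complexity, depth, speed and training per audience, rather than one dashboard for everyone.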

What’s under the hood?

This can be described as an iceberg: what is visible to internal customers versus the data pipelines and technology that deliver the information. An input-output model is quite useful for illustrating high-level data flows.

Create the data pipeline

From epics to stories and sprints. We used a conceptual phase to communicate data flows and data pipelines before implementation work began. The diagram below shows the high-level data pipeline, with no specific technologies attached. It helped us find the best technical solutions.
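The input-output view of such a pipeline can be sketched as plain function composition: each conceptual stage is a function, and the pipeline is just their chaining. The stage names and filtering rules below are illustrative assumptions, not the actual implementation.

```python
def collect(raw_events: list) -> list:
    """Ingest raw events from websites, apps and analytics accounts.

    Here we simply drop events that cannot be attributed to a user.
    """
    return [e for e in raw_events if e.get("user_id")]

def unify(events: list) -> list:
    """Normalise events into the shared data-layer schema."""
    return [{"user_id": e["user_id"], "event": e.get("name", "unknown")}
            for e in events]

def pipeline(raw_events: list) -> list:
    """High-level input -> output flow: collect, then unify."""
    return unify(collect(raw_events))

rows = pipeline([{"user_id": "u-1", "name": "page_view"}, {"name": "orphan"}])
```

Keeping the stages technology-agnostic at this point is exactly what allowed the tech selection to come afterwards.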

Considering the 100-day timeframe and all the requirements, we decided to go with Google Cloud Platform. We became one of the first users of the Data Loss Prevention API, which allowed us to provide real-time compliance at scale.
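To illustrate the idea behind that compliance step: sensitive values are detected and masked before data enters storage. The project used Google Cloud’s managed DLP API for this; the snippet below is only a minimal local, regex-based sketch of the same detect-and-redact pattern, not the DLP API itself.

```python
import re

# Crude email detector, standing in for DLP's managed infoType detectors.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_pii(text: str) -> str:
    """Replace email addresses with a placeholder before the data is stored."""
    return EMAIL.sub("[EMAIL]", text)

clean = redact_pii("Contact jane.doe@example.com for a demo")
```

The managed service generalises this far beyond one regex (names, phone numbers, card numbers, and so on), which is what made real-time compliance feasible at the scale of thousands of sites.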

The ability to scale up immediately, together with the native integrations and connectivity within Google Cloud, was a huge win.

Immediate results

What matters to business users are results that can be seen and felt. To win hearts and minds, we started prototyping dashboards and reports in parallel with the work on the data pipelines. Integrations within GCP helped us accelerate this part of the project.

Reports and dashboards: democratised data and automation

Data Studio connects natively to the rest of the Google stack (BigQuery, Google Analytics 360, etc.). While not as powerful as Tableau or Qlik, it wins where connectivity, agility and speed are key.

Datalab and Colaboratory were two more native Google tools we used for data analysis at scale. Again, both connect natively to the data pipelines and data sources created within GCP.

Lessons for the future

  • Data is bigger than analytics. Get the data right first; analytics will naturally follow.
  • Consider the mental model used for data analytics: a probabilistic approach vs. counting buttons. Some serious stakeholder education may be required.
  • Stick with agile.
  • Completeness kills. Be aware of diminishing returns: once you have 80% of the data right, getting the next 10% may take a year and $$$. Is it worth it?