Scout24 Data Landscape Manifesto

Presenting the manifesto

To celebrate the two-year anniversary of the Scout24 Data Landscape Manifesto, we would like to formally present it to the world in this blog post.

Screenshot from our internal wiki showing the original date. (Thanks Arif for that first version!)

Over the last two years we’ve used the manifesto to drive Scout24’s internal effort to empower all employees to assume responsibility for their data and to become more data-driven. We’ve discussed it far and wide, including in conference presentations in Germany, Norway, Spain, the UK, the USA and Australia, but we’ve never put pen to paper and made it publicly available in written form.

The story behind the manifesto

As a tech team, Scout24 Data Engineering started where most tech teams start: with the tech. We built a modern cloud-based data lake that would scale indefinitely along with the company.

However, we initially failed to realise that our goal was for the company to become truly data-driven, not merely to have a modern cloud-based data lake. In fact, the technology is the easier part. The hard part is changing the company culture to go along with the technological change.

Becoming data-driven requires technical, organisational, and, most importantly, cultural change.

From the disappointment of that realisation, and the soul-searching that followed, came a vision, not of the technical architecture, but of how the company should view data and how the data team could enable this change of values.

Consequently, together with the rest of the Data & Analytics team, we wrote the Scout24 Data Landscape Manifesto to articulate this vision. It represents the team’s strong opinion of the roles, responsibilities and values necessary for building a data-driven company at scale.

The manifesto, which follows, consists of seven principles, each preceded by an introductory statement.

The Scout24 Data Landscape Manifesto

Roles, responsibilities and values for a data-driven company at scale.

Principle #1 (Preamble)

We believe that collecting and analysing data is crucial to understanding our business, our customers and the market in order to provide the right services and products.

#1 Data is a key asset of our company.

Principle #2

We therefore believe that everyone in the company must have easy access to the available data, and that it must be easy to publish data that others can use. This requires a solid Data Platform: easy-to-use tools, reliable infrastructure, and simple guidelines for publishing and consuming data in a secure and privacy-aware way.

#2 We, Data & Analytics, are responsible for providing the Data Platform and we provide support and training for it.

Principle #3

We believe that exhaustive centralised data management does not allow us to scale to the level of data creation and consumption we aspire to as a company, because it creates a bottleneck and introduces accidental, indirect dependencies. Instead, we believe that data autonomy is the only way for data usage to scale across the company. However, for data autonomy not to become data anarchy, there has to be a clear set of basic rules and responsibilities.

#3 Data autonomy puts data producers and data consumers in control of their data and of their metrics and thereby allows us to be data-driven at scale, but this comes with responsibility.

Principle #4

We believe that extensive data availability, data discoverability, and data usability are crucial and that, at scale, no one can ensure this other than whoever controls the source where the data is originally generated.

#4 Data producers are responsible for publishing data to the central Data Lake, for the data’s quality, and for publishing metadata that makes it easy to find and consume the data.
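As a rough illustration of what "publishing metadata" could involve, the sketch below models a metadata record that a producer might attach to a dataset, plus a check for gaps that would make the data hard to find or consume. The field names, the example dataset and the storage path are all hypothetical; the manifesto does not prescribe a particular format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata a producer publishes alongside a dataset."""
    name: str
    owner_team: str          # who to ask about the data and its quality
    description: str         # what the data means, in plain language
    location: str            # where the data lives in the data lake
    schema: Dict[str, str] = field(default_factory=dict)  # column -> type


def missing_metadata(meta: DatasetMetadata) -> List[str]:
    """Return the names of fields that are empty and would hinder consumers."""
    problems = []
    for key in ("name", "owner_team", "description", "location"):
        if not getattr(meta, key).strip():
            problems.append(key)
    if not meta.schema:
        problems.append("schema")
    return problems


# Example with made-up values: a complete record has nothing missing.
listing_views = DatasetMetadata(
    name="listing_views",
    owner_team="search-team",
    description="One row per listing view event",
    location="s3://example-data-lake/listing_views/",
    schema={"listing_id": "string", "viewed_at": "timestamp"},
)
print(missing_metadata(listing_views))
```

A check like this could run at publishing time, so that the responsibility for discoverable data stays with the producing team rather than with a central gatekeeper.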

Principle #5

We believe that the stakeholder of a metric has to be the single owner of that metric and its definition and has to drive its implementation. Without a single source of truth about what a metric means, we risk that multiple diverging and possibly contradictory understandings and implementations develop over time.

#5 Data consumers are responsible for the definition and visualisation of metrics and for driving the implementation and maintenance of these metrics.
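One way to picture the "single owner, single definition" idea is a small registry that refuses to let a second team redefine an existing metric. This is only a sketch under assumed names (`MetricDefinition`, `MetricRegistry`, the example metric and teams are all hypothetical), not a description of how Scout24 implements it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    owner: str        # the stakeholder who owns the metric
    definition: str   # the agreed meaning of the metric


class MetricRegistry:
    """Hypothetical single source of truth for metric definitions."""

    def __init__(self) -> None:
        self._metrics: Dict[str, MetricDefinition] = {}

    def register(self, metric: MetricDefinition) -> None:
        existing = self._metrics.get(metric.name)
        # Only the current owner may change an existing definition.
        if existing is not None and existing.owner != metric.owner:
            raise ValueError(
                f"metric {metric.name!r} is owned by {existing.owner!r}"
            )
        self._metrics[metric.name] = metric

    def lookup(self, name: str) -> MetricDefinition:
        return self._metrics[name]


registry = MetricRegistry()
registry.register(
    MetricDefinition("active_listings", "marketplace-team",
                     "Listings visible in search at midnight UTC")
)
```

With this shape, a diverging definition from another team surfaces as an explicit conflict instead of silently becoming a second implementation.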

Principle #6

We believe that a minimum level of company-wide comparability and reliability of core KPIs is crucial for leading the company in the right direction and can only be achieved with coherent core data. Our executive leadership team is the owner of these core KPIs and the data group represents the executive leadership team in terms of metric ownership.

#6 We, Data & Analytics, take full ownership of and responsibility for the company-wide core KPIs and for making accessible the coherent core data entities from which they are derived.

Principle #7

We believe that transparency is crucial for understanding what a metric means. If month-to-month comparability must never break, there is no way to continuously improve metrics and their transparency based on new insights.

#7 We value data transparency over data continuity, which means we may break metric comparability if doing so enables better insights.
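To make the trade-off concrete, the sketch below keeps a dated history of a metric's definitions, so that a break in comparability is recorded openly rather than hidden. The class, the functions and the example dates are hypothetical and only illustrate the principle.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class MetricVersion:
    effective_from: date
    definition: str
    change_note: str   # why the definition changed


def version_on(history: List[MetricVersion], day: date) -> MetricVersion:
    """Return the definition in force on `day`."""
    candidates = [v for v in history if v.effective_from <= day]
    if not candidates:
        raise ValueError(f"no definition in force on {day}")
    return max(candidates, key=lambda v: v.effective_from)


def comparable(history: List[MetricVersion], a: date, b: date) -> bool:
    """Two dates are comparable only if the same definition applied to both."""
    return version_on(history, a) == version_on(history, b)


# Made-up history: the definition was improved in June.
history = [
    MetricVersion(date(2018, 1, 1), "All page views", "Initial definition"),
    MetricVersion(date(2018, 6, 1), "Page views excluding bots",
                  "Bot traffic was inflating the numbers"),
]
```

Here the values from May and July are not directly comparable, and the change note tells a reader why, which is the kind of transparency the principle favours over an unbroken series.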

The Ultimate Goal

Ultimately, we believe that together these principles will lay the foundation for an inclusive and highly scalable data landscape.

We wish to build a federal landscape of data producers and consumers with just enough rules to ensure seamless co-operation without severely impeding autonomy.

Two years in: is it a success or failure?

It is now two years since we introduced the manifesto at Scout24 and began using it to guide the development of our data platform. So far the reception has been mixed. Some teams welcomed the autonomy it provides, while other teams objected to the new responsibility.

Switching from being a passive to an active participant in the data landscape of a company is a difficult cultural change. It requires new skills and knowledge, but mostly it requires a shift in mindset: no longer will data and insights be delivered in a nice, clean package. Instead, each employee is expected to generate their own insights based on basic data literacy. Although this sounds like a burden, we believe it is liberating. Who knows more about your business problems than you? You are in the best position to use data to answer the questions that you want answered.

We still believe focusing our efforts on building a self-service data platform instead of data preparation and analysis leaves our internal users in control and empowered to make fast data-driven decisions. However, so far this argument has not been completely convincing, especially because some teams lack the technical background or engineering resources to properly assume responsibility for their data. In these situations we take a practical stance and assume surrogate responsibility until the team has developed the skills internally. In addition, we are developing training materials and improving the usability of our tools to lower the bar of technical knowledge.

Although it has not been universally accepted, the manifesto successfully set the tone for our evolution into a truly data-driven company. The data processes we’ve built not only show that we are able to scale our data analytics activities but also set the foundation for our next journey towards exploiting AI and ML across all our product offerings.

Acknowledgements

Sincere thanks to the Scout24 Data & Analytics team members involved in the original writing of the manifesto. Although it was a group effort, Arif Wider from ThoughtWorks and Sebastian Herold, now at Zalando Tech, deserve special thanks for their initial contributions.

Thanks also to all those at Scout24 who pushed the organisation to understand the benefits of distributed data responsibility. We know it wasn’t always easy, but a scalable data culture will continue to bear fruit for many years to come.