Why Data Platform?

Lauri Koobas
Bondora Engineering and Data
Oct 28, 2022

This article is about the “why” of data. Data is valuable, but there is too much hype and hot air in that bubble. I’ve pondered the importance of data for a long time, and after some helpful discussions with awesome people, I feel like I have now arrived at some applicable mental models.

This is the vision I propose:

Important decisions are informed by data in a timely manner.

It’s a description of the ideal state of the world, something we will always strive for. I believe the components to be both necessary and sufficient. Let’s start from the end.

“... timely manner”: I believe this is THE most important part here, as it implies why the data platform exists. Taking a step back, the act of analyzing data doesn’t require much engineering, primarily only access and time. Operational databases contain the data, the meaning of which can be found in the heads of developers or in a code repository. Put intelligent people on the problem for long enough, and any sufficiently critical analysis can and will be completed, even if it takes a team and weeks or months.

What if we successfully do such an exercise and decide to expand that capacity across the organization? The obvious answer is to hire more analysts, as more analysts equal more data-informed decisions. It is technically correct, but that’s just half of it. The other half is often missed — each decision still takes about the same time and effort. Nonetheless, the benefits are clear, and the analytics function is scaled up. Precisely this approach led to the creation of the data warehouse in the first place several decades ago — more analysts wrote more analytic queries, which in turn started to disturb the operational databases. Engineering was applied, a data warehouse was created, analysts used it instead of operational databases, data informed more decisions, and companies that did it succeeded more. Evolution.

We are in the same situation right now, albeit using somewhat different terminology on a slightly different scale. A greater need for analytics calls for better data access, tools, etc. The solution is coming in through all possible channels: whichever question or technical setup we might have, a super friendly/aggressive SaaS provider is ready to jump in. It used to be Oracle and Microsoft, and now it’s 400 different startups.

There is a big problem, though: with so many tools and analysts available, why is the impact not proportional to the effort? How can it be that almost everyone still struggles with being data-informed?

The complexity lies within the speed of change itself. An efficient organization can cycle through one major bottleneck per quarter, from sales to customer support, engineering, design, to sales again, etc. In this situation, having an army of analysts won’t help much if every question takes two weeks to answer. Neither will a large set of tools, for that matter — the environment will change too fast to maintain a sizeable interconnected ecosystem. That’s where the data platform comes in — it’s the only way to reduce the time-to-insight. It’s not, of course, enough to rename some piece of the organization as a “data platform.” The necessary components are a proactive approach, writing down and following principles, relevant communication, planning, and more.
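To make the arithmetic behind this concrete, here is a rough back-of-the-envelope sketch in Python. The numbers are assumptions for illustration only: a quarter is taken as 13 weeks, and the function names are made up for this example. It counts how many question, answer, follow-up rounds fit into one bottleneck cycle, which depends on the time-to-answer and not on headcount.

```python
WEEKS_PER_QUARTER = 13  # assumption: one major bottleneck lasts about a quarter


def sequential_rounds(weeks_per_answer: float,
                      window_weeks: float = WEEKS_PER_QUARTER) -> int:
    """Number of question -> answer -> follow-up rounds that fit in the window."""
    return int(window_weeks // weeks_per_answer)


def parallel_answers(analysts: int,
                     weeks_per_answer: float,
                     window_weeks: float = WEEKS_PER_QUARTER) -> int:
    """Total answers delivered if every analyst works on separate questions."""
    return analysts * sequential_rounds(weeks_per_answer, window_weeks)


# Two-week answers: only 6 rounds of learning per quarter, whatever the team size.
print(sequential_rounds(2))      # 6
print(parallel_answers(10, 2))   # 60 answers, but still only 6 rounds deep

# Half-week answers: 26 rounds in the same quarter.
print(sequential_rounds(0.5))    # 26
```

In this toy model, adding analysts multiplies the number of questions answered in parallel, but the number of times a decision can be revisited with fresh data before the bottleneck moves on stays the same. Only a shorter time-to-answer adds rounds.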

What emerges is not isolated sub-specialization but the need for a large part of the organization to be involved. Otherwise, even if there is a super good data platform team with the perfect architecture and tools, they will still be catching up when they get surprised by a new SaaS tool or a product feature. It will take time to gain access, explore, import, understand, and model the data. Same with changing reporting requirements or something as mundane and predictable as a new set of OKRs. These all create delays and frustration unless the information somehow reaches the relevant parties in time.

Finally, returning to the rest of the vision, the part about important decisions. Neither the platform nor the analytics is free, and it doesn’t make sense to overdo it. The way to do more is through specialized tools that streamline one part of analytics or another at a lower cost per (possible) decision. A/B testing frameworks and customer journey analytics tools have a niche if used appropriately.

It’s an overwhelming amount of stuff to understand and get right as part of everyday work. That’s why the data platform team needs to do some planning.

At a team size of around three people, planning becomes possible. Think about roadmaps, frameworks, features, and costs. At this point, whoever goes on vacation can leave the laptop in the office. Most of the work is still making data available and timely, but some time can be spent on actual forward-looking planning.

Three is also a good number because it brings some variety in backgrounds, experiences, and interests, giving a higher chance of covering more of the relevant aspects of decisions. Just to set expectations, though, it would be good if, within the group of three, at least one person doesn’t hate working on the things mentioned above. The capacity to plan is good; taking the time for it is better, and coordinating it with the rest of the organization is best.

The skills layout of the team so far:

  • first hire — great data-related generalist, excellent communication skills
  • second hire — good data-related generalist, good communication skills
  • third hire — good data-related generalist, good communication skills, good planning/process skills

From here on, it gets more specific. As touched upon before, analytics is useful when it’s timely. Timely means different things in different domains, so it follows that the skills required are somewhat more specialized. One worry might be: what if an area gets “solved” and the person with those narrow technical skills is no longer needed? It depends on how specialized a role we are talking about. If it’s a sub-area, like A/B testing or machine learning engineering, then it’s general enough that the reverse is probably true: as teams discover how valuable these things can be when done well, the demand will likely outpace the supply.

On the other hand, overly narrow specialization might be a concern if a single tool needs a single expert user for the foreseeable future. In that case, it might make sense to outsource the skill for some time while gaining basic proficiency within the team. Once again — communication and planning are highly encouraged :)

— —

We are hiring one of the aforementioned specialized positions — Machine Learning Engineer.
