Four building blocks for scaling insights, Part 3: Best development practices

Bjørnar Skåre
Published in Oda Product & Tech
Oct 22, 2020

Identifying and following the best development practices is key to maintaining structure in a scale-up. Here is how we have implemented this, aiming to balance structure and productivity. This is the third part in our blog series “Four building blocks for scaling insights”. You might want to read the first two parts, “The embedded model” and “The evolution of our insight infrastructure”, before you continue.

The Data Platform team

Having data analysts and data scientists in an embedded model is beneficial in the sense that they are closer to our stakeholders and the problem solving. It does, however, pose a risk that the different teams perform similar tasks differently within our tech stack. Decisions on how to handle historical data, how to perform data testing, and where to handle business logic are issues that every team faces from a data perspective. Core data, such as order and user information, is also consumed by the different teams, which requires both coordination and collaboration.

Our goal is to have embedded teams that are able to solve problems end-to-end — but does that mean that we need to accept duplicate work, multiple sources of truth, and varying data quality in the different teams?

To answer this question we need to look to our central Data Platform team, which Nina introduced in the first part of the blog series. Earlier, the responsibility of the Data Platform team was to support all teams in the organisation that didn’t have embedded team members. As Kolonial.no grew, more teams required embedded analysts and scientists. The Data Platform team became less involved in problem-solving processes, and more involved in training, support and coordination of the embedded team members.

The shift of responsibility led to a need for new skill sets in the Data Platform team, as shown in the figure below. Amongst the new roles required was an analytics engineer, who turned out to be me. The analytics engineer role is one of the most recent data roles, having emerged only in the last two or so years (according to the Data Council blog). Compared with our more traditional data roles, the analytics engineer works primarily as a link between the data engineers and the data analysts. As an analytics engineer, my main responsibility is to make sure we utilise our tools and technologies in the best way possible. This involves deciding on best practices in cooperation with other team members, and providing the guidelines and tools necessary so that everyone can easily follow them.

A Data Platform team is necessary to coordinate and facilitate collaboration between the embedded teams. Special competencies such as machine learning engineers (ME), analytics engineers (AE) and data engineers (DE) are also part of the team.

Reduce friction

The Data Platform team consists of several different roles with more specialised competencies. We have stated that our mission is to empower data analysts and data scientists to efficiently create value from data.

To achieve this, we believe that we need to reduce friction as much as possible and allow the embedded teams to focus on the business logic. Many processes can be time-consuming, such as:

  • Acquiring and maintaining developer and test environments
  • Writing documentation and tests
  • Handling deployment, monitoring, and maintenance

To make these processes time-efficient, we choose technologies and tools that are suitable for our use cases. As Anders mentioned in our previous blog post, one does not want to invest too heavily in technologies. The Data Platform team is responsible for knowing the variety of challenges we face, and for ensuring that we have the appropriate tools and technologies to meet them.

Whenever a data analyst or a data scientist is implementing solutions, their journey through our tech stack could be described as a game of snakes and ladders. Our Data Platform team is responsible for adding as many ladders and removing as many snakes as possible. A ladder could be that all tests include a set of instructions on how to debug them. A snake, on the other hand, could be that you have spent a lot of time on a feature and then discover that it already exists.

The journey of an embedded team member in our tech stack should be described as a game of snakes and ladders, with lots of ladders and few snakes.

The following sections describe our key principles at Kolonial.no to reduce friction through our development practices.

1. Common guidelines

Common guidelines are essential when synchronising how the embedded teams are working in our tech stack. The guidelines can be reviewed and revised by embedded team members, but should be owned by the Data Platform team.

Guidelines such as naming standards and project structures are important, but it is also key to acknowledge the extra complexity of having an embedded model. Transformation of data might not be an everyday task for a data analyst. In fact, a significant part of a data analyst’s or a data scientist’s work is to understand the domain from a stakeholder point of view. This means that it is wise to provide process-related guidelines rather than a dictionary of how we perform specific operations.

However, when working in a modern tech stack, new features and knowledge will often become available that simplify or improve our best practices. We should always challenge our current best practices the same way we challenge our current technologies.

2. Support

Great guidelines will not be enough to ensure best practice. Every once in a while (or perhaps more often than that) we encounter problems where our guidelines don’t suffice. Even though our goal is for the embedded teams to be able to solve their problems end-to-end, it is important that they have someone to turn to when they encounter edge cases or complex problems.

The Data Platform team should not only provide guidelines and infrastructure, but also have the capacity and flexibility to support embedded teams when needed. For the embedded teams to work both efficiently and according to best practices, the Data Platform team must have sufficient room to support them.

3. Automated testing

Setting best practices through guidelines and support provides the foundation. But in a hectic and flexible environment, we can’t require the team to stay up to date with all best practices. Efficient end-to-end problem solving requires continuous integration and deployment. If our embedded team members are to be truly self-served, they also need to be able to deploy changes themselves. With many contributors to our source code, automatic testing is key to ensuring that changes won’t cause unintended behaviour.

Tests are necessary, but can also be time-consuming and cause a lot of friction if not properly implemented. At Kolonial.no we implement tests using some key principles:

  • Trust: being able to trust your tests is important, which means avoiding both false positives and undetected errors
  • Low maintenance: design tests in ways that require little maintenance
  • Efficiency: limit your test data in a smart manner whilst keeping the tests valuable
  • User-friendly: design tests that are easy to debug and fix

The Data Platform team works on designing tests that could be applied across all embedded teams, while the teams themselves should test their own business logic.
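
To make the split between shared tests and team-owned tests more concrete, here is a minimal sketch in Python. It is an illustration only and not our actual tooling: the names `CheckResult`, `check_not_null`, `check_unique` and `check_non_negative_total`, the `orders` rows, and the "total must not be negative" rule are all hypothetical, and rows are assumed to be plain dictionaries. The first two checks are generic and could be reused by any team, the last one stands in for a team's own business logic, and every failure carries instructions on how to debug it.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass
class CheckResult:
    """Outcome of one data test, including a hint for whoever has to fix it."""
    name: str
    failing_rows: List[Row]
    how_to_debug: str

    @property
    def passed(self) -> bool:
        return not self.failing_rows

    def report(self) -> str:
        if self.passed:
            return f"PASS {self.name}"
        return (
            f"FAIL {self.name}: {len(self.failing_rows)} failing row(s), "
            f"first: {self.failing_rows[0]}\n"
            f"  How to debug: {self.how_to_debug}"
        )


def check_not_null(table: str, rows: List[Row], column: str) -> CheckResult:
    """Generic test: every row must have a value in `column`."""
    failing = [row for row in rows if row.get(column) is None]
    return CheckResult(
        name=f"not_null({table}.{column})",
        failing_rows=failing,
        how_to_debug=f"Find rows in '{table}' where '{column}' is missing "
                     f"and check the upstream source for those records.",
    )


def check_unique(table: str, rows: List[Row], column: str) -> CheckResult:
    """Generic test: no value in `column` may appear more than once."""
    seen = set()
    failing = []
    for row in rows:
        value = row.get(column)
        if value in seen:
            failing.append(row)  # report the repeated occurrence
        seen.add(value)
    return CheckResult(
        name=f"unique({table}.{column})",
        failing_rows=failing,
        how_to_debug=f"Group '{table}' by '{column}' and inspect values that "
                     f"occur more than once, often caused by a join fan-out.",
    )


def check_non_negative_total(rows: List[Row]) -> CheckResult:
    """Team-owned business-logic test (hypothetical rule for illustration)."""
    failing = [row for row in rows if row["total"] < 0]
    return CheckResult(
        name="orders.total >= 0",
        failing_rows=failing,
        how_to_debug="Check whether refunds are being loaded as orders.",
    )


# Hypothetical sample data with one problem per check.
orders = [
    {"order_id": 1, "user_id": 10, "total": 250.0},
    {"order_id": 2, "user_id": None, "total": 99.0},
    {"order_id": 2, "user_id": 11, "total": -5.0},
]

results = [
    check_not_null("orders", orders, "user_id"),
    check_unique("orders", orders, "order_id"),
    check_non_negative_total(orders),
]

for result in results:
    print(result.report())
```

Keeping the debugging hint next to the test definition means the person who hits a failure does not need to find the test's author, which is one way a test can act as a ladder rather than a snake.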

4. Peer reviewing

Even though automatic testing identifies a lot of the errors, one will never achieve 100% test coverage. So, as a final step of our quality assurance, we strive to peer review all changes. Ideally, if we have a great set of guidelines and tools, the code should be structured, documented, and tested to our standards by the time a pull request is made. Peer reviewing is an extra safety net to ensure that the guidelines have been followed.

In addition, it is always a good idea to review the business logic that has been added or changed. This could inspire great discussions that increase the quality of both the data and the end result. It is also a great way of getting feedback on your work and sharing knowledge across the team.

Coming up: Training & Support

Securing best development practices allows us to provide more structured and useful data to our decision makers. However, providing data doesn’t necessarily mean that the decision makers are able to draw insight from it and make good decisions. In our final post, we will describe how we work with training and support to ensure that Kolonial.no is successfully data-driven.

We are hiring!

Please check our open positions if you would like to join an awesome team on a mission to create freedom and flow in the everyday lives of our customers. Read more about what we are doing here and feel free to reach out if you have any questions!
