4 Years of Data Science at Schibsted Media Group

Alex Svanevik
Mar 26, 2018

In 2014, I joined a small team at Schibsted Media Group as the 6th Data Scientist in the organisation. Since then, I’ve worked on many data science initiatives in an organisation that now houses 40+ Data Scientists. In this post, I’ll go through some of the things I’ve learned over the last four years — first as Data Scientist and then as Data Science Manager.

This post follows the example of Robert Chang and his excellent “Doing Data Science at Twitter” — an article that I found hugely valuable when I first read it back in 2015. The objective of my own contribution is to provide equally useful reflections for Data Scientists and Data Science Managers around the world.

I’ve divided the post into two sections:

Part I: Data Science in the Real World

Part II: Managing a Data Science Team

Part I is focused on the actual job-to-be-done of a Data Scientist, while Part II discusses how to manage a Data Science team for maximum impact. I’d say both parts are relevant for Scientists and Managers.

I won’t spend a lot of time defining what a Data Scientist (or Data Science) is and isn’t — there are enough articles around the web focusing on this.

A quick intro on Schibsted: Media and marketplaces in 20+ countries around the world. I work mainly with our marketplaces business, where millions of people buy and sell items every day. If you’d like to look at a few actual examples of data science work at Schibsted, here’s a small selection:

With all that said, let’s dive right in!

Part I: Data Science in the Real World

Starting as a Data Scientist in a new company with high ambitions is really exciting, but it can also be intimidating. What do people around me expect of me? What skill level will my peers be at? How should I work in order to be useful for the organization? For a position surrounded by so much hype as Data Scientist, it’s hard not to feel like an impostor at times.

The fear of being perceived as a lightweight often drives the Data Scientist to focus on complexity first. This leads us to the first lesson.

1.1. Complexity is a cost — start simple

They hired a Data Scientist, so surely this problem must be really complex, right?

Don’t be tempted by complexity

This assumption will very often lead you astray as a Data Scientist. First of all, the problems you meet in industry are very often solvable with fairly simple methods. Second, it’s important to remember that complexity is a cost. A complex model will likely entail more work putting it into production, higher risks of mistakes, and more difficulties explaining it to stakeholders. Hence, you should always go for the simplest approach first.

But how do you know if the simplest approach is good enough?

1.2. Always have a baseline

Without a baseline to compare your model’s performance against, your evaluation metrics are likely meaningless. Comparing against random performance is most of the time simply not good enough.

If you don’t draw a line in the sand, you’ll never know what’s really good enough

At one point we built a retention model: a model to predict the probability that a user would come back to our site. Our model had around 15 features based on user behavior, and we achieved a performance of around 0.8 ROC-AUC. Compared to random performance (0.5), we were quite happy with this result. But when we stripped the model down to its two most predictive features, recency (days since last visit) and frequency (days visited in the past), we found that a simple logistic regression on these two variables gave us 0.78 ROC-AUC! In other words, we could achieve >97% of the performance by throwing away >85% of the features.
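To make the comparison concrete, here is a minimal sketch of scoring a model and a simpler baseline on the same held-out labels. It is not the code we used: the function names, the toy labels, and the scores are all made up for illustration, and ROC-AUC is computed directly from its definition (the chance that a random positive outranks a random negative) rather than with a library.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def compare_to_baseline(
    labels: Sequence[int],
    model_scores: Sequence[float],
    baseline_scores: Sequence[float],
) -> Dict[str, float]:
    """Report both AUCs and how much of the model's AUC the baseline already reaches."""
    model_auc = roc_auc(labels, model_scores)
    baseline_auc = roc_auc(labels, baseline_scores)
    return {
        "model_auc": model_auc,
        "baseline_auc": baseline_auc,
        "share_of_model_auc": baseline_auc / model_auc,
    }


# Hypothetical held-out data: 1 = user came back, 0 = user did not.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
model_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.7, 0.1]      # "complex" model
baseline_scores = [0.8, 0.6, 0.3, 0.4, 0.2, 0.5, 0.7, 0.1]   # simple baseline

report = compare_to_baseline(labels, model_scores, baseline_scores)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

The same ratio applied to the numbers above gives 0.78 / 0.80 = 0.975, which is where the ">97%" comes from, and dropping 13 of 15 features is roughly 87% of them.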

So many times I’ve seen Data Scientists report offline experimental results from complex models without any simple baseline to compare with. Whenever you see this, you should always ask: could we have achieved the same result with a much simpler model?

1.3. Use the data you have

One day I had lunch with a Data Engineer and another Data Scientist. The Scientist had stars in his eyes when talking about all the amazing things he could do “if only he had data on X, Y, or Z”. At one point during the conversation, the Engineer burst out: “You Data Scientists are always talking about what you could do with data you don’t have. What about doing something with the data you have?!”

Greener data over there!

It sounded harsh, but the Engineer expressed an important truth. You’ll never have the perfect dataset, and there will always be data you could be using. In most cases, you’ll be able to do something with what you have.

1.4. Own the data

Related to the point above, data quality and completeness are almost always an issue. But instead of sitting there waiting for someone to hand you data on a silver platter, you need to get out and take ownership of the data you need.

ETL-ing through the valley

I’m not talking about formal ownership in the sense of a data governance model. I mean stretching your role and helping out where you can so that you get the data you need.

This could mean contributing to schemas and formats for data collection. It could mean looking at Javascript code executing in the front-end of a web application to make sure events are triggered when they should. Or it could mean building data pipelines — not expecting the Data Engineers to do everything for you.

1.5. Forget the data

Seemingly contradicting everything I said above, it’s very important not to get too caught up in the data at hand.

Tabula rasa

When a new problem shows up, you should initially try to forget the data. Why is that? Because your existing data can limit the solution space, and it can distract you from finding the best approach. You’ll be stuck in a local optimum where you try to shoehorn every problem into the dataset you have available (exploitation over exploration). As a consequence you’ll never have new datasets.

1.6. Develop a nuanced view of causality

We all know correlation doesn’t imply causation. The problem is, many Data Scientists stop there, and shy away from making causal claims altogether.

The coward’s approach to causality

Why is that a problem? Because Product Managers, the marketing team, your CEO, or whoever you’re working with really don’t care about correlation. They care about causation.

The Product Manager wants to feel confident that when she decides to roll out that new feature, she’ll cause a 10% increase in engagement. The marketing team wants to know that increasing emails from 2 per week to 4 won’t cause people to opt out of the mailing list. And the CEO wants to know that investing in better targeting features will cause an increase in ad revenues.

So is there a middle ground? Turns out there are two.

The most well-known is online experimentation. Essentially you run randomized trials, with A/B tests being the most common. The idea is simple: since we’ve randomly selected who’s in the treatment group and who’s in the control group, if we detect a statistically significant difference between the groups, the treatment we applied is assumed to be the causal factor. Without falling into a philosophical rabbit hole, this is a reasonable assumption in practice.
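As a rough illustration of the “statistically significant difference” step, here is a minimal sketch of a two-sided two-proportion z-test using only the standard library. The function name and the conversion counts are hypothetical, and this is just one common way to analyse an A/B test, not a description of our experimentation setup.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conv_control: int, n_control: int, conv_treatment: int, n_treatment: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_control = conv_control / n_control
    p_treatment = conv_treatment / n_treatment
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    if se == 0:
        raise ValueError("standard error is zero; rates are degenerate")
    z = (p_treatment - p_control) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10.0% conversion in control, 11.0% in treatment.
z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these made-up counts the difference clears the usual 5% threshold. In a real experiment you would also want to decide the sample size and the metric before looking at the data.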

The less known approach to making causal claims is causal modeling. The idea here is that you make assumptions on the causal structure of the world, then you use observational (non-experimental) data to either test if those assumptions are consistent with the data, or to estimate the strength of the different causal effects. Adam Kelleher has written a great series on Causal Data Science that I recommend reading. Beyond that, the causality bible is Judea Pearl’s Causality.
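To give a flavour of what “estimating the strength of a causal effect from observational data” can look like, here is a toy sketch of one standard technique: adjusting for a confounder by stratification (often called backdoor adjustment). Everything in it is an assumption for illustration: the scenario (a feature that heavy users adopt more often), the counts, and above all the causal structure, namely that user activity level is the only confounder and that we observe it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Stratum:
    """Counts for one level of the assumed confounder (all numbers are made up)."""
    treated_n: int
    treated_retained: int
    control_n: int
    control_retained: int


def naive_effect(strata: Dict[str, Stratum]) -> float:
    """Difference in retention rates, ignoring the confounder entirely."""
    t_n = sum(s.treated_n for s in strata.values())
    t_y = sum(s.treated_retained for s in strata.values())
    c_n = sum(s.control_n for s in strata.values())
    c_y = sum(s.control_retained for s in strata.values())
    return t_y / t_n - c_y / c_n


def adjusted_effect(strata: Dict[str, Stratum]) -> float:
    """Average the within-stratum effects, weighted by each stratum's share of users."""
    total = sum(s.treated_n + s.control_n for s in strata.values())
    effect = 0.0
    for s in strata.values():
        weight = (s.treated_n + s.control_n) / total
        within = s.treated_retained / s.treated_n - s.control_retained / s.control_n
        effect += weight * within
    return effect


# Hypothetical observational data: heavy users adopt the feature far more often.
users = {
    "heavy": Stratum(treated_n=800, treated_retained=640, control_n=200, control_retained=150),
    "light": Stratum(treated_n=200, treated_retained=60, control_n=800, control_retained=200),
}

print(f"naive difference:    {naive_effect(users):+.2f}")
print(f"adjusted difference: {adjusted_effect(users):+.2f}")
```

In this toy data the naive comparison suggests a 35-point lift in retention, while the adjusted estimate is 5 points: most of the apparent effect comes from who uses the feature, not from the feature itself. The adjusted number is only as trustworthy as the assumed causal structure.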

In my experience, most Data Scientists have extensive experience with building machine learning models and evaluating them offline. Far fewer Data Scientists have experience with online evaluation and experimentation. The explanation is simple: you can download a dataset from Kaggle, train a model, and evaluate it offline in a matter of minutes. Evaluating that model online, on the other hand, requires access to the real world. Even if you work at an internet company with millions of users, you often have to jump through many hoops to get a machine-learning model out in front of users.

Now, if few Data Scientists have extensive experience with online evaluation, very few Data Scientists have experience with causal modeling. I think there are many good reasons for why this is the case. One reason is that most causality literature is quite theoretical, with few practical guides on how to get started with causal modeling in the real world. I predict we’ll see more practical guides on causal modeling in the next few years.

Developing a nuanced view of causality means you can give actionable advice to your stakeholders while maintaining your scientific integrity as a Data Scientist.

Part II: Managing a Data Science Team

Schibsted, like many other companies, has two career tracks: “Individual Contributor” and “People Manager”. In a Data Science context, the former is for those who really want to double down on their Data Science expertise and contribute to the company through hands-on work and technical leadership. The manager track is for those who are more passionate about developing people and leading teams.

I was really unsure which track was right for me, but I ultimately decided to give the Manager track a go. It didn’t take long to realize that this was indeed the right track for me, but I certainly faced a lot of challenges (and I still do!).

The first challenge you’ll meet is that there are very few other Data Science Managers around the world. If you thought experienced Data Scientists were rare, the number of experienced Data Science Managers is a fraction of that. So you’re more or less on your own.

But is managing a Data Science team really so different from managing other types of teams? Yes and no.

If you’ve never managed a team before, surely you’ll benefit from reading management classics like High Output Management by Andrew Grove. In addition, proactively asking more senior managers (from other disciplines) for advice is also crucial.

However, Data Science teams are inherently different in a few key ways, so we’ll focus on lessons specifically related to Data Science teams here.

2.1. A Data Science team is not really a team

When most people think of teams, they think of something like this:

Més que un club

What are some of the characteristics of a football team like FC Barcelona? At least three things:

  1. A common objective
  2. Different roles within the team, each with different responsibilities
  3. Autonomy in reaching their objective

If you manage a team consisting only of Data Scientists, most likely none of those characteristics apply. Your team will instead have:

  1. Multiple, changing objectives
  2. Specialists, and they’re good at the same thing: data science
  3. Other teams to work with to ultimately have impact on users and revenues

A more suitable team analogy than a football team for a data science team is this:

X-files

The demand for Mulder and Scully’s services is variable over time. They’re brought in when their expertise is needed. And they’ll never solve a case without talking to people outside of the FBI.

Why is this distinction important?

Because if you have a team of Data Scientists, and you manage them as a “classical” team with a common objective, distinct roles, and full autonomy, you will very quickly end up with a frustrated team.

I’ve seen teams of Data Scientists be run as any other Product or Engineering team, and the inevitable consequence is this: the Data Scientists end up doing everything but data science. Instead they end up doing engineering, devops, or product management.

So Data Scientists are different. But how do you then ensure you don’t end up with Data Scientists in an ivory tower?

2.2. Embed Data Scientists in other teams

Magic happens when you put Data Scientists together with Product Managers, Engineers, UX-ers, Marketing, and others.

Basically the objective function you want to maximise is this: fruitful interactions between Data Scientists in your team and people in other teams.

I like to think of this using the concept of a wide channel. Let’s illustrate this using a Product Manager as the counterpart of a Data Scientist.

The worst case is when there’s no channel at all between the Data Scientist and the Product Manager:

No channel between Data Scientist and Product Manager

This means no communication flows between the DS and the PM. In other words, the DS won’t know about any of the product challenges that the PM faces, making it impossible for her to analyze or solve those challenges.

The slightly better scenario is when we have a thin channel between the two:

Thin channel between Data Scientist and Product Manager

In this case, information flows, but it’s typically limited and often asynchronous. Either information goes through other people (e.g. a manager), or through request forms, etc. This type of communication is common when Data Scientists are expected to serve many different stakeholders. But it can be frustrating, because the business context is often not present, and it can lead to misunderstandings and meaningless bouncing back-and-forth.

The most productive setup is when we have a wide channel present:

Wide channel between Data Scientist and Product Manager

In the most literal sense of a wide channel, the Data Scientist sits right next to the Product Manager. This naturally enables them to communicate much more effectively. Having people physically collocated isn’t always convenient or even possible (we’re spread across 22 different countries in Schibsted!), but there are virtual versions of this principle, from Slack to remote pair programming to Hangouts.

Naturally you can’t have every Product Manager in the organization have a wide channel with every Data Scientist in your team — that doesn’t scale. It’s your job as a Data Science Manager to identify which wide channels to establish when. And then get out of the way!

One example from Schibsted where we actively worked on establishing a wide channel was in the development of our Car Valuation Tool, which helps you set a price when selling your car (test it on our Norwegian marketplace Finn). Originally we had a fairly thin channel, of the sort: try to build the most accurate pricing model you can. We found this to be quite inefficient, since there were many product decisions that we couldn’t really answer without experimenting on users early on.

After some time though, we ended up embedding one of our Data Scientists in the product team, with much better results. You can read about some of our early work on the Car Valuation Tool in this blog post.

An example where we had a wide channel from the very start is this predictive model for digital news subscriptions. The model helped increase sales conversion by 540%, and was awarded an INMA “Best use of Data Analytics” prize in 2017.

2.3. Take ownership of analytical productivity

In “High Output Management”, Andy Grove states that as manager, you own your team’s output. This means that a Data Science Manager has to invest in creating the best possible environment for her Data Scientists to be productive.

The beauty of productivity

This, in many ways, is the counterforce to the embedded model described above. If everyone is embedded all the time, there’s a good chance you end up with data silos and sub-optimal infrastructure implemented multiple times.

Some Engineering Managers claim that when you become Manager, you should stop coding altogether. As Data Science Manager, I think you should spend up to 10% of your time doing hands-on work yourself: training models, visualizing data, etc. This puts you back in the shoes of a Data Scientist.

“I have to spend 15 minutes waiting for this cluster to boot every time I want to do an ad-hoc analysis?! Surely there must be a faster way to do this.”

“This documentation on our schema formats seems to be outdated — how do I measure clicks on this type of button across different sites?”

And so on and so forth.

Of course, this type of hands-on work should not replace proactively getting feedback from your team. But it certainly helps you discover key areas where life can be made easier for your Data Scientists.

You can also be more methodical and use frameworks like Lean Management to aim to eliminate waste in various Data Science processes. This post by the always brilliant XKCD is a starting point:

Is it worth the time?
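The arithmetic behind that chart is simple enough to sketch. The function below is a hypothetical helper, not something from our toolbox: it estimates how many hours you could spend on a fix before it costs more than it saves, and it defaults to a five-year horizon, as in the comic. The usage numbers are assumptions; only the 15-minute cluster boot comes from the example above.

```python
def break_even_hours(
    minutes_saved_per_run: float,
    runs_per_week: float,
    horizon_years: float = 5.0,
) -> float:
    """Hours of improvement work that pay for themselves over the horizon."""
    weeks = horizon_years * 52
    return minutes_saved_per_run * runs_per_week * weeks / 60


# Assumption: one Data Scientist boots the cluster 10 times a week,
# and the fix removes the whole 15-minute wait.
budget = break_even_hours(minutes_saved_per_run=15, runs_per_week=10)
print(f"Worth up to {budget:.0f} hours of work per Data Scientist")
```

Under those assumptions the fix pays off even if it takes hundreds of hours, and the budget scales with the number of people hitting the same wait.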

Just remember that there needs to be quite a lot of flexibility and room for exploration in the work of a Data Scientist. You’re not running a factory!

2.4. Data -> Power -> Politics

It’s important to be aware of the “political” context you operate in as a Data Science Manager — especially in a large and complex organization. Running a Data Science team means you manage scarce and highly demanded resources. This, in turn, means that you will necessarily have to deal with politics occasionally.

Game of Thrones

Some hypothetical examples:

  • A VP is intending to propose a new strategic initiative. She has a 98% completed slide deck, but wants your team’s help to support her proposal with data (… after the conclusion has already been made).
  • A business unit refuses to share data with your team in fear that you’ll discover something in the data that they’re not aware of.
  • A department insists they need Data Scientist support, but when you dig deeper, there’s no real need there, beyond a motivation to increase headcount.
  • Another team with a somewhat overlapping area of responsibility is reluctant to share its methodology, for fear that you’ll steal their work.

The amount of time you’ll have to spend on situations like these depends largely on the culture of your company, and what incentives exist for people to behave as they do. But it’s always good to be aware that these things occur.

My own naïve belief is that transparency is the strongest medicine. In practice this means leading by example. All meeting notes are open to everyone in the company. All Slack channels are public. All team (and individual!) goals are open for anyone in the company to inspect.

Transparency on its own is not enough though. You have to actively work on building trust with your stakeholders. It takes a long time to build trust, but it can be broken extremely fast!

Now, to what extent should you expose people in your team to politics? I’d say only as much as what is absolutely needed for them to understand the context of their work. That doesn’t mean leaving your people in the dark, but it does mean letting them focus on doing great Data Science.

Don’t let politics get too much of your attention. But keep in mind that when you have access to data, and resources to get value from it, you immediately have power. And politics will always surround power.

2.5. Leverage your resources, aim for high ROI

So many companies are hiring Data Scientists these days. In many cases, these companies basically have no idea what they’ll use these Data Scientists for. But surely they’ll be able to produce some kind of magic, right?

If you buy a Ferrari, don’t just leave it in the garage.

Great use of that engine

Also, don’t just use it for buying groceries.

You’d likely be better off with a Skoda

Use your Ferrari for what it was built for.

A Ferrari in its natural habitat

Data Scientists are ambitious, intelligent, business-minded people. This means you have to make sure they’re working on problems that are not just challenging, but that have high return-on-investment (ROI).

The Data Science Manager plays a key role here. You have to consistently match the right set of business challenges with the people in your team who help solve them.

Going back to our very first point, it’s tempting to focus first on the challenges that involve the most complexity. In my experience, you should primarily be thinking of value when considering where to invest resources — that is, where to leverage the people in your team. As mentioned before, complexity is a cost, and you should naturally consider that too.

At the same time, Data Scientists are attracted to hard problems. So there is a balance to be mindful of. But the impact someone can have on value for the business is, in my experience, a far more motivating ambition than complexity alone.

2.6. OKRs for focus and alignment

It’s just as important to have a good toolbox when you’re a manager as it is when you’re a Data Scientist. And the most powerful tool in my manager toolbox is Objectives and Key Results (OKRs). In brief, OKRs are about setting a handful of ambitious, qualitative objectives, and associating quantitative key results with those objectives. Typically you do this on a quarterly basis. There’s a lot more to OKRs than that, but that’s the essence of it.

OKRs are very powerful, because in a simple way they ensure everyone knows exactly the direction we’re going in, and what we’re trying to accomplish.

They’re also fascinating from a management perspective, because the methodology of OKRs is easy to learn, but surprisingly hard to master. Normally it takes a few quarters before you really get it right: how to set them, follow up, and review.

There are two things I’ve found especially useful as a manager when it comes to OKRs.

First: encourage everyone in your team to create personal OKRs. Your personal OKRs should capture the totality of what you as an individual want to accomplish this quarter. When I say “totality” I mean both your personal growth objectives and your contributions to the organization and team around you. I cannot express how important it is to keep these two things in the same place. It’s such a basic thing, but it’s really what helps you align your personal goals with the company goals.

Want to learn more about LSTMs? Great, let’s have you contribute to that NLP project where we know LSTMs will be used. Keen on improving your presentation skills? You can work on this retention analysis project with marketing. Curious about the manager track as a career path? Try leading this squad working on segmenting users for monetisation.

With personal objectives and company objectives aligned, all your team members will have a one-pager of OKRs that they can literally print out and hang up next to their monitor.

Ideally, all personal OKRs are visible to everyone in the company. This creates a culture where people focus on growth and help each other reach their goals.

Second: help your team members integrate OKRs in their daily and weekly routines. I started using a simple spreadsheet that my team members ended up adapting for their own use. It’s not beautiful, but it works:

“Great” UX

Every Friday before heading home, we spend 10 mins filling this week’s column. What you write isn’t that important — the value comes from the ritual itself. This helps remind you of your top priorities this quarter. Personal OKRs are also invaluable in 1:1s with team members.

There’s no single optimal way that works for everyone when it comes to following up OKRs — the key point is to help your team members find a way to naturally build them into daily and weekly routines.

2.7. Psychological safety comes first

I’ve saved the single most important point for the end.

When Google studied their teams over two years in order to find out what makes some teams perform well and others under-perform, there was one thing that stood out. That thing was psychological safety.

In brief, psychological safety can be summarized as the belief that you won’t be punished when you make a mistake.

Now, reflect on this in the context of the introduction to Part I. The impostor syndrome is very real in data science. And what’s the one thing you fear when you feel like an impostor? Making mistakes.

Over the years, I’ve found that people from many different backgrounds enter the field of data science. In our team at Schibsted, we are fortunate to have fantastic people with a very wide range of experience. People with backgrounds from finance, research, education, consulting, software engineering, and more.

It would be silly to assume that these people all know the same things. On the contrary, the value of having such a variety of experiences is that everyone brings something new to the team.

The notion of the Data Scientist unicorn is poison for psychological safety.

Is there a quick fix to increase psychological safety? I don’t think so. But I do think that it needs to be top of your priority list as a manager — especially when you’re building up a new team, or when you have new members joining. Although there is no quick fix, there are clear actions you can take to increase psychological safety. Here are some that have worked well for us:

  • Create a feedback culture. Make it clear that your team members owe each other “plus and delta” after presentations, sprints, etc. By the way, that includes you as manager! And train people on how to properly give constructive feedback, since this doesn’t come naturally to everyone.
  • Increase time spent face-to-face. Pair programming, problem-solving on the whiteboard… This is especially important for remote teams. That flight ticket is almost certain to be worth it.
  • Create pairs or squads instead of solo work. You might end up doing fewer things as a team, but you’ll do those things better. And those working together will build trust with each other.
  • Encourage open and honest discussions in plenary. Work proactively to balance the airtime of all participants — some people might need to be asked to speak up.
  • Be mindful of cultural differences. You might come from an egalitarian, explicit, and direct culture. There’s a good chance you’ll then miss out on signals from a team member coming from a hierarchical, implicit, and indirect culture.
  • Do team experiments for continuous improvement. Involving the whole team itself in the problem of “how do you successfully run a team” gives everyone a sense of ownership for the well-being of the team.
  • Measure happiness and psychological safety. Find some simple way to regularly ask questions related to happiness and psychological safety. If you don’t have a fancy HR system for this, just start small with a Typeform and iterate until you and the team find that it’s useful. Share the (anonymised) average scores or findings with the team, and involve them in how to improve things.

Congrats, you’ve made it to the end! Hopefully this post has been somewhat useful for you as a Data Scientist or Data Science Manager.

We’ve gone through quite a lot, so here’s a re-cap of all the points:

Part I: Data Science in the Real World

1.1. Complexity is a cost — start simple

1.2. Always have a baseline

1.3. Use the data you have

1.4. Own the data

1.5. Forget the data

1.6. Develop a nuanced view of causality

Part II: Managing a Data Science Team

2.1. A Data Science team is not really a team

2.2. Embed Data Scientists in other teams

2.3. Take ownership of analytical productivity

2.4. Data -> Power -> Politics

2.5. Leverage your resources, aim for high ROI

2.6. OKRs for focus and alignment

2.7. Psychological safety comes first

Thanks for reading! If this was useful, please consider sharing the post with others. Hope to see your own thoughts on working as a Data Scientist or Data Science Manager some day 🙌

Thanks to Ana Duje for illustrations.

Image credits: Complexity, Baseline, Ferrari, Productivity, Politics
