Working with Mittmedia’s data in 2019 — Creating business and user value through innovation

Magnus Engström
Jan 17 · 7 min read

Mittmedia’s head of data looks ahead at upcoming undertakings from a data perspective.

A short recap of 2018, and five things in the road map going forward: churn prediction, content classification, geodata, personalization and new content APIs.

2018 was a hectic time for anyone working with data at Mittmedia. During the year we implemented personalization at a large scale on all of our websites, apps and in our newsletters, automating close to everything connected to content publishing in our products. We adopted and standardized a robust and highly scalable metadata structure, better suited for both users and machine learning models. On the paid content side we updated our paywall and found enough confidence in our data science to take the leap and declare all content as “paid content”, locked behind our paywall.

The publishing strategy improved with a planning tool specifically crafted for how Mittmedia’s journalists write content. With this addition to the article production suite we moved content creation closer to a supply and demand approach that builds user engagement by evaluating how articles perform. These statistics also help us define our KPIs.

On top of that, we were also able to use our data structure to build intelligence into third-party tools such as our newsletter service.

But more than anything, last year was about building a foundation. For every innovative and progressive task in the data team’s backlog we did ten less innovative tasks connected to building a better structure. We scaled our pipelines, reworked our models, increased our database clusters, built new data imports from the ground up, set error alarms, optimized accessibility, integrated APIs and defined strategies for how the organization should work with data. We even got the opportunity to professionalize the setup process for Soldr (our data platform, soldering our ecosystem), which by the end of the year made it possible for a few external partners to set up their own version of the platform.

Johannes Lindén, developer and machine learning researcher at Mittmedia, presenting the current state of auto-categorization for colleagues.

Looking ahead, these are some of the big and interesting undertakings for 2019 from a Mittmedia data perspective.

Churn prediction

Finding a value proposition that attracts loyal customers is the silver bullet for any business. But to fully understand what drives acquisition and churn we must create a more general type of user model. Working with paid content solutions while at the same time doing data science and UX research focused on what drives the user base has given us some great insights.

A decision tree for predicting churn, based on Mittmedia’s data.

We made some precise predictions of how net gain would develop during 2018, using Markov chains to map the customer inflow through acquisition, and by applying decision trees we were able to see that the activity level of a user stood out as the primary signal for predicting churn. However, to shorten the time to market for our anti-churn development, and to better address the risk of mistaking correlation for causation, we need a more sophisticated architecture for how the user is represented in our ecosystem. Things like cluster and target group membership, churn probability and personalized app settings should all live directly in this user model.
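To make the Markov chain idea a little more concrete, here is a minimal sketch of how expected subscriber numbers could be projected forward. The states, the transition probabilities and the starting population are all invented for illustration; they are not Mittmedia’s figures, and the real model is likely more detailed.

```python
# Hypothetical subscriber states and monthly transition probabilities.
# All numbers are invented for illustration; they are not Mittmedia's data.
STATES = ["prospect", "subscriber", "churned"]

TRANSITIONS = {
    "prospect":   {"prospect": 0.90, "subscriber": 0.10, "churned": 0.00},
    "subscriber": {"prospect": 0.00, "subscriber": 0.95, "churned": 0.05},
    "churned":    {"prospect": 0.00, "subscriber": 0.02, "churned": 0.98},
}


def step(population):
    """Advance the expected head count in each state by one period."""
    nxt = {state: 0.0 for state in STATES}
    for src, count in population.items():
        for dst, prob in TRANSITIONS[src].items():
            nxt[dst] += count * prob
    return nxt


def project(population, periods):
    """Return the starting population followed by one entry per period."""
    history = [dict(population)]
    for _ in range(periods):
        history.append(step(history[-1]))
    return history


if __name__ == "__main__":
    start = {"prospect": 10_000.0, "subscriber": 2_000.0, "churned": 0.0}
    for month, pop in enumerate(project(start, 12)):
        net_gain = pop["subscriber"] - start["subscriber"]
        print(f"month {month:2d}: subscribers={pop['subscriber']:8.1f} "
              f"net gain={net_gain:+8.1f}")
```

Net gain in this sketch is simply the change in the expected number of subscribers compared to the starting month. In practice the transition probabilities would be estimated from historical data rather than typed in by hand.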

Classifying content

Auto-categorization (teaching a machine to semantically and contextually identify different types of content) will finally be implemented as an integrated part of the content production process. For us at Mittmedia this has presented itself as a deep learning problem, and it has been a long time coming.

Example of research data, reviewing performance of a category classifier trained on Mittmedia’s content.

We have a history in this subject, with both proofs of concept, such as trying out auto-tagging in our content writer (Aracua), and pure academic work at the Mittmedia office and in collaboration with universities. The final piece of the puzzle in solving this challenge at a production level is the ongoing work of adapting our multi-classification models to the new metadata structure.

Helping the journalist who creates the content by setting the correct data points will be a great addition to the newsroom (and to the end user, of course). The most valuable part of this project might however come from the possibility of having powerful tools such as neural networks create new and non-intuitive types of content clusters. Based on the work we do on this topic in 2019 we might be able to see our content archives through the eyes of an application specifically trained to understand what is unique about the articles Mittmedia is publishing, and thereby find new approaches to how our news products can be presented.

Geographical data

Visualization of a data layer for one of our regions in Västmanland, connected to user activity. The colours signal different levels of activity.

Large parts of Mittmedia’s business models (paid content and advertising) are based on geography. By slicing content and markets by geographical regions we are able to understand our market penetration and the demand for the content we produce. From an advertising perspective the geographical data is the most important data we have, and being able to fine-tune advertising campaigns for local markets is the primary tool used by our sellers to give us an edge over our competitors.

Mittmedia’s digital platform is made up of microservices, and to further increase synergies between our editorial and advertising business models we need to build a new service explicitly focused on providing all our products with a universal geographical data layer. With a more robust and scalable solution for geographical data access we can apply new tools to old applications, which could also give us some new market advantages even for the printed newspaper.

Personalization

As I mentioned in the summary of 2018 at the beginning of this post, Mittmedia personalized the news feed in public products such as sites and apps last year. The next phase of personalization will be a long and ever agile tuning process. That might sound tiresome, but this kind of work will lead us down the path of developing more sophisticated types of self-optimizing systems. Already today many of the machine learning processes running at Mittmedia are unsupervised, for example the geographical clustering used for the personalization service.

Processing data to make personalized content recommendations.

The future might however hold even more dynamic solutions, such as doing away with any pre-weighted data points in favor of more evolutionary algorithms (when the system automatically optimizes functionality, like a constantly running unsupervised A/B-test).
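As a rough illustration of what “evolutionary” could mean here, the sketch below runs a simple (1+1) evolution strategy: it mutates a set of ranking weights and keeps a mutation only when it scores better. The signal names and the `engagement` function are hypothetical stand-ins; a real system would score candidates on measured user behaviour, not on a hidden target like this one.

```python
import random

# Hypothetical ranking signals; the names are illustrative only.
SIGNALS = ["recency", "locality", "popularity"]

# Stand-in for real user feedback: a hidden optimum the optimizer never reads.
_HIDDEN_OPTIMUM = [0.2, 0.5, 0.3]


def engagement(weights):
    """Simulated engagement score; higher is better, 0.0 is the maximum."""
    return -sum((w - t) ** 2 for w, t in zip(weights, _HIDDEN_OPTIMUM))


def mutate(weights, rng, sigma=0.05):
    """Return a copy of `weights` with small Gaussian noise added."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def evolve(initial, generations=500, seed=0):
    """(1+1) evolution strategy: keep a mutation only if it scores better."""
    rng = random.Random(seed)
    best, best_score = list(initial), engagement(initial)
    for _ in range(generations):
        candidate = mutate(best, rng)
        score = engagement(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    start = [1 / 3, 1 / 3, 1 / 3]
    weights, score = evolve(start)
    for name, w in zip(SIGNALS, weights):
        print(f"{name:10s} {w:.3f}")
    print(f"simulated engagement: {score:.5f}")
```

The point of the sketch is that no weight is fixed in advance: the starting values are only a first guess, and the loop keeps whatever performs better, much like an A/B test that never stops.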

New content API

The media landscape is changing, and it always has been. Trends in news consumption follow the general trends for how content is made available, and in the last ten years alone the industry has been completely revamped by the smartphone revolution. The difference now is that the shift we see today is happening much closer to how the user experiences information flow. It’s not only that we see new types of devices emerging; the content itself is taking new forms.

When text, video and images start to give way to products such as conversation-based services and devices built to provide a virtual reality, the revolution in user experiences is set to take place close to a neurological level: the experience of taking in information will be different. To acquire information by taking part in a conversation or walking around in a simulated environment is, from a user perspective, a completely different process than reading a text.

Mittmedia will not be able to run parallel projects to cover all potential research areas that might define the future of news consumption. Instead, to avoid long runways for new innovative projects, our first step will be to redefine our content distribution at a base level. Traditionally we have based our content APIs on delivering primarily text content filtered by things such as categories and newspaper titles. Mittmedia’s future content distribution will need new solutions, and as of January first we have started the project that aims to give us those solutions.

Future plans for content API development.

During 2019 we will build a new API, with endpoints crafted to deliver content based on things like spatial searches (geography) and hierarchy-based article lists (dynamically created editions such as a recommended reading list or a curated newsletter). In many scenarios we might also find ourselves developing products that will not actually use the text content in our articles at all and only focus on granular metadata. A metadata-only approach might for example provide a good foundation for a natural language solution aimed at smart home speakers.
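A spatial search of this kind can be sketched in a few lines. The example below filters article metadata by distance from a point using the haversine formula. The `ArticleMeta` fields, the placeholder titles and the function names are assumptions made for illustration (the coordinates are approximate city centres), and nothing here describes the actual API.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class ArticleMeta:
    """Metadata-only view of an article; field names are hypothetical."""
    article_id: str
    title: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def search_within(articles, lat, lon, radius_km):
    """Return articles within `radius_km` of the point, nearest first."""
    hits = [(haversine_km(lat, lon, a.lat, a.lon), a) for a in articles]
    return [a for d, a in sorted(hits, key=lambda h: h[0]) if d <= radius_km]


# Placeholder articles with approximate city coordinates.
ARTICLES = [
    ArticleMeta("a1", "Placeholder story near Sundsvall", 62.39, 17.31),
    ArticleMeta("a2", "Placeholder story near Gävle", 60.67, 17.14),
    ArticleMeta("a3", "Placeholder story near Västerås", 59.61, 16.55),
]

if __name__ == "__main__":
    for article in search_within(ARTICLES, lat=62.39, lon=17.31, radius_km=50):
        print(article.article_id, article.title)
```

Note that the search never touches the article text. It works on metadata alone, which is the same property that would make such an endpoint usable for products like voice assistants. A production service would most likely rely on a spatial index in the database rather than a linear scan.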

Mittmedia has a tradition of both hosting and engaging in hackathons and developer meetups. Here are Stefan Wallin, Jenny Vesterlund, Michelle Ludovici, Pontus Ekholm and Johannes Lindén at Good Tech Hack 2018.

The goal is not only to give the developers at Mittmedia better tools to build new innovative products but also to open up for others to build products on our platform. Mittmedia has a long tradition of working closely with local developers and businesses through hackathons and university projects. This project will give us the possibility of doing this at a larger scale, and for anyone looking to build a bleeding-edge news application Mittmedia will be the obvious choice as content provider.

A final note

Mittmedia has very ambitious plans for future projects, but we also have a good track record of taking on innovative data projects and putting them into production. When 2019 wraps up we might look at the list of projects above and conclude that not everything got done as planned, but experience tells us that if that is the case the reason will be that we found something more rewarding and interesting to do instead.

Thanks to Katarina Ellemark and Michelle Ludovici
