Optimizing Social Media Effects at SZ

Published in Süddeutsche Zeitung Digitale Medien, Oct 10, 2023

How data analysis and experiments with Instagram posts and reels can help drive business growth

In today’s digital landscape, businesses often grapple with questions surrounding their social media strategies. Are we reaching potential customers effectively? Which topics should we focus on to maximize our reach? Are we getting the most value out of our social media budget? If these concerns resonate with you, you’re not alone. In this article, we will delve into how Süddeutsche Zeitung (SZ) tackled these challenges using data analysis and experiments on Instagram.

Which topics work best on Instagram reels?

There is a challenge for any data scientist who aims to evaluate the business impact of social media posts beyond the platform’s own metrics: social media data is usually disconnected from main company data.

For the data science team at SZ this means that we cannot reliably trace the number of views and sales from our articles back to the reels on Instagram. However, we can use existing information on posts and articles, such as time of publication and topic, and match posts to articles based on their content.

Let’s look at some descriptive statistics first. We use hourly observations from more than 70k articles in the first week after publication (~12.5M observations). The figure shows the average article views per hour for articles with and without social media reels (plots are redrawn).

Articles with Instagram reels receive more views on average. Does that mean that the reels help articles perform better? No! Reels are probably posted for articles that would perform better anyway. In other words, our social media marketing colleagues select for reels those articles that they expect to have the greatest impact.

This is a crucial insight and shows once again that for decision-making one should not rely on simple correlations such as “articles that have received social media promotion tend to perform better”.

How can we alleviate this selection problem and use the data to gain actionable insights on what topics to post about? We use another piece of information in our data: the publication time of the Instagram reel, and compare performance of an article before and after that time. This ensures that level differences between generally well-performing articles and other articles are netted out.
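As a rough illustration of the before/after idea, the sketch below derives an hour-since-publication counter and a post-reel indicator from timestamps with pandas. The input column names (`observed_at`, `article_published_at`, `reel_published_at`) and the helper `add_reel_timing` are assumptions made for this example, not the actual SZ schema, and the sample rows are made up.

```python
import pandas as pd


def add_reel_timing(panel: pd.DataFrame) -> pd.DataFrame:
    """Add relative_hour and post_reel_indicator to an hourly article panel.

    Assumed (hypothetical) input columns: article_id, observed_at,
    article_published_at, reel_published_at (NaT if the article has no reel).
    """
    out = panel.copy()
    # Whole hours elapsed since the article went live.
    elapsed = out["observed_at"] - out["article_published_at"]
    out["relative_hour"] = (elapsed // pd.Timedelta(hours=1)).astype(int)
    # 1 from the hour the reel is posted onward, 0 before.
    # Comparisons against NaT are False, so articles without a reel stay 0.
    out["post_reel_indicator"] = (
        out["observed_at"] >= out["reel_published_at"]
    ).astype(int)
    return out


# Made-up example: article "a" gets a reel one hour after publication, "b" never does.
example = pd.DataFrame({
    "article_id": ["a", "a", "a", "b", "b"],
    "observed_at": pd.to_datetime([
        "2023-01-01 08:00", "2023-01-01 09:00", "2023-01-01 10:00",
        "2023-01-01 08:00", "2023-01-01 09:00",
    ]),
    "article_published_at": pd.to_datetime(["2023-01-01 08:00"] * 5),
    "reel_published_at": pd.to_datetime(["2023-01-01 09:00"] * 3 + [None, None]),
})
timed = add_reel_timing(example)
print(timed[["article_id", "relative_hour", "post_reel_indicator"]])
```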

There is one problem left, though. As one can see from the descriptive figure, the average total views of articles tend to decline after publication. Comparing views before and after the reels thus seems like an unfair comparison to make. A fair comparison should contrast article views in the same hour after publication of the article.

Fortunately, there is a method in the econometrics toolbox that can address both issues of selection and time trends simultaneously: the two-way fixed effects model.

This method lets us compare views only within a given article, before and after its reel, and only within a given hour after the article’s publication. The result is a statistic that is much closer to a causal estimate of the effect of boosting an article with an Instagram reel.
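To see why this matters, here is a small simulation with entirely made-up numbers (not SZ data). Reels go only to above-median articles, all views decay over time, and the true reel effect is fixed at 5 views per hour. The naive with/without comparison mixes the effect with article quality, while demeaning by article and by hour, which is equivalent to two-way fixed effects on a balanced panel, should land close to the true value.

```python
import numpy as np

rng = np.random.default_rng(0)
n_articles, n_hours, true_effect = 400, 48, 5.0

# Article quality (level) and a declining hourly profile shared by all articles.
quality = rng.normal(100.0, 20.0, size=n_articles)
decay = 50.0 * np.exp(-np.arange(n_hours) / 12.0)

# Selection: only above-median articles get a reel, posted at a random hour.
has_reel = quality > np.median(quality)
reel_hour = rng.integers(6, 36, size=n_articles)
hours = np.arange(n_hours)
post_reel = (has_reel[:, None] & (hours[None, :] >= reel_hour[:, None])).astype(float)

views = (
    quality[:, None]
    + decay[None, :]
    + true_effect * post_reel
    + rng.normal(0.0, 5.0, size=(n_articles, n_hours))
)

# Naive comparison: mean hourly views of articles with a reel minus those without.
naive = views[has_reel].mean() - views[~has_reel].mean()


def two_way_demean(x):
    """Subtract article means and hour means, add back the grand mean (balanced panel)."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


# OLS slope of demeaned views on the demeaned post-reel indicator.
y_w, d_w = two_way_demean(views), two_way_demean(post_reel)
beta_hat = (d_w * y_w).sum() / (d_w ** 2).sum()

print(f"naive difference: {naive:.2f}, two-way FE estimate: {beta_hat:.2f}")
```

The simulation assumes the same effect for every article and a balanced panel. Real data will rarely be that clean, which is one reason to use a dedicated estimation library.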

In mathematical terms, we regress the number of views of article i at hour t on a fixed level for each article (α) and for each hour relative to article publication (φ), plus an increase of β in hourly views after the reel is posted and an error term ε.
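Written out from the description above, the specification would look roughly as follows. The indicator name PostReel is a label assumed here, mirroring `post_reel_indicator` in the code. It equals 1 once the reel for article i has been posted by hour t and 0 otherwise.

```latex
\text{views}_{it} = \alpha_i + \varphi_t + \beta \,\text{PostReel}_{it} + \varepsilon_{it}
```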

As there are quite a few articles and hours in the data, we need a Python library that can perform estimation with high-dimensional fixed effects. We choose the FixedEffectModel Python library to estimate β and cluster standard errors along both the article and hour dimensions.

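from fixedeffect.fe import fixedeffect

# views_data: one row per article and hour since publication
views_fe_model = fixedeffect(
    data_df=views_data,
    dependent=["total_views"],                 # outcome: article views
    exog_x=["post_reel_indicator"],            # 1 once the reel is live, else 0
    category=["article_id", "relative_hour"],  # article and hour fixed effects
    cluster=["article_id", "relative_hour"],   # two-way clustered standard errors
)
views_results = views_fe_model.fit()
views_results.summary()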

So, how do we know what topics to use in our Instagram reels? We estimate a separate reel effect indicator in our two-way fixed effects model for each topic of the reels posted so far and get the following results:

As the graph shows, Instagram reels on topics such as travel, media, sports, politics, and society perform best in terms of boosting article views.
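One way to set up such topic-specific indicators is to interact the post-reel indicator with a dummy for each reel topic. The sketch below is an illustration under assumptions: the column `reel_topic` and the helper `add_topic_reel_indicators` are hypothetical names, and the rows are made up.

```python
import pandas as pd


def add_topic_reel_indicators(df, topic_col="reel_topic",
                              indicator_col="post_reel_indicator"):
    """Return (df plus one post-reel indicator per topic, list of the new column names)."""
    # One 0/1 column per topic; rows with a missing topic get zeros everywhere.
    dummies = pd.get_dummies(df[topic_col], prefix="post_reel", dtype=int)
    # Switch each topic column on only in hours after the reel was posted.
    interactions = dummies.mul(df[indicator_col], axis=0)
    return pd.concat([df, interactions], axis=1), list(interactions.columns)


# Made-up example: articles 1 and 2 have reels on different topics, article 3 has none.
df = pd.DataFrame({
    "article_id": [1, 1, 2, 2, 3, 3],
    "reel_topic": ["travel", "travel", "sports", "sports", None, None],
    "post_reel_indicator": [0, 1, 0, 1, 0, 0],
})
with_topics, topic_cols = add_topic_reel_indicators(df)
print(topic_cols)
```

The resulting column list could then be passed as `exog_x` in place of the single `post_reel_indicator`, so that the model returns one coefficient per topic.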

How much money should I use to boost Instagram posts?

Instagram allows you to boost posts to show them to a wider audience. However, what is the optimal amount of money you should spend on posts? We decided to run an experiment to find out.

In our experimental setup, we vary the amount of boosting for one group of posts daily while keeping the amount of money spent on another group of posts fixed. After collecting a few days of data we can already observe the following relationship between the amount of money spent and the number of links clicked in the bio, as well as the conversions generated on Instagram:

There is a clear positive correlation between spending and engagement, and we do find the same for conversions to our SZ subscription plans. This is good news — but what is the optimal amount that we should choose for boosting?

To arrive at the optimum, we calculate the marginal effect (at the mean) of an additional Euro spent on Instagram boosting. In the graphs above, we show a linear fit line for the data. However, we think it is reasonable to assume diminishing returns to boosting, so we use a quadratic fit instead. In other words, we assume that the first Euro spent on boosting is likely to result in more views and conversions than the 100th Euro spent.
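A minimal sketch of such a quadratic fit and its marginal effect at mean spending, using NumPy. The data points are invented so that the result is easy to check by hand, and the function name `quadratic_marginal_effect` is made up for this example.

```python
import numpy as np


def quadratic_marginal_effect(spend, outcome):
    """Fit outcome = a + b*spend + c*spend**2; return (a, b, c, marginal effect at mean spend)."""
    c, b, a = np.polyfit(spend, outcome, deg=2)  # coefficients come highest power first
    # Derivative of the fitted curve, b + 2*c*s, evaluated at the average spend.
    marginal_at_mean = b + 2.0 * c * np.mean(spend)
    return a, b, c, marginal_at_mean


# Made-up data lying exactly on 5 + 2*s - 0.02*s**2.
spend = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])
clicks = 5.0 + 2.0 * spend - 0.02 * spend ** 2
a, b, c, marginal = quadratic_marginal_effect(spend, clicks)
print(f"b={b:.3f}, c={c:.4f}, marginal effect at mean={marginal:.3f}")
```

A negative quadratic coefficient `c` is what diminishing returns look like in this specification: each additional Euro adds less than the one before.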

The amount of spending that maximizes our profit occurs where incremental revenue equals incremental cost, i.e., one additional Euro invested in Instagram still yields about one Euro from conversions. Given diminishing returns to scale, we can re-arrange this equation and plot the following relationship between optimal spending and the value of a conversion:
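For a quadratic fit, this re-arrangement can be sketched as follows. The symbols are assumptions for illustration: N(s) is the number of conversions at spending s, with fitted coefficients a, b > 0 and c < 0, and v is the monetary value of one conversion.

```latex
\begin{aligned}
N(s) &= a + b\,s + c\,s^2, \qquad c < 0 \\
\pi(s) &= v\,N(s) - s \\
\pi'(s) = 0 \;&\Longrightarrow\; v\,(b + 2c\,s) = 1 \\
s^*(v) &= \max\!\left(0,\; \frac{b - 1/v}{-2c}\right)
\end{aligned}
```

Under these assumptions, s* is zero until v exceeds 1/b and approaches b/(−2c) as v grows large.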

There are three insights from this graph. First, we see that spending any amount of money on boosting only pays off after a certain threshold. This is intuitive, since spending money that does not result in a single conversion is wasted.

Second, the optimal spending function becomes almost flat at certain higher values of spending — a feature implied by the diminishing returns to scale assumption.

Third, the curve tells us the optimal spending for a given monetary value of a conversion. In our case, we could slightly increase spending.
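The shape of such a curve can be reproduced with a few lines of Python. This sketch assumes the quadratic form conversions = a + b·s + c·s² with b > 0 > c and uses invented coefficients. It is one reading of the setup described above, not the exact calculation behind the graph.

```python
def optimal_spend(value_per_conversion, b, c):
    """Profit-maximizing spend when conversions = a + b*s + c*s**2 with b > 0 > c.

    Solves value * (b + 2*c*s) = 1 for s and clips the result at zero.
    """
    if value_per_conversion <= 0:
        return 0.0
    s_star = (b - 1.0 / value_per_conversion) / (-2.0 * c)
    return max(0.0, s_star)


# Invented coefficients: 0.05 conversions for the first Euro, with diminishing returns.
b, c = 0.05, -0.0002
for value in (10, 20, 40, 100, 1000):
    print(value, round(optimal_spend(value, b, c), 1))
```

With these numbers, spending is zero until a conversion is worth more than 1/b = 20 Euro, and the optimum flattens out toward b/(−2c) = 125 Euro as the conversion value grows.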

In this article, we showed how we leverage data analysis and experiments to drive business growth more effectively. Data-driven approaches have significantly improved our social media strategy and we believe they have the potential to profoundly impact digital media businesses.

Written by SZDM Data Science