Three Free Marketing Data Science Tools That Will Make Your Eyes Light Up

Florian Grüning
Published in kuwala-io
4 min read · Jun 21, 2022

In marketing departments, managers are highly motivated to work in a data-driven way:

  • Marketing channels are very often digital (Facebook Ads, Instagram influencers, TikTok videos) and come with performance indicators and target group information.
  • The most important touchpoints have moved to a website or app, which yields a lot of information on user interactions.
  • Ultimately, marketing exists to help a business scale sustainably and efficiently. To do so, you need to know your metrics and KPIs, their impact, and how they evolve over time (forecasting).

Marketing Data Science is also challenging because it is hard to know where to start. In this article I’ll introduce you to my favorite off-the-shelf solutions that I found on GitHub. Yes, there is some coding involved, but the proposed solutions are free to use without limits, and they rely on very advanced data science methods. With basic knowledge of Python, R, and software engineering, you can implement fairly advanced solutions without knowing all the details under the hood. Here are my three favorites:

Product Recommendation on Your Website with Metarank (https://github.com/metarank/metarank)

Metarank is a tool that helps you build an advanced recommendation engine for the products or content on your website. To get started, you only need historical performance data for your products (e.g. number of clicks) and additional metadata such as product rating, genre, ingredients, or price. In a YAML file, you define the features and the model parameters (e.g. number of iterations, modeling technique). The API service integrates with Apache Flink and can easily be run in Kubernetes clusters.
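To make the input side more concrete, here is a minimal sketch of how click history and product metadata could be combined into one feature row per product. It is plain Python and does not use Metarank itself; the field names (`rating`, `genre`, `price`), the product IDs, and the helper `build_feature_rows` are all hypothetical and only illustrate the kind of data described above.

```python
from collections import Counter

# Hypothetical click log: one entry per click on a product (illustrative data only).
clicks = ["p1", "p2", "p1", "p3", "p1", "p2"]

# Hypothetical product metadata. The field names are assumptions for this
# sketch and are not Metarank's actual schema.
metadata = {
    "p1": {"rating": 4.5, "genre": "comedy", "price": 9.99},
    "p2": {"rating": 3.8, "genre": "horror", "price": 7.49},
    "p3": {"rating": 4.1, "genre": "drama", "price": 12.00},
    "p4": {"rating": 2.9, "genre": "horror", "price": 4.99},
}


def build_feature_rows(clicks, metadata):
    """Join historical click counts with metadata, one row per product."""
    click_counts = Counter(clicks)
    rows = []
    for product_id, fields in metadata.items():
        # Products without any recorded click get a count of 0.
        row = {"product_id": product_id, "clicks": click_counts.get(product_id, 0)}
        row.update(fields)
        rows.append(row)
    return rows


rows = build_feature_rows(clicks, metadata)
for row in rows:
    print(row)
```

Rows like these are what the features in a ranking configuration would typically refer to; the actual configuration keys are defined by Metarank and are not shown here.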

An IMDb database ranked with Metarank by different parameters (see the scary movie example)

In summary, Metarank lets you get up to speed quickly: the parameters are easy to define, there are good examples, and the community is active. For the deployment, you need some software engineering skills.

User Journey Analysis on Your Website with Retentioneering (https://github.com/retentioneering/retentioneering-tools)

Retentioneering is a Python library that helps you understand the user journey on your website. It lets you easily connect your Google Analytics data (in BigQuery). You define the user ID, event type, and timestamp. From this input, a comprehensive graph network is created that shows gains and losses along the way, as you know them from a customer journey. In addition, users with similar journeys are grouped into customer segments. This reduces the complexity of a purely descriptive view of the data.
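The core idea of turning an event log into a journey graph can be sketched in a few lines of plain Python. This is not Retentioneering's API; the event names, the sample data, and the helper `transition_counts` are hypothetical. The sketch only shows how user ID, event type, and timestamp are enough to count transitions between steps, which become the weighted edges of such a graph.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (user_id, event_type, timestamp). Illustrative data only.
events = [
    ("u1", "home", 1),
    ("u1", "catalog", 2),
    ("u1", "cart", 3),
    ("u1", "payment", 4),
    ("u2", "home", 1),
    ("u2", "catalog", 2),
    ("u2", "lost", 3),
    ("u3", "home", 5),
    ("u3", "lost", 6),
]


def transition_counts(events):
    """Count how often users move from one event type to the next."""
    by_user = defaultdict(list)
    for user_id, event_type, timestamp in events:
        by_user[user_id].append((timestamp, event_type))

    counts = Counter()
    for steps in by_user.values():
        steps.sort()  # order each user's events by timestamp
        # Pair every event with its successor to get the edges of the graph.
        for (_, source), (_, target) in zip(steps, steps[1:]):
            counts[(source, target)] += 1
    return counts


edges = transition_counts(events)
for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target}: {weight}")
```

In this toy log, two of the three users move from `home` to `catalog`, and only one reaches `payment`. A library such as Retentioneering adds the visualization and the segmentation of similar journeys on top of this kind of structure.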

Example User Journey Graph with completed payments and losses throughout the process
User Journey Network Graph based on Website Event Data

In summary, Retentioneering is a Python library and therefore aimed at Python users, although running its functions is easy. The quality of the results, however, stands and falls with good event tracking on the website.

Marketing Mix Modeling with Robyn (https://github.com/facebookexperimental/Robyn)

Fewer third-party cookies mean fewer viable attribution models. The answer to this is Marketing Mix Modeling. Marketing mix models are regression models that statistically estimate the effect size of marketing channels and other independent variables. The advantage is that the business context can be modeled much more realistically. For example, Google searches for your own brand can be included to determine how much of the revenue is due to brand strength. Likewise, offline advertising can be modeled with its own metrics (e.g. GRPs). Robyn takes into account adstock effects, ROAS calculation, and multicollinearity among the marketing channels. In addition, budgets can be optimized from the predictions with simple functions, and results from marketing tests can be integrated into the model for calibration.
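Two of the concepts mentioned above, adstock and ROAS, are easy to illustrate in plain Python. The sketch below uses a geometric decay, one common form of adstock, in which a fixed share of past advertising pressure carries over into the next period. It is not Robyn's implementation (Robyn is an R package); the function names, the decay value, and the numbers are assumptions chosen for illustration.

```python
def geometric_adstock(spend, decay):
    """Carry over a share `decay` of the previous period's adstock.

    adstock[t] = spend[t] + decay * adstock[t - 1]
    """
    carried = 0.0
    result = []
    for amount in spend:
        carried = amount + decay * carried
        result.append(carried)
    return result


def roas(attributed_revenue, spend):
    """Return on ad spend: revenue attributed to a channel per unit of spend."""
    return attributed_revenue / spend


# Hypothetical weekly spend for one channel (illustrative numbers only).
weekly_spend = [100.0, 0.0, 0.0, 50.0]
adstocked = geometric_adstock(weekly_spend, decay=0.5)
print(adstocked)  # → [100.0, 50.0, 25.0, 62.5]

# Hypothetical revenue attributed to the channel by a fitted model.
print(roas(attributed_revenue=300.0, spend=sum(weekly_spend)))  # → 2.0
```

The adstocked series, rather than the raw spend, is what would enter the regression, so a campaign keeps explaining revenue in the weeks after the money was spent.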

Share of Spend vs. Share of Effect and Channel ROAS/ROI

In summary, Robyn is currently an R package. A Python version will follow in the next releases. With clean daily data on impressions and spend, plus revenue data, you can start right away!

Now, you might also want to check out Kuwala, my open-source tool for building data pipelines that feed data into the off-the-shelf models mentioned above. See our open-source project here: https://github.com/kuwala-io/kuwala
