Prediction of LTV for Google Ads bidding optimization

Andrey Osypov
Beards Analytics 🇺🇦
5 min read · Jun 24, 2022

Today we want to talk about a big problem for many businesses, such as Fintech, Crypto, or SaaS. The time between a user’s registration and the action that is actually valuable to the business (for example, funding a deposit, activating a card, or subscribing to a product) is often significant, even though the registration itself usually happens on the first visit to the site/app or on the first day.

Therefore, advertising campaigns should be optimized not for the fact of registration but for the beneficial action. However, the substantial time lag prevents bidding algorithms, for example in Google Ads, from optimizing campaigns well.

A good solution would be to know at the time of registration which users will perform this beneficial action and, if possible, how much each user will bring to the business. Such data makes it possible to solve several problems at once.

Firstly, it improves the bidding algorithms in ad networks and optimizes ad campaigns to attract a more relevant audience.

Secondly, knowing which clients are likely to bring which result, you can choose a different communication channel for each group. For less valuable users you can use remarketing, while those who can potentially bring a lot of money can be called and offered personal product presentations.

Lastly, it allows you to plan profit at current costs and manage marketing spend more stably.

Together with our ML partners (https://www.beinf.ai/), we describe how you can predict the beneficial actions of users in businesses with a long decision time.

General scheme of work

It is essential to have as much data as possible to predict user actions. You can use any data:

  • data that the user provides during registration,
  • data about the behavior on the site/app,
  • data the user used by,
  • open data on the economic state of countries.

All this data must be stored somewhere and kept up to date. We create a data warehouse (DW) as a single source of truth, using Google BigQuery: a serverless, highly scalable, and cost-effective multi-cloud data warehouse designed for business agility. As a rule, the largest volume of data is client event data from an application or website, and here Google Analytics 4 has a native and, most importantly, free connector to Google BigQuery.

Preparing data for models

We use DBT to work conveniently with the data stored in Google BigQuery. With incremental_strategy = 'insert_overwrite' in DBT, each run replaces only the affected partitions instead of rebuilding the whole table, which keeps the cost of processing incoming data low.

The data must be valid! With DBT, you can check its correctness at each stage of the transformation. Then combine data from the various sources into a single table that you can feed into the ML algorithm, which calculates the probability of conversion for each user.
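As an illustration of the two steps above (checking the data, then combining sources into one table), here is a minimal sketch in plain Python. It is not the DBT setup described in the article: the function names, the field names (user_id, country, page_views), and the sample records are hypothetical and only show the idea.

```python
def find_problems(rows, key, required_fields):
    """Return a list of problems: missing or duplicate keys, empty required values."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        k = row.get(key)
        if k is None:
            problems.append(f"row {i}: missing {key}")
        elif k in seen:
            problems.append(f"row {i}: duplicate {key}={k}")
        else:
            seen.add(k)
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: empty {field}")
    return problems


def build_model_table(registrations, page_views):
    """Left-join per-user page-view counts onto the registration records."""
    return [
        {**reg, "page_views": page_views.get(reg["user_id"], 0)}
        for reg in registrations
    ]


# Hypothetical sample data: one row per registered user.
registrations = [
    {"user_id": "u1", "country": "UA"},
    {"user_id": "u2", "country": ""},
    {"user_id": "u1", "country": "PL"},
]
print(find_problems(registrations, "user_id", ["country"]))
print(build_model_table(registrations[:1], {"u1": 5}))
```

In a real pipeline the same checks (uniqueness, non-empty values) would run inside the warehouse rather than in application code.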

Feature engineering

For the model to make predictions about the user’s LTV, we should pass it as much useful information as possible. Each piece of information about the user that the model uses is called a feature, and the more useful features we give the model, the more accurately it can predict the user’s LTV.

Examples of features that we can create: which country the user is from, how many pages they visited on the website, the number of days since registration, the user’s device info, etc.
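The features listed above could be derived from a raw user record roughly as follows. This is a sketch under assumptions: the record layout (country, page_urls, registered_on, device_category) and the function name are hypothetical, not the schema used in the project.

```python
from datetime import date


def build_features(user, today):
    """Turn one raw user record into a flat dict of model features."""
    return {
        "country": user["country"],
        "pages_visited": len(user["page_urls"]),
        "days_since_registration": (today - user["registered_on"]).days,
        # Encode the device category as a 0/1 flag.
        "is_mobile": int(user["device_category"] == "mobile"),
    }


# Hypothetical raw record for one user.
user = {
    "country": "UA",
    "page_urls": ["/", "/pricing", "/signup"],
    "registered_on": date(2022, 6, 1),
    "device_category": "mobile",
}
print(build_features(user, today=date(2022, 6, 24)))
```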

Model validation

In order to evaluate how well the model performs, we run it on existing data, get its predictions, and see how closely predictions match the actual values.

There are many model evaluation metrics. One example is “accuracy”, which shows the percentage of the model’s predictions that are correct.

Models are also evaluated on data they have not seen before, which gives a more reliable estimate of their predictive performance.
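A minimal sketch of both ideas, accuracy and evaluation on unseen data, is shown below. The helper names, the 20% hold-out share, and the fixed seed are illustrative assumptions, not the validation setup used for the model in this article.

```python
import random


def accuracy(actual, predicted):
    """Share of predictions that match the actual labels (0.0 to 1.0)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty sample")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def train_test_split(rows, test_share=0.2, seed=42):
    """Shuffle rows and hold out `test_share` of them for evaluation only."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


train, test = train_test_split(range(10))
print(len(train), len(test))
print(accuracy([1, 0, 0, 1], [1, 0, 1, 1]))
```

The model is fitted on the first part only and scored on the held-out part. When conversions are rare, accuracy alone can look high even for a model that never predicts a conversion, so it is worth reading it together with other metrics.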

Automation of data transfer in Google Ads

It is extremely simple to create a conversion in Google Ads.

There are two ways to do this:

  • Conversions from clicks using Google Click Identifier (GCLID)
  • Enhanced conversions for leads

The first option involves sending custom conversion data together with the Google Click ID. When auto-tagging is enabled, Google Ads automatically adds this identifier (the gclid parameter) to the query parameters of the landing-page URL when a user clicks an ad.

When the user opens the site for the first time, the client_id from Google Analytics 4 is stored in a cookie. Later, it can be used to link the ad click to the predicted conversion value, for example, LTV.
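The linking step can be sketched as follows: read the gclid from the landing-page URL, then use the client_id to attach the predicted value to that click. The function names and the shape of the visit records are hypothetical; only the gclid query parameter itself comes from Google Ads.

```python
from urllib.parse import parse_qs, urlparse


def extract_gclid(landing_url):
    """Return the gclid query parameter of a landing-page URL, or None."""
    query = parse_qs(urlparse(landing_url).query)
    values = query.get("gclid")
    return values[0] if values else None


def link_predictions(visits, predicted_ltv):
    """Map each gclid to the predicted LTV of the client that arrived with it."""
    linked = {}
    for visit in visits:
        gclid = extract_gclid(visit["landing_url"])
        client_id = visit["client_id"]
        # Skip visits without an ad click or without a prediction.
        if gclid and client_id in predicted_ltv:
            linked[gclid] = predicted_ltv[client_id]
    return linked


# Hypothetical first-visit records and model output keyed by client_id.
visits = [
    {"client_id": "111.222", "landing_url": "https://example.com/?utm_source=google&gclid=abc123"},
    {"client_id": "333.444", "landing_url": "https://example.com/"},
]
print(link_predictions(visits, {"111.222": 42.5, "333.444": 10.0}))
```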

The second method (Enhanced conversions for leads) involves transferring personal data, such as an email or a mailing address (if you choose to use the address, then first name, last name, postal code, and country are all required).
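Personal data for enhanced conversions is sent in hashed form: Google asks for values such as email to be normalized and hashed with SHA-256. A minimal sketch of that step is below; the function name is hypothetical, and the normalization shown (trimming whitespace and lowercasing) is the basic case, so check Google’s current formatting requirements before relying on it.

```python
import hashlib


def normalize_and_hash_email(email):
    """Trim and lowercase an email, then return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# The same address written differently produces the same hash.
print(normalize_and_hash_email("  User@Example.com ") == normalize_and_hash_email("user@example.com"))
```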

You can upload the conversion values with the user identifier manually or through the API.
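For the manual route, the predictions can be written to a CSV file for upload. The sketch below assumes the column names of Google’s click-conversion import template (Google Click ID, Conversion Name, Conversion Time, Conversion Value, Conversion Currency); verify them and the accepted time format against the current template. The conversion name “Predicted LTV” and the row layout are hypothetical.

```python
import csv
import io

# Assumed header of the click-conversion import template.
HEADER = [
    "Google Click ID",
    "Conversion Name",
    "Conversion Time",
    "Conversion Value",
    "Conversion Currency",
]


def build_conversion_csv(rows, conversion_name, currency="USD"):
    """Render predicted conversion values as CSV text, one row per ad click."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        writer.writerow([
            row["gclid"],
            conversion_name,
            row["conversion_time"],
            f"{row['predicted_ltv']:.2f}",
            currency,
        ])
    return buffer.getvalue()


rows = [{"gclid": "abc123", "conversion_time": "2022-06-24 12:00:00+03:00", "predicted_ltv": 42.5}]
print(build_conversion_csv(rows, "Predicted LTV"))
```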

Results

Finally, we need to evaluate how well the model performs and in which cases it makes wrong predictions. To calculate LTV, the model predicts whether the user will convert or not.

To make this evaluation, we build a table called the confusion matrix. Each row is a prediction of the model (Positive: the model predicts that the user will have a conversion; Negative: the model predicts that there will be no conversion), each column is the actual value (True: there was a conversion; False: there was no conversion), and each cell holds the percentage of users. For example, in the picture above, the False Negative cell contains 86.6%, which means that for 86.6% of users the model said there would be no conversion and, in fact, there was none. (In the more common naming, where True/False says whether the prediction was correct, this cell would be called a true negative.)
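A matrix laid out this way can be computed with a few lines of Python. This is a sketch with made-up labels, not the data behind the 86.6% figure; the function name is hypothetical. Keys follow the layout described above: the row is the prediction (Positive/Negative), the column is the actual outcome (True/False).

```python
def confusion_matrix_percent(actual, predicted):
    """Return the percentage of users in each (prediction, actual) cell."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot build a matrix from an empty sample")
    counts = {
        (row, col): 0
        for row in ("Positive", "Negative")
        for col in (True, False)
    }
    for was_converted, predicted_conversion in zip(actual, predicted):
        row = "Positive" if predicted_conversion else "Negative"
        counts[(row, bool(was_converted))] += 1
    total = len(actual)
    return {cell: 100 * n / total for cell, n in counts.items()}


# Made-up example: four users, one real conversion.
actual = [True, False, False, False]
predicted = [True, False, False, True]
for cell, share in confusion_matrix_percent(actual, predicted).items():
    print(cell, f"{share:.1f}%")
```

The two cells where the row and column disagree (Positive/False and Negative/True) are the model’s errors, and the four cells always sum to 100%.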

If you have a considerable budget and think you have run out of ways to increase ROI, try working with our data engineering and ML teams.
