Online Ad Click Prediction: Machine Learning Approach

Concept explained through example and code

Shikhir Dodeja
Geek Culture
4 min read · Aug 2, 2021


Source: https://outsideroi.co/do-click-through-rates-matter/

Online advertising has become a multi-billion-dollar industry, and predicting ad click-through rate (CTR) is now central to it. Advertisers and search engines alike rely on models to predict ad CTR accurately.

In this blog, we will predict ad click-through rate using a machine learning approach. Before that, let us first understand a few important concepts and the general approach search engines follow to decide which ads to display.

CTR: The percentage of ad impressions that result in a click, i.e. clicks divided by impressions.

Search ads: Advertisements that get displayed when a user searches for a certain keyword. Paid search advertising is a popular form of pay-per-click (PPC) advertising in which brands or advertisers pay a bid amount to have their ads displayed when users search for certain keywords.
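To make the CTR definition concrete, here is a minimal sketch of the calculation. The function name `click_through_rate` and the numbers are illustrative assumptions, not figures from a real campaign.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as the percentage of impressions that led to a click."""
    if impressions == 0:
        # CTR is undefined without impressions; returning 0.0 is a convention chosen here.
        return 0.0
    return 100.0 * clicks / impressions


# Hypothetical numbers: 25 clicks out of 1,000 impressions.
print(f"CTR = {click_through_rate(25, 1000):.1f}%")
```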

Relevance of Predicting CTR through a real-life example:

Typically, the main source of income for search engines like Google is advertising. Many companies pay these search engines to display their ads when a user searches for a certain keyword. Here, our focus is on search ads billed per click, i.e. the advertiser pays only when a user clicks on the ad and is redirected to the brand’s website.

Different advertisers approach these search engines with their ads and the amount they are willing to bid to have them displayed. The main objective of these search engines is to maximize their revenue. So the question is: how does a search engine decide which ads to display when a user searches for a certain keyword?

First, it calculates the probability that the user will click, given three groups of features: the ad content, the user, and the context.

But calculating the probability of a click is not enough, as the goal of the search engine is to maximize its revenue. To achieve that, it needs to multiply this probability by the bid amount and then see which ad will benefit it the most.

And that's how we get our expected revenue for each ad: the predicted click probability multiplied by the bid amount.
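Written as a formula, the quantity compared across ads in the description above is the expected revenue per impression. The symbols are labels chosen here for illustration: $p_i$ is the predicted click probability for ad $i$ and $b_i$ is its bid amount.

```latex
\text{expected revenue}_i = p_i \times b_i,
\qquad
p_i = P(\text{click} \mid \text{ad}_i, \text{user}, \text{context})
```

With made-up numbers, an ad with $p = 0.04$ and a bid of 4.00 scores 0.16, which beats an ad with $p = 0.10$ and a bid of 1.00 (score 0.10), even though the second ad is more likely to be clicked.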

So far we have seen what ad click prediction is and why it is important. Let us now explore how to do it by building a machine learning model on a dataset. We will build a Logistic Regression model that predicts whether a user will click on an ad based on the features of that user, and hence gives us the probability of a user clicking on an ad.

Using these probabilities, search engines could decide which ads to display by multiplying each probability by the bid amount and sorting the results.

Step 1: Dataset and importing Libraries

Dataset Features

You can find the dataset here.

Importing Libraries

Step 2: Loading dataset and printing first 5 observations
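As a rough sketch of this step, the snippet below loads a CSV with pandas and prints the first five rows. The column names and every value are made-up placeholders, and the data is read from an in-memory string so the example runs on its own; with the real dataset you would pass its file path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Placeholder data: hypothetical columns and values, NOT the article's dataset.
csv_text = """daily_time_on_site,age,area_income,city,country,clicked
68.9,35,61833.9,Northville,Freedonia,0
80.2,31,68441.8,Eastport,Ruritania,0
42.5,44,39012.3,Lakeside,Genovia,1
74.1,29,54806.2,Hillcrest,Freedonia,0
38.7,51,31544.0,Riverton,Zubrowka,1
45.3,47,42760.5,Westfield,Ruritania,1
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default.
print(df.head())
```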

Step 3: EDA

High cardinality in city, country, etc.
No missing values
Target Variable is balanced
Multiple pairwise bivariate distributions
Correlation Table
Heatmap for correlation
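The checks listed above could be reproduced along these lines. This is a sketch on synthetic data: the column names, the random values, and the balanced target are all assumptions built into the example, so the printed numbers say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the ad dataset (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "daily_time_on_site": rng.normal(65, 15, n).round(2),
    "age": rng.integers(18, 65, n),
    "city": [f"city_{i}" for i in range(n)],  # one distinct city per row
    "clicked": np.tile([0, 1], n // 2),       # balanced by construction
})

# Cardinality: number of distinct values per column.
cardinality = df.nunique().sort_values(ascending=False)

# Missing values per column.
missing = df.isna().sum()

# Share of each class in the target.
target_share = df["clicked"].value_counts(normalize=True)

# Pairwise correlation between the numeric columns only.
corr = df.select_dtypes(include="number").corr()

print(cardinality, missing, target_share, corr, sep="\n\n")
```

For the plots, seaborn's `pairplot` (pairwise distributions) and `heatmap` (applied to `corr`) are common choices; they are left out here so the sketch needs only pandas and NumPy.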

Step 4: Train-Test split
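A minimal sketch of the split using scikit-learn's `train_test_split`. The placeholder arrays, the 70/30 ratio, the seed, and the use of `stratify` are assumptions made for illustration; the article does not state which settings were used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 20 rows, 2 features, perfectly balanced binary target.
X = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,    # assumed split ratio
    random_state=42,  # fixed seed so the split is reproducible
    stratify=y,       # keep the class balance the same in both parts
)

print(X_train.shape, X_test.shape)
```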

Step 5: Training Logistic Regression Model
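The training step might look roughly like the sketch below. It uses `make_classification` to generate a synthetic binary problem in place of the ad dataset, and the hyperparameters (`max_iter=1000`, default regularization) are assumptions rather than the author's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the user features and the clicked / not-clicked target.
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit the logistic regression on the training part only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Hard 0/1 predictions, and the probability of the positive class ("click").
y_pred = model.predict(X_test)
click_prob = model.predict_proba(X_test)[:, 1]

print(click_prob[:5])
```

`predict_proba` returns one column per class; column 1 is the probability of class 1, which is the value a search engine would multiply by the bid.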

Step 6: Checking Model accuracy

Interpretation: The model does a decent job, with an accuracy score of around 90%, a result supported by the confusion matrix and classification report.
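The three checks mentioned above are available in `sklearn.metrics`. The sketch below applies them to a small set of hand-written labels, which are hypothetical and unrelated to the article's results, so the accuracy here is not the 90% reported for the real model.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical true labels and predictions for 10 test rows (1 = clicked).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# Fraction of rows where the prediction matches the true label.
accuracy = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives], [false negatives, true positives]]
cm = confusion_matrix(y_true, y_pred)

print("Accuracy:", accuracy)
print(cm)
# Per-class precision, recall and F1 score.
print(classification_report(y_true, y_pred))
```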

You can find the code here.

Step 7: Final Step

Finally, we have trained a logistic regression model and calculated the click probabilities for the ads. In practice, we would compare models using log loss rather than accuracy, since what we need here are the actual probabilities and not just the predicted labels.
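To illustrate why log loss is the more informative metric here, the sketch below computes binary log loss by hand for two hypothetical sets of predictions. Both would reach 100% accuracy at a 0.5 threshold, yet log loss rewards the one whose probabilities are closer to the true outcomes. The helper `binary_log_loss` is written for this example; scikit-learn provides `sklearn.metrics.log_loss` for the same purpose.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Average negative log-likelihood of the true labels (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so log(0) can never occur.
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


# Hypothetical labels and predicted click probabilities from two models.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
hedging = [0.6, 0.4, 0.6, 0.4]

print("confident:", round(binary_log_loss(y_true, confident), 4))
print("hedging:  ", round(binary_log_loss(y_true, hedging), 4))
```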

Once we have the probabilities, we multiply them by the bid amount given by each ad representative. The resulting products are then sorted in descending order, and the top 3 are chosen to be displayed as ads by the search engine.

This is the concept of ad click prediction, and how companies maximize their revenue while at the same time offering adequate results.
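The selection step described above can be sketched in a few lines. The ads, probabilities, and bids are invented for illustration, and `top_ads` is a hypothetical helper name.

```python
# Hypothetical candidate ads with a predicted click probability and a bid.
ads = [
    {"ad": "A", "p_click": 0.10, "bid": 2.00},
    {"ad": "B", "p_click": 0.05, "bid": 5.00},
    {"ad": "C", "p_click": 0.20, "bid": 0.50},
    {"ad": "D", "p_click": 0.02, "bid": 4.00},
    {"ad": "E", "p_click": 0.08, "bid": 3.00},
]


def top_ads(candidates, k=3):
    """Score each ad by p_click * bid and return the k highest-scoring ads."""
    scored = [
        dict(ad, expected_revenue=ad["p_click"] * ad["bid"]) for ad in candidates
    ]
    return sorted(scored, key=lambda a: a["expected_revenue"], reverse=True)[:k]


for ad in top_ads(ads):
    print(ad["ad"], round(ad["expected_revenue"], 2))
```

Note that ad C has the highest click probability but does not make the top 3, because its low bid gives it a small expected revenue.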

