Pandas Practice Series: Marketing Campaign Success

Level: Hard | Company: Amazon

Rohit Arora
CodeX
4 min read · Sep 4, 2022


picture by Lukas Blaze from Unsplash

Problem Description:

You have a table of in-app purchases by user. Users that make their first in-app purchase are placed in a marketing campaign where they see call-to-actions for more in-app purchases. Find the number of users that made additional in-app purchases due to the marketing campaign's success.

The marketing campaign doesn’t start until one day after the initial in-app purchase, so users that only made one or multiple purchases on the first day do not count, nor do we count users that, over time, purchase only the products they purchased on the first day.

The question is taken from Stratascratch, a site that collects interview questions and organizes them by difficulty level and by the company that asked them.

Data Preview:
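The original preview is not reproduced here, so the rows below are a made-up stand-in rather than the actual Stratascratch data. Only the three columns this article relies on (‘user_id’, ‘created_at’, ‘product_id’) are included; the real table may carry more.

```python
import pandas as pd

# Hypothetical rows for illustration only; not the real dataset.
df = pd.DataFrame({
    "user_id": [10, 10, 11, 12],
    "created_at": pd.to_datetime(
        ["2019-01-01", "2019-01-02", "2019-01-01", "2019-01-01"]
    ),
    "product_id": [101, 119, 105, 110],
})

print(df.head())
```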

Output:

Analogical vs. First-principles Approach:

Easy questions can be solved using brute force or analogical thinking. But when a question is as elaborate and complex as this one, we need a proper first-principles approach, where we break the question into parts and solve it step by step.

It is difficult even to portray what an analogical approach might look like: you proceed by trial and error until you reach the desired output, and all your actions are based on intuition and inference. That works only while the problem is easy; as the difficulty increases, the intuition starts to fall apart. The steps you carried out become hard to retrace, and the solution (if you reach one) is more of a black box.

On the other hand, when you approach the problem with first-principles thinking, you break it into fundamental pieces, solve each piece one by one, and then put them all together. This leads to a solution that is highly explainable and easy to understand.

Solution

Let us tackle this problem in a step-by-step manner:

The idea is first to filter the users who meet the criteria for the campaign and then to check which of the products they bought meet the criteria.

  • The campaign criteria depend on the number of unique purchase dates and unique product ids for each user. So we group by user id and count the unique values of both the dates and the product ids.
  • The question states that the campaign starts the day after the first purchase and that repeat purchases of first-day products do not count. So we keep only the users with more than one unique purchase date and more than one unique product id.
  • Now that we have the users who fit the criteria in a separate data frame, let us create a few columns in the original data frame for convenience.
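The first two steps above can be sketched roughly as follows. The rows are made up, and the names `counts`, `eligible`, and `eligible_users` are illustrative assumptions rather than the names used in the original code.

```python
import pandas as pd

# Made-up rows; column names follow the ones mentioned in the article.
df = pd.DataFrame({
    "user_id": [10, 10, 11, 11, 12, 12, 13],
    "created_at": pd.to_datetime([
        "2019-01-01", "2019-01-02",   # user 10: two dates, two products
        "2019-01-01", "2019-01-01",   # user 11: one date, two products
        "2019-01-01", "2019-01-05",   # user 12: two dates, one product
        "2019-01-01",                 # user 13: a single purchase
    ]),
    "product_id": [101, 119, 105, 120, 110, 110, 105],
})

# Step 1: count unique purchase dates and unique products per user.
counts = df.groupby("user_id")[["created_at", "product_id"]].nunique()

# Step 2: keep users with more than one unique date AND more than one product.
eligible = counts[(counts["created_at"] > 1) & (counts["product_id"] > 1)]
eligible_users = eligible.index.tolist()

print(eligible_users)  # [10]
```

Only user 10 survives here: user 11 bought everything on one day, user 12 bought a single product twice, and user 13 bought once.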

The ‘user_product’ column joins the values of two columns, ‘user_id’ and ‘product_id’, so that we do not have to write a new condition every time we want the product id for a particular user (the same product_id can appear under several users).
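One way to build such a key is plain string concatenation. This is a minimal sketch; the "_" separator is an assumption added here so that pairs like (1, 23) and (12, 3) cannot produce the same key.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [10, 10, 11],
    "product_id": [101, 119, 101],
})

# Cast both ids to strings and join them with a separator (assumed: "_").
df["user_product"] = (
    df["user_id"].astype(str) + "_" + df["product_id"].astype(str)
)

print(df["user_product"].tolist())  # ['10_101', '10_119', '11_101']
```

Product 101 now yields two distinct keys, one per user, which is what the later filter relies on.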

The ‘created_at’ column is converted to a date format that the transform function in the next step can process easily.
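The exact conversion used in the original is not shown, so the following is only one plausible version. It assumes the column arrives as strings that may carry a time of day, and uses `pd.to_datetime` followed by `dt.normalize()` to keep the date and reset the time to midnight.

```python
import pandas as pd

# Assumed input: timestamps stored as strings (hypothetical values).
df = pd.DataFrame({
    "created_at": [
        "2019-01-01 09:30:00",
        "2019-01-01 18:45:00",
        "2019-01-02 08:00:00",
    ]
})

# Parse to datetime64, then drop the time-of-day so that two purchases
# on the same calendar day compare as equal.
df["created_at"] = pd.to_datetime(df["created_at"]).dt.normalize()

print(df["created_at"].nunique())  # 2
```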

The transform function is used in conjunction with the groupby function, but it differs from aggregations such as count, min, or sum. The difference lies in the size of the output. An aggregation returns one value per group, so its output is smaller than the input, whereas transform returns an output the same size as the input, with each group's result repeated on every row of that group.
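The size difference is easy to see on a tiny made-up frame. The names `per_group` and `per_row` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [10, 10, 11],
    "created_at": pd.to_datetime(["2019-01-03", "2019-01-01", "2019-01-02"]),
})

grouped = df.groupby("user_id")["created_at"]

per_group = grouped.min()           # aggregation: one row per user
per_row = grouped.transform("min")  # transform: one row per input row

print(len(per_group), len(per_row))  # 2 3
```

Because `per_row` shares the index of `df`, it can be assigned straight back as a new column without a merge.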

The question specifies that products purchased on a user's first purchase date, before the campaign starts, should not be counted. The ‘first_order’ column therefore holds the minimum purchase date for every user.

  • Now that all our columns are in place, we just need a simple condition that keeps the users who fulfill the criteria and, at the same time, drops for every user the product ids they bought before the campaign started.
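Putting the pieces together, the whole approach might look like the sketch below. It is a reconstruction from the steps described in this post, run on made-up rows, and not necessarily identical to the original code; `first_day_pairs`, `mask`, and `result` are names chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [10, 10, 10, 11, 11, 12, 12, 14, 14, 14],
    "created_at": pd.to_datetime([
        "2019-01-01", "2019-01-02", "2019-01-03",  # user 10: new product on day 2
        "2019-01-01", "2019-01-01",                # user 11: first day only
        "2019-01-01", "2019-01-05",                # user 12: same product again
        "2019-01-01", "2019-01-01", "2019-01-04",  # user 14: rebuys a first-day product
    ]),
    "product_id": [101, 119, 101, 105, 120, 110, 110, 101, 102, 101],
})

# Users with more than one unique date and more than one unique product.
counts = df.groupby("user_id")[["created_at", "product_id"]].nunique()
eligible_users = counts[
    (counts["created_at"] > 1) & (counts["product_id"] > 1)
].index

# Helper columns described in the article.
df["user_product"] = df["user_id"].astype(str) + "_" + df["product_id"].astype(str)
df["first_order"] = df.groupby("user_id")["created_at"].transform("min")

# User/product pairs bought on each user's first day.
first_day_pairs = df.loc[df["created_at"] == df["first_order"], "user_product"]

# Keep eligible users' purchases of products NOT bought on their first day.
mask = df["user_id"].isin(eligible_users) & ~df["user_product"].isin(first_day_pairs)
result = df.loc[mask, "user_id"].nunique()

print(result)  # 1
```

User 14 shows why the second half of the condition matters: they pass the unique-count filter (two dates, two products) but only rebuy a first-day product later, so they are not counted.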

Conclusion

As you can see for yourself, the problem makes much more sense and seems much more solvable when it is broken down into fundamental pieces and then solved bottom-up. Breaking a problem into smaller pieces does not come easily, though; one must practice a lot to master the art of first-principles thinking. So stay tuned for more such problem-solving posts!

And if you are interested in getting your daily dose of ML, DL, and Python content in bite-sized chunks so that you can upgrade your knowledge even in your free time, check out my Twitter account👇
