Q#32: Fraudulent Retail Accounts

Below is a daily table for an active account at Shopify (an online e-commerce retail platform). The table is called store_account and the columns are:

Here’s some additional information about the table:

  • The granularity of the table is store_id and day
  • Assume “closed” and “fraud” are permanent labels
  • Active = daily revenue > 0
  • Once Shopify labels an account as fraudulent, it can no longer sell products
  • Every day of the table has every store_id that has ever used Shopify

Given the above, write code using Python (Pandas library) to show what percent of active stores were fraudulent by day.
-Credit to:
team@interviewqs.com

TRY IT YOURSELF

ANSWER

This question tests our ability to use the Pandas library in Python. Pandas is a library that allows us to organize and interact with data and tables in a dataframe format, which is quite similar to a table with column and row indices.

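To answer this question, let's break it down into chunks. First, let's count the active stores on each day. We can use boolean indexing with [] to keep only the rows that meet a condition, then group by day and use the .count() method to count the remaining rows for each day. Here active stores are those with daily revenue > 0, and our dataframe is called store_account. (The day and store_id column names are assumed from the table's stated granularity.)

# Keep only rows where the store had revenue that day
active_stores = store_account[store_account['revenue'] > 0]
# Number of active stores per day
active_stores_count = active_stores.groupby('day')['store_id'].count()

Next, let's obtain the number of fraudulent stores within the active_stores dataframe, again using boolean indexing and a per-day count. Finally, let's divide by active_stores_count and multiply by 100 to get the percentage for each day.

# Keep only active rows labeled as fraud
fraud_stores = active_stores[active_stores['status'] == 'fraud']
fraud_stores_count = fraud_stores.groupby('day')['store_id'].count()
# Days with no fraudulent active stores are missing from fraud_stores_count, so fill them with 0
fraud_stores_count.reindex(active_stores_count.index, fill_value=0) / active_stores_count * 100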
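
The same calculation can also be condensed into a single grouped step. The sketch below is a minimal, self-contained example rather than a definitive solution: the toy rows are invented for illustration, and the column names day and store_id are assumptions taken from the table's stated granularity (revenue and status come from the code above).

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions (see the note above).
store_account = pd.DataFrame({
    "store_id": [1, 2, 3, 1, 2, 3],
    "day": ["2021-01-01"] * 3 + ["2021-01-02"] * 3,
    "status": ["open", "fraud", "open", "open", "fraud", "open"],
    "revenue": [100.0, 50.0, 0.0, 80.0, 0.0, 30.0],
})

# Active = daily revenue > 0
active = store_account[store_account["revenue"] > 0]

# The mean of a boolean column is the share of True values, so the per-day
# mean of (status == "fraud") is the fraudulent share of active stores.
pct_fraud_by_day = (
    active["status"].eq("fraud").groupby(active["day"]).mean() * 100
)
print(pct_fraud_by_day)
```

With this approach, a day on which no store had revenue does not appear in the result at all, since there are no active stores to divide by.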


Data Science tutorial working through solutions to Data Science Interview Questions

Abish Pius

Data Science Professional who teaches with unflashy, simple to understand python code.
