Pizza Runner

Parinita Chowdhary
Parinita Chowdhary
Published in
4 min readFeb 4, 2023

Data Cleaning and exploration using mySQL

In this post, I’m hosting my SQL solution of week 2 of Danny Ma’s 8 week SQL challenge, you can access it here. My solution is also hosted on my GitHub, it can be accessed here.

Data Cleaning

Before starting with analysis, I attempted to clean the values in the tables — customer_orders and runner_orders.

  • customer_orders — removing nulls, formatting order time as datetime field
cleaning customer_orders table
  • runner_orders — removing nulls, units(km/mins/minutes) from distance and duration
cleaning runner_orders table

A. Pizza Metrics

Question 1 — How many pizzas were ordered?

Question 2— How many unique customer orders were made?

Question 3 — How many successful orders were delivered by each runner?

Question 4— How many of each type of pizza was delivered?

Question 5 - How many Vegetarian and Meatlovers were ordered by each customer?

Question 6— What was the maximum number of pizzas delivered in a single order?

Question 7 — For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

Question 8— How many pizzas were delivered that had both exclusions and extras?

Question 9 — What was the total volume of pizzas ordered for each hour of the day

  • created recursive timeslots
  • joined customer_orders table to timeslots and used count to calculate orders in every hour

Question 10— What was the volume of orders for each day of the week?

  • followed similar approach as above question

B. Runner and Customer Experience

Question 1- How many runners signed up for each 1 week period? (i.e. week starts 2021–01–01)

Question 2 — What was the average time in minutes it took for each runner to arrive at the Pizza Runner HQ to pickup the order?

Question 3 — Is there any relationship between the number of pizzas and how long the order takes to prepare?

Question 4— What was the average distance travelled for each customer?

Question 5— What was the difference between the longest and shortest delivery times for all orders?

Question 6— What was the average speed for each runner for each delivery and do you notice any trend for these values?

Question 7— What is the successful delivery percentage for each runner?

C. Ingredient Optimisation

Question 1 — What are the standard ingredients for each pizza?

Question 2— What was the most commonly added extra?

Question 3 — What was the most common exclusion?

Question 4 — Generate an order item for each record in the customers_orders table in the format of one of the following:
Meat Lovers
Meat Lovers — Exclude Beef
Meat Lovers — Extra Bacon
Meat Lovers — Exclude Cheese, Bacon — Extra Mushroom, Peppers

Thank you for your time! There are a more questions in Section D about Pricing and Ratings, that I havent anwered. May be in the next post 😀!

--

--