SQL Insider- Case I- How to solve the SQL Murder Mystery using only two queries!

If you’re already familiar with SQL or just dipping your toes into the world of database querying, you’ve probably heard of the notorious SQL Murder Mystery set in the adventurous SQL City. But what if there’s a clever way to crack the case using just two queries? Follow along till the end to discover the secrets behind the SQL City’s most baffling crime!

Ruchita Shah
7 min readJul 23, 2023
Uncovering the mysteries of SQL City!

Warming Up

Like glass is to Sherlock Holmes, SQL is to a Data Scientist! 🕵🏻

SQL (Structured Query Language) is a powerful query language that allows a standardized way to interact with databases, allowing creation, retrieval, updating and deletion (CRUD) of data efficiently. Data analysts and data scientists heavily rely on SQL for their day-to-day tasks due to its ability to extract valuable insights at a glance from large datasets. It is vastly used to perform queries of filter, aggregation and joins to uncover complex business questions and gain deeper understanding from the available information.

Before the development of SQL (Structured Query Language) in the 1970s, there were various other query languages used to interact with databases, such as QUEL (Query Language), DML (Data Manipulation Language), QBE (Query by Example), ISBL (Information Systems and Business Language) etc. While these early query languages were used for specific database systems, SQL emerged as a standard language that could be used across different RDBMS, leading to its widespread adoption and becoming the reliable go-to for interacting with structured databases.

Getting Started

Well, now that we have a hold on what exactly is SQL and how it emerged to be one of the must-have skills for Data Analysis, let’s get our detective hats on to uncover the mystery behind the SQL City’s most baffling crime!

The SQL Murder Mystery

My advice, try it for yourself before going ahead here, to relish the thrill that comes with learning by doing! Don’t worry if you stumble, you can always come here and find your way back!

Sleuthing Begins

If you’re just stepping into the realm of SQL, going through the Beginner’s Walkthrough on their website will be immensely helpful in building foundation of core concepts like select *, where clause, wild card etc.

Remember: Before delving too deeply into finding a solution, it’s always wiser to start by asking questions to identify the actual problem we seek to solve. This not only brings clarity to our objectives but also conserves valuable resources like time and energy by minimizing trial and error.

1. What’s the case? (Problem Statement)
A crime has taken place and the detective needs your help. The detective gave you the crime scene report, but you somehow lost it. You vaguely remember that the crime was a ​murder​ that occurred sometime on ​Jan.15, 2018​ and that it took place in ​SQL City​.

2. What are the places we should be looking at? (Data Points and sources available)

Schema Diagram (ERD)

3. How do we get started? (Determine the first step)
Start by retrieving the corresponding crime scene report from the police department’s database. Also, study the table’s contents at a glance using this query, to have an idea of what we’re starting with: SELECT * FROM sqlite_master where type = ‘table

4. How do we know we solved the case? (Determine what success looks like) Identify the murderer’s name and run the following query to find out the results:

Check for success!

Querying

Now that we know enough about the case, let’s start our sleuthing and dig up further information.

Let’s begin with the longer route to get to our destination, because this allows room for exploration and expands the scope of our learning.
At the end of this article, you’ll find out how to reach the killer in just two queries, so let’s get started.

We will structure our search in the following order, so it’s easy to follow the leads.
Process: Think about what we have and what we need
Query: SQLite query to get the results
Output: Result of the query.

Lead 1
Process:
As we know from our preliminary questionnaire, we need to start by looking for the crime scene record in the police department’s database for a murder that took place in he SQL city at 15th Jan 2018.

Query: select * from crime_scene_report where type = ‘murder’ and date = ‘20180115’ and Lower(city) like ‘sql city’;

Output:

Lead 1: The Two Witnesses

Lead 2
Process:
From the previous query, we now know about two witnesses to this case:
a. Someone from Northwestern Dr (last apartment)
b. Annabel from Franklin Ave
Let’s find out more about them

Query: select * from person where address_street_name = ‘Franklin Ave’ and name like ‘Annabel%’ or (address_number = (select max(address_number) from person where address_street_name = ‘Northwestern Dr’))

Output:

Lead 2: Details of the witnesses.

Lead 3
Process:
Now that we know their ‘person_id’ (which is the foreign key to most of the tables except drivers_license), let’s see what they had to say in their interviews.

Query: select a.person_id, b.name ,a.transcript from interview a left join person b on a.person_id = b.id where a.person_id = ‘16371’ or a.person_id = ‘14887’;

Output:

Lead 3: Witnesses’ Interview Statements

Lead 4
Process:
We know enough information to identify our suspects for this case. Let’s dive into the gym’s membership data to list out our suspects.

Query: select * from get_fit_now_member where id like ‘48Z%’ and membership_status = ‘gold’;

Output:

Lead 4: Suspects Identification

Lead 5
Process:
Since we’ve now identified all those who fall under suspicion, let’s see who from these have checked in to use the gym on Jan 9 2018, just like Annabel (witness 2) mentioned in her interview. We will identify our suspects using their ‘membership_id’.

Query: select * from get_fit_now_check_in where check_in_date = ‘20180109’ and (membership_id = ‘48Z79’ or membership_id = ‘48Z55’);

Output:

Lead 5: Cross checking with Witness 2’s testimony.

Lead 6
Process:
Now that our suspicions have been confirmed, let’s verify our results with Morty’s (witness 1) statement too. Here, we will use the drivers_license information, to validate the results we acheived.

Query: select a.id, a.name, a.license_id, b.plate_number from person a left join drivers_license b on a.license_id = b.id where a.id = ‘67318’

Output:

Lead 6: Name the killer.

Lead 7
Process:
Looks like we’ve reached our goal! But have we yet? Let’s see what our ‘check_solution’ query has to say about this

Result 1: Jeremy Bowers

Challenge Accepted! Let’s hear out what Jeremy has to say.

Query: select a.person_id, b.name ,a.transcript from interview a left join person b on a.person_id = b.id where a.person_id = ‘67318’;

Output:

Lead 7: The real murderer

Final Lead
Process:
Thanks Jeremy, that was quite the description! Let’s see how can we identify the mastermind in just one query, followed by a verfication of our results. Finally, we’ll ask ‘check_solution’ if we’ve made it to the mastermind of the plan successfully.

Query:
(a) Mastermind Demographics
select distinct(a.id), a.name, d.height, d.hair_color, b.annual_income, d.plate_number, d.car_make, d.car_model from person a left join income b on a.ssn = b.ssn left join facebook_event_checkin c on a.id = c.person_id left join drivers_license d on a.license_id = d.id where c.date like (‘201712%’) and c.event_name like (‘SQL Symphony%’) and d.car_make = ‘Tesla’

Lead 8: Spot the mastermind

(b) Verify the result with Jeremy’s statement
select a.name, d.height, d.hair_color, b.annual_income, c.date, c.event_name, d.car_make, d.car_model from person a left join income b on a.ssn = b.ssn left join facebook_event_checkin c on a.id = c.person_id left join drivers_license d on a.license_id = d.id where c.date like (‘201712%’) and c.event_name like (‘SQL Symphony%’) and d.car_make = ‘Tesla’

Verify the mastermind with clues provided by Jeremy

The Solution, you ask?

Case Resolved!

Finally, the shortcut

Now that we’ve travelled the long and scenic route, let’s see how we reach there in 2 simple steps:

For the witnesses to have given their testimony, all of them they should have been together in the same place when the incident took place, i.e on the evening of January 15, 2018, perhaps attending the same event.

Short Route: The Event

Our witnesses were enjoying their evening at ‘The Funky Grooves Tour’. Now we just need to find out who else was accompanying them.

Short Route: The Murderer

Et Viola ! You have the name of the killer who carried the incident out. And well, now you know how the story ends…

Wrapping Up!

Since this was a hypothetical situation, we got lucky and found a shorter way to our desired outcome, but in the real-world, seldom that is the case.

It’s a recommended practice to start by asking the right questions, note the key findings, report the findings to the stakeholders and collect their feedback as early on as possible in analytical projects.

Analysis is the key between insights and decisions.

--

--