Q#15: Animal weights

Suppose you’re given the below tables, containing animal weights, as well as a directory of zoos:
Table name: animal_weights

Table name: zoo_directory

Given the above, write a SQL query to return the second heaviest animal in the table from each zoo. In addition to the zoo_ID, return the zoo_name from the zoo_directory table.
Click here to view these tables in an interactive SQL fiddle.

-Credit: erik@interviewqs.com

TRY IT YOURSELF

ANSWER:

This is a rather hard SQL query and requires your understanding of table joins and cleverness in understanding SQL table manipulation. For this question we are going to use MySQL, which has minor syntax differences from the standard SQL, but none that affects this problem.

Recall in SQL, the key statements are SELECT (to query columns) FROM (to give reference to table) and WHERE (to put conditions). Since the goal is to return zoo names as well we will need a JOIN (as the name suggests) to combine two tables.

First let’s just focus on querying the correct data we want, so let’s use the SELECT statement with the proper table schema (very important, even if it creates messier looking code, its general good practice to attach the table name when querying across joins) ie. Table_Name.Column_Name.

SELECT animal_weights.weight_lbs, zoo_directory.zoo_name
FROM animal_weights

Next, we need to JOIN the two tables together on a mutually shared column. Recall there are four types of join:

Here we want a RIGHT JOIN because we selected from the animal_weights table first and the zoo_directory table is smaller but we want its information to distribute. A join statement is always the table you want to join followed by ON then the shared key (again best practice is to use the aforementioned TN.CN nomenclature).

SELECT animal_weights.weight_lbs, zoo_directory.zoo_name
FROM animal_weights
RIGHT JOIN zoo_directory ON animal_weights.zoo_ID=zoo_directory.zoo_ID

Now we have a query that returns to us a table with two columns the animal_weights and the zoo_name. Next is the challenging part, where we need to obtain the second highest weight of animals in each zoo. To do this we are going to use the WHERE clause on a nested query containing a GROUP BY clause and the built-in SQL MAX function twice.

SELECT MAX(animal_weights.weight_lbs), zoo_directory.zoo_name
FROM animal_weights
RIGHT JOIN zoo_directory ON animal_weights.zoo_ID=zoo_directory.zoo_ID
WHERE weight_lbs NOT IN
(
SELECT MAX(weight_lbs)
FROM animal_weights GROUP BY zoo_ID
)

There is a lot to unpack here. As you may have guessed the MAX function returns the maximum value of that column element. The basic strategy here was to query the max values for weight after excluding the max values for weight thereby obtaining the second highest values. The WHERE clause is limiting weight values to be those not in max weight values per group of zoo_ids (hence the GROUP BY statement, if you run the query in parenthesis separately you can see exactly what is happening).
Note: In the parenthesis I do not attach table names as the sub-query is coming from one table only.

Finally, this query is still not done, we are missing some finishing touches including an alias name for good practice purposes and a final GROUP BY clause to make it per zoo_name.

SELECT MAX(animal_weights.weight_lbs) AS Second_Highest_Weight, zoo_directory.zoo_name
FROM animal_weights
RIGHT JOIN zoo_directory ON animal_weights.zoo_ID=zoo_directory.zoo_ID
WHERE weight_lbs NOT IN
(
SELECT MAX(weight_lbs)
FROM animal_weights GROUP BY zoo_ID
)
GROUP BY zoo_directory.zoo_name

--

--