7 MySQL statements to improve your queries

Emília Galdino de Oliveira
8 min readOct 3, 2022
Photo by Alejandro Castillo López on Unsplash

This past week I’ve been trying to review some theory from SQL, and I picked some of the most important statements to know if you want to improve your queries. We start with a quick story time and then you are presented to seven statements from the most basic to the most challenging one.

Story Time

What ended up being known worldwide as the Structured Query Language started as an attempt to adapt the relational model, proposed by Edgar F. Codd, to retrieve data from the IBM database management system. Back then the relational model was an abstraction that wasn’t being used in any particular tool. The first version of the language was named SQUARE, acronym for Specifying Queries in A Relational Environment, and the second version of this language was named SEQUEL. If this sounded familiar, you’ve guessed right, the reason why we spell SQL as “sequel” and not as “s-q-l”, is to reference its former version. Actually the name SEQUEL was already being used by another company, and so they decided to keep only the consonants from the SEQUEL, and only after some time the Structured Query Language acronym was attributed to it.

Basics

Let’s start with the classic SELECT FROM WHERE. The fundamental SQL statement serves as the base to the more complex ones that are actually built on top of this very simple yet effective structure. So let’s dive deeper into each one of its parts. The traditional query starts with SELECT, followed by a list of table columns that will be used to compute the query; the FROM defines the table where the information will be retrieved from, and the WHERE defines a condition that will filter rows from the table.

A sample query that selects all samples of data from Brazil can be represented by:

SELECT country_name, population, gdp
FROM countries
WHERE country_name = ‘Brazil’;

Different from most programming languages when matching a string to a column the operator used is = instead of ==. And even though SQL is not case sensitive, it is a good practice to capitalize the clauses, to improve legibility.

1. SELECT DISTINCT — Unique values

In some cases a table can contain repeated data entries, and when we need to know the unique set of values of that type of entry we can use the SELECT DISTINCT statement. In this case we are dealing with sales data, and each row represents a single product that has been purchased. If we use the SELECT statement the query will return a list with duplicate entries because each product can appear more than once in the rows. In order to filter only the unique values, we add the DISTINCT to the SELECT clause.

SELECT DISTINCT product_name
FROM sales_data;

2. ORDER BY — Order values

This clause is often used when we are dealing with problems where the order of the values is an important part of the data. For example, if we have a table of sales, where each row represents a different franchise of a restaurant and we want to query the top 5 restaurants from this chain with the highest profits in September. To query this result we’ll have to filter the values by month, and also order the results by sorting them by their profits.

SELECT franchise_name, profit, month
FROM franchise_sales
WHERE month = ‘September’
ORDER BY profit DESC
LIMIT 5;

The ORDER BY will sort the values using their profit, and the DESC makes this sorting in descending order. In order to get only the 5 high profit restaurants we can use the LIMIT to literally limit the amount of rows that will be returned by the query.

3. Nested queries

A nested query is exactly what it looks like: a query inside another query. Traditionally a query follows the SELECT FROM WHERE clause, but sometimes we want to select information that is not already available in the table. In order to do so we’ll have to perform another query before using its resulting table in the main query. Let me show you an example: imagine we have a table with the grades from students from different courses to the calculus class. As columns we have: name, age, sex, grade and course. If we want to get the name of the Computer Science student with the higher grade among other Computer Science students we can do it with the following nested query

SELECT name
FROM calculus_grades
WHERE grade = (SELECT MAX(grade) FROM calculus_grades WHERE course= ‘Computer Science’);

In this case the inner query inside the WHERE clause will search the higher grade using the MAX function from all of the CS students in the table. This result is then used in the outer query that will only return the name of the CS student with the higher score.

A quick disclaimer about nested queries, is that if we have a subquery that will be referenced many times in other queries, to save processing time we can make a view out of the query. You can learn more referring to the MySQL view documentation.

4. REPLACE

The replace clause is used to replace characters in a string. Now, imagine we are running a local business and due to a change in our company we want to update our brand name. But we already have a ton of data that was stored under our former company name, and we want to update them. If we were to use only a string operator to do so we could use the replace operator to perform that.

UPDATE table
SET brand_name = REPLACE(brand_name, ‘old name’, ‘new name’)

In the above clause we are using the UPDATE clause to our table to update values from the brand_name column. The REPLACE function has three parameters, being the column_name, the expression to be replaced, and the replacement.

5. CASE WHEN — A switch query

The CASE WHEN structure is useful when we want our output to be adjusted to a set of conditions. Continuing on our calculus grades table, if we want to convert the 0 -10 grades to concepts, we could use a CASE WHEN structure to do so.

SELECT grade
CASE
WHEN grade >= 0 and grade < 4 THEN ‘E’
WHEN grade >= 4 and grade < 6 THEN ‘C’
WHEN grade >= 6 and grade < 8 THEN ‘B’
ELSE ‘A’
END AS concept
FROM calculus_grades

The CASE WHEN is similar to a set of if else conditions, but in this case we always have to specify the last condition as an ELSE statement, and the other ones as WHEN (condition) THEN (result). In the query above we are creating a new column named concept whose values will be computed inside the CASE WHEN conditions.

6. String Matching

If you want to filter some field from the table by a specific string match, when using MySQL you can use the REGEXP clause. Its structure follows some rules that define the way the strings are matched. Some of its most common operators are described as follows:

*- zero or more characters
+ — one or more characters
. — any character
? — zero or one instances of the string preceding it
^ — beginning of the string
$ — end of the string

If you want to learn more about regular expression patterns I suggest this post that was also my reference to the patterns I mentioned above.

Now it’s time for an example, imagine you are dealing with the same old calculus table from the other examples. If you are the calculus professor and want to know how many students with names starting with ‘Em’ are in your classes, you could use a regular expression to filter them like so.

SELECT COUNT(*)
FROM students_table
WHERE name REGEXP ‘^Em’;

7. JOIN

If your information happened to be splitted into different tables, maybe you should consider combining them before making a query. In that case you can use the JOIN clause in order to combine different tables according to a common column that will be the key to the operation. Imagine we have two tables, one representing students data and another one with the calculus class data. The tables are as follows:

class_data : student_id, student_grade, semester, year

student_data: student_name, student_id, age, admission_year, course

Say we want to list the names of the students that started the class this semester. We would have to retrieve the students’ names and id from the student_data table, and the semester from the class_data table.

When it comes to joining tables, it comes in four different flavors: INNER JOIN, LEFT JOIN,RIGHT JOIN, and FULL JOIN.

INNER JOIN: Only the data that are referenced in both tables is joined, the data that misses a reference in either table is not part of the join table.

LEFT JOIN: Data from the left table is preserved and only matching data from the right table is kept. If a record from the left table does not have data from the other table its values will remain as null.

RIGHT JOIN: Left join applied in the reversed order to the tables.

FULL JOIN: All data from both tables are kept, with null replacing the data that doesn’t have matching records from the other table.

Back to our calculus class example, we could use a INNER JOIN since we are only interested in the students that are assigned to the calculus class:

SELECT students.student_id, students.student_name, class.student_id, class.semester
FROM students
INNER JOIN students.student_id = class.student_id
WHERE class.semester = 2;

Learn

W3 Schools

A broader guide towards the SQL statements, along with exercises. I would only remind that pure SQL statements are usually adapted as they are implemented by different dialects, so if you are trying to learn it given an application specific context, you should try to learn the concepts directly from the dialect documentation (i.e. if you are using MySQL at work, you should refer to https://dev.mysql.com/doc/)

Select SQL Star

If you are looking for a quick walkthrough of the most used SQL statements and concepts, the Select SQL Star is the perfect starting point. Even if you already know some SQL, it is useful to review some topics and also try to answer the queries on the website.

Practice

HackerRank

A good website to train your SQL knowledge with exercises divided by level of difficulty. The tool provides the output of your query and also performs some tests to assess if your code is correct. There are also other types of problems, not only for practicing database queries, but also problem solving in general.

Leetcode

Similar to HackerRank, but I personally find the queries on this one to be more challenging, even the easy ones. So if you are up to a challenge, you should definitely try this one.

--

--