Advanced Query Techniques in MySQL

Explore advanced query techniques to craft powerful, efficient SQL queries that tackle even your trickiest data challenges.

Auriga Aristo
Indonesian Developer
7 min read · May 4, 2024

Photo by Rubaitul Azad on Unsplash

Imagine we have extensive data in tables, and our boss asks us to manipulate it. How do we do that? What if the boss asks us to summarize some information? That may be simple enough. But what if the request is one that spreadsheets can’t handle? That’s where MySQL comes in.

MySQL is a popular open-source relational database system known for handling complex queries efficiently. It’s widely used for managing large volumes of data, and with the proper understanding, we can confidently manage even intricate data sets.

This article will equip you with practical tools to optimize your queries. We’ll demystify concepts that seem intimidating, like subqueries, and delve into the practical application of different join types and the magic of indexing.

If you’re ready to take your database skills to the next level, get ready! By the end, you’ll transform yourself into a MySQL maestro, solving data challenges with the confidence of a seasoned developer.

The Power of Subqueries

What is a subquery? A subquery is a query nested within another statement (SELECT, INSERT, UPDATE, or DELETE). It lets you build more complex expressions within your main query.

Imagine the scenario

You walk into a restaurant to order food. The main menu (your main query) represents the whole database table: a list of every dish the restaurant offers. But you have a dietary restriction (the filter supplied by a subquery); let’s say you’re allergic to nuts. You need a list of nut-free dishes, and to build it you consult a separate list the chef keeps of which dishes use which ingredients.

How the Subquery Helps

First, we will create a subquery that examines the ingredients list and finds all dishes that contain “nuts.”

SELECT dish_name
FROM ingredients_list
WHERE ingredient = 'nuts';

Then, we select the dish names from the main menu that do not appear in the subquery’s result.

SELECT dish_name
FROM main_menu
WHERE dish_name NOT IN (
    SELECT dish_name
    FROM ingredients_list
    WHERE ingredient = 'nuts'
);

Based on the SQL above, you will get a tailored menu listing only the safe dishes filtered based on the subquery results.
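
One caveat with NOT IN: if the subquery returns even a single NULL, the comparison can never evaluate to true, and the outer query returns no rows at all. A minimal sketch of a NULL-safe alternative using NOT EXISTS, assuming the same main_menu and ingredients_list tables and the ingredient column from the first query:

```sql
-- Keep each dish for which no 'nuts' row exists in ingredients_list.
-- Unlike NOT IN, this is unaffected by NULL dish_name values in ingredients_list.
SELECT m.dish_name
FROM main_menu AS m
WHERE NOT EXISTS (
    SELECT 1
    FROM ingredients_list AS i
    WHERE i.dish_name = m.dish_name
      AND i.ingredient = 'nuts'
);
```

If dish_name is declared NOT NULL in ingredients_list, both forms return the same rows, so the choice is mostly a matter of readability.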

Additional points to consider

  • Subqueries can be nested: Imagine having a subquery within another subquery, like a mini-menu within a mini-menu. While powerful, it’s best to keep things simple when starting.
  • Subqueries can be used with different SQL statements: They’re not limited to SELECT statements but can also be used with INSERT, UPDATE, and DELETE statements.
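
As a sketch of the second point, the same nut filter can drive a DELETE instead of a SELECT. This assumes the main_menu and ingredients_list tables from the restaurant example, with the ingredient column from the first query. Note that MySQL generally does not let a statement modify a table and select from that same table in a subquery, which is why the subquery here reads from a different table.

```sql
-- Remove every dish that contains nuts from the menu.
DELETE FROM main_menu
WHERE dish_name IN (
    SELECT dish_name
    FROM ingredients_list
    WHERE ingredient = 'nuts'
);
```

Running the inner SELECT on its own first is a cheap way to check which rows the DELETE would affect.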

Mastering Complex Joins

Joins are fundamental in working with relational databases, but they can get complex sometimes, especially when you move beyond the basics.

Imagine a school library

Think of a library card catalog as two separate tables in your database. One table holds information about students (student ID, name, grade level), and the other tracks book loans (student ID, book ID, loan date).

Inner Join

An inner join returns records with matching values in both tables. It only includes rows where the columns in both tables match.

Suppose you want to find books that are currently on loan. Joining the students table and the book loans table gives you a list of students with borrowed books and the corresponding loan records.

SELECT *
FROM student
INNER JOIN book_loans ON student.student_id = book_loans.student_id;

Left Join (or Left Outer Join)

A left join returns all records from the left table (the first table specified) and the matched records from the right table, if any. If there is no match, the result is NULL on the right side.

Suppose you want to see a list of all students and the books they have borrowed (if any). Joining the students table and the book loans table this way gives you a list of all students, including those who haven’t borrowed any books.

SELECT *
FROM student
LEFT JOIN book_loans ON student.student_id = book_loans.student_id;
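
A common follow-up is to keep only the unmatched rows. A minimal sketch, assuming the student and book_loans tables above and that the student’s name is stored in a column called name (the column name is an assumption): filtering for NULL on the right side leaves exactly the students who have no loan records.

```sql
SELECT s.student_id, s.name
FROM student AS s
LEFT JOIN book_loans AS bl ON s.student_id = bl.student_id
WHERE bl.student_id IS NULL;  -- no matching loan row was found
```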

Right Join (or Right Outer Join)

A right join is similar to a left join but returns all records from the right table and the matched records from the left table, if any. The result is NULL on the left side if there is no match.

Imagine you want to see every loan record together with the student who made it. Joining the students table and the book loans table this way gives you a list of all loans, including any whose student ID has no matching row in the students table (for example, because the student record was deleted).

SELECT *
FROM student
RIGHT JOIN book_loans ON student.student_id = book_loans.student_id;

Outer Join (or Full Outer Join)

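A full outer join returns all records from both tables, matching rows where it can. Where a row has no match, the columns from the other table are NULL.

Imagine you want to see a list of all students and all loan records, with information on who borrowed what (if applicable). Combining the students table and the book loans table this way gives you a comprehensive list: students without loans and loans without a matching student both appear. MySQL does not support the FULL OUTER JOIN syntax, so the usual workaround is to combine a left join and a right join with UNION:

SELECT * FROM student
LEFT JOIN book_loans ON student.student_id = book_loans.student_id
UNION
SELECT * FROM student
RIGHT JOIN book_loans ON student.student_id = book_loans.student_id;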

Self Join

A self-join is a join in which a table is joined with itself. It can be helpful when you have hierarchical data or need to compare rows within the same table.

Suppose you want to find students who are in the same grade level. Performing a self-join on the students table matches each student with every other student in the same grade.

SELECT a.*, b.*
FROM student AS a
INNER JOIN student AS b
  ON a.grade_level = b.grade_level
 AND a.student_id <> b.student_id;

Additional Tips

  • Visual aids like Venn diagrams can help understand how different joins work.
  • Practice writing basic join queries with sample datasets to solidify your understanding.
  • As you get more comfortable, explore multi-table joins that combine data from more tables for complex analyses.
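
To illustrate the last tip, here is a sketch of a three-table join. The books table and its book_id and title columns are hypothetical (the library example only describes students and loans), and the name column on student is assumed as well:

```sql
-- Who borrowed which title, most recent loans first.
SELECT s.name, b.title, bl.loan_date
FROM student AS s
INNER JOIN book_loans AS bl ON s.student_id = bl.student_id
INNER JOIN books AS b ON bl.book_id = b.book_id
ORDER BY bl.loan_date DESC;
```

Each additional table is linked through one more ON condition, so the loans table acts as the bridge between students and books.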

Indices: The Key to Speedy Queries

Imagine you are reading a nonfiction book. You will open the table of contents page when you want to find specific topics and relevant pages. Using it, you can search for anything without flipping through the entire book. Similarly, a database has an index that offers a fast lookup for data without having to scan the entire table.

In technical terms, an index is a separate data structure stored alongside your table. It holds a sorted copy of the values in one or more columns, along with pointers back to the corresponding rows in your main table.

Key Considerations

  • Columns frequently used in WHERE clauses: These are prime candidates for indexing.
  • Columns used for sorting: Indices can optimize ORDER BY operations.
  • Size of the index: Large indices consume more space and can slightly impact insertion speed.
  • Number of indices per table: Too many indices can slow down updates and inserts.

How to create

CREATE INDEX index_name ON table_name (column_to_index); 

Implementation

Let’s say you have a large employees table with a salary column. Without an index, a query like this might take a while:

SELECT * FROM employees WHERE salary > 75000;

With an index on the salary column, the same query can execute significantly faster, provided the condition is selective enough for the optimizer to prefer the index over a full table scan.
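
A minimal sketch of that step, assuming the employees table above; the index name idx_employees_salary is just an illustrative choice:

```sql
-- Build an index on the column used in the WHERE clause.
CREATE INDEX idx_employees_salary ON employees (salary);

-- The range condition can now be resolved through the index
-- instead of reading every row.
SELECT * FROM employees WHERE salary > 75000;

-- List the indexes defined on the table to confirm it was created.
SHOW INDEX FROM employees;
```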

Important Notes

  • Indices are not always beneficial: They slow down write operations (INSERT, UPDATE, DELETE) because each index must be updated alongside your table. Use them strategically for columns frequently used in search conditions and sorting.
  • The EXPLAIN statement: Most database systems have an EXPLAIN statement that can help analyze query plans and let you see if indices are being used effectively.

Additional Optimization Considerations

Use the EXPLAIN Statement

The EXPLAIN statement provides insights into how MySQL plans to execute your query. It reveals which indices (if any) will be used, how tables are joined, and the estimated cost of different steps.
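
As a sketch, here is EXPLAIN applied to the salary query from the indexing section (the employees table is taken from that example):

```sql
-- Show the plan without running the query.
EXPLAIN
SELECT * FROM employees WHERE salary > 75000;

-- MySQL 8.0.18 and later: run the query and report actual timings per step.
EXPLAIN ANALYZE
SELECT * FROM employees WHERE salary > 75000;
```

In the plain EXPLAIN output, the key column names the index MySQL chose (NULL if none), rows is the estimated number of rows to examine, and a type of ALL indicates a full table scan.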

Use Temporary Tables

Temporary tables can be valuable in your optimization toolbox but require strategic use. These tables are created on the fly to hold intermediate results, which can be helpful for breaking down complex queries into smaller and more manageable steps or storing pre-calculated results that are used multiple times in a query.

However, overusing temporary tables can lead to slower performance. In some cases, rewriting the query might be a better approach.
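
A minimal sketch of the idea, reusing the student and book_loans tables from the joins section. The temporary table name loans_per_student, the name column, and the threshold of 3 are illustrative assumptions:

```sql
-- Step 1: store an intermediate result.
-- A temporary table is visible only to the current session.
CREATE TEMPORARY TABLE loans_per_student AS
SELECT student_id, COUNT(*) AS loan_count
FROM book_loans
GROUP BY student_id;

-- Step 2: reuse the pre-calculated counts in a simpler query.
SELECT s.name, l.loan_count
FROM student AS s
INNER JOIN loans_per_student AS l ON s.student_id = l.student_id
WHERE l.loan_count >= 3;

-- Optional cleanup; MySQL also drops it when the session ends.
DROP TEMPORARY TABLE loans_per_student;
```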

Query Caching

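Older MySQL versions had a built-in query cache that stored the result sets of SELECT statements. It acted as a shortcut, letting the server return a cached result instead of re-executing an identical query. The query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so it is only an option on older servers.

Where it is available, query caching can improve performance for repetitive queries on rarely changing tables, but it requires careful management: any change to a table invalidates every cached result that uses it, which adds overhead on write-heavy workloads. On MySQL 8.0 and later, caching has to happen outside the server, for example in the application.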

Denormalization: A Calculated Trade-Off

Denormalization is a technique for strategically introducing controlled redundancy in your database schema. It might sound counterintuitive, but it can improve query performance. Denormalization involves duplicating certain data points across multiple tables to avoid complex joins.

While this approach can make queries faster, it comes at the cost of slightly more complex data updates (as you need to update the duplicated data in multiple places). Denormalization should be considered cautiously, weighing the performance benefits against the potential drawbacks for data integrity.
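
To make the trade-off concrete, here is a sketch that copies each student’s name into the book_loans table from the joins section. The student_name column, the VARCHAR(100) size, the name column on student, and the ID 42 are all illustrative assumptions:

```sql
-- Introduce the redundant column and fill it from the student table.
ALTER TABLE book_loans ADD COLUMN student_name VARCHAR(100);

UPDATE book_loans AS bl
INNER JOIN student AS s ON bl.student_id = s.student_id
SET bl.student_name = s.name;

-- Benefit: reads no longer need a join.
SELECT student_name, book_id, loan_date
FROM book_loans;

-- Cost: a name change must now be applied in both places.
UPDATE student SET name = 'New Name' WHERE student_id = 42;
UPDATE book_loans SET student_name = 'New Name' WHERE student_id = 42;
```

Wrapping the last two statements in a transaction is one way to keep the two copies from drifting apart.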

Upgrading Hardware

While optimizing your queries is essential, the underlying hardware also affects performance. Upgrading your database server’s RAM, storage, and CPU can significantly boost performance, especially for large datasets or complex queries.

However, it’s important to remember that optimizing your queries is often the most cost-effective way to achieve better performance. Start by optimizing your queries, and then consider hardware upgrades as a complementary strategy if needed.

Conclusion

Mastering advanced SQL query techniques is an ongoing journey. By understanding subqueries, complex joins, and strategic indexing, you’ve unlocked powerful tools for solving real-world data challenges.

This article provided a foundation for tackling complex SQL queries, but there’s always more to explore! Continue experimenting, referencing the MySQL documentation, and seeking examples to refine your skills.

Now armed with advanced querying techniques, you can approach data challenges with a greater sense of confidence. Whether you’re building insightful reports, optimizing application performance, or simply satisfying your curiosity about data patterns, you have the skills to extract meaningful insights from your MySQL databases.

Happy coding~~
