12 Tips for Optimizing SQL Queries for Faster Performance

Sarang S. Babu
Published in Learning SQL
8 min read · Mar 6, 2023

Ways to Optimize SQL Queries

Database applications have become a crucial component of many enterprises in today’s data-driven world. With so many companies choosing to process and store their data in the cloud, optimizing queries has become more important than ever for a company’s bottom line.

This article examines several effective techniques for improving SQL query performance.

1. Minimize the use of wildcard characters

Wildcard characters such as % and _ in LIKE patterns can slow down query performance. When a pattern begins with a wildcard (for example, '%son'), the database usually cannot use an index on the column and has to scan the entire table to find the relevant data. To optimize SQL queries, use wildcard characters only when necessary and avoid leading wildcards where possible.

Consider, for instance, a query to find all customers whose last_name_city value begins with the letter “P”. The following query uses a wildcard character to find all matching records:

SELECT * FROM customers WHERE last_name_city LIKE 'P%';

This query will work, but depending on the database and the column's collation, the optimizer may not be able to use an index for the LIKE pattern. With an index on the last_name_city column, the same condition can be written as an explicit range:

SELECT * FROM customers WHERE last_name_city >= 'P' AND last_name_city < 'Q';

This query can use the index on the last_name_city column. On databases that do not turn the LIKE pattern into an index range scan, it will be faster than the previous query.
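The difference between a range predicate and a leading wildcard can be inspected with the query plan. The following is a minimal sketch that assumes SQLite (through Python's built-in sqlite3 module) rather than the database used in this article; the sample rows, the first_name column, and the index name are made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name_city TEXT)"
)
conn.executemany(
    "INSERT INTO customers (first_name, last_name_city) VALUES (?, ?)",
    [("Ann", "Paris"), ("Bob", "Quito"), ("Cy", "Porto"), ("Di", "Oslo")],
)
conn.execute(
    "CREATE INDEX idx_customers_last_name_city ON customers (last_name_city)"
)


def query_plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Range predicate equivalent to "begins with P".
range_sql = (
    "SELECT * FROM customers "
    "WHERE last_name_city >= 'P' AND last_name_city < 'Q'"
)
# Leading wildcard: the index order cannot help here.
suffix_sql = "SELECT * FROM customers WHERE last_name_city LIKE '%o'"

range_plan = query_plan(range_sql)
suffix_plan = query_plan(suffix_sql)
range_rows = sorted(row[2] for row in conn.execute(range_sql))

print("range :", range_plan)
print("suffix:", suffix_plan)
print("rows  :", range_rows)
```

In SQLite the first plan reports a SEARCH using the index and the second a full SCAN of the table. Other databases present their plans differently, so treat the exact wording as SQLite-specific.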

2. Increase Query Performance with Indexes

SQL queries can be sped up by using indexes, which enable the database to quickly find entries that fit specific criteria. An index is a separate data structure that maps the values of one or more columns to the locations of the corresponding rows, which makes it easy to search for the rows that match a certain value or range of values.

To enhance SQL queries, you can create indexes on the columns that are frequently used in the WHERE, JOIN, and ORDER BY clauses. However, creating too many indexes can slow down data modification operations such as INSERT, UPDATE, and DELETE.

Consider the trade-offs between read performance and write performance when deciding which columns to index and which types of indexes to use.

Consider the following query, which finds all orders made by a specific customer:

SELECT * FROM orders WHERE customer_number = 2154;

Because the database must search the entire table for the entries that match the customer number, this query may take a long time if the orders table contains a lot of records. You can create an index on the customer_number column to improve the query:

CREATE INDEX idx_orders_customer_number ON orders (customer_number);

This creates an index on the customer_number column of the orders table. Now when you run the query, the database can quickly locate the rows that match the customer number using the index, which can improve query performance.
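To see the effect of the index, the query plan can be compared before and after creating it. This is a small sketch under the assumption that SQLite (via Python's sqlite3 module) is an acceptable stand-in; the order_total column and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite database with a simplified orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_number INTEGER, order_total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_number, order_total) VALUES (?, ?)",
    [(2154, 10.0), (2154, 25.5), (3001, 7.25)],
)

sql = "SELECT * FROM orders WHERE customer_number = 2154"


def plan_detail(statement):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + statement).fetchall()
    return " | ".join(row[3] for row in rows)


# Without an index, every row of the table has to be examined.
plan_before = plan_detail(sql)

conn.execute(
    "CREATE INDEX idx_orders_customer_number ON orders (customer_number)"
)

# With the index, matching rows can be looked up directly.
plan_after = plan_detail(sql)
matching_orders = conn.execute(sql).fetchall()

print("before:", plan_before)
print("after :", plan_after)
print("orders:", len(matching_orders))
```

The plan changes from a table SCAN to a SEARCH that names the new index, while the query itself and its results stay the same.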

3. Use appropriate data types

Using appropriate data types for columns in a database can significantly improve query performance. For example, using an integer data type for a column that contains numeric values can make queries run faster than using a text data type. Using the correct data type also ensures data integrity and can prevent data conversion errors.

Suppose we have a table in which each row holds the details of an order at a retail store. The table has columns for order ID, customer ID, order date, and order total.

The order total column contains numeric values. If the order total column is stored as a text data type, queries that perform calculations on the order total will be slower than if the column were stored as a numeric data type, because each value has to be converted before it can be used in a calculation.
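Beyond speed, a text column can also give wrong answers for numeric operations. The sketch below assumes SQLite through Python's sqlite3 module; the two table names (orders_text and orders_num) and the sample totals are hypothetical and exist only to contrast the two data types.

```python
import sqlite3

# Assumption: in-memory SQLite database; both tables hold the same totals.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders_text (order_id INTEGER PRIMARY KEY, order_total TEXT)"
)
conn.execute(
    "CREATE TABLE orders_num (order_id INTEGER PRIMARY KEY, order_total REAL)"
)

totals = [9.50, 100.00, 25.25]
conn.executemany(
    "INSERT INTO orders_text (order_total) VALUES (?)",
    [(f"{t:.2f}",) for t in totals],  # stored as strings such as '9.50'
)
conn.executemany(
    "INSERT INTO orders_num (order_total) VALUES (?)",
    [(t,) for t in totals],  # stored as floating-point numbers
)

# Text values are compared character by character, so '9.50' sorts last.
max_text = conn.execute(
    "SELECT MAX(order_total) FROM orders_text"
).fetchone()[0]

# Numeric values are compared by magnitude.
max_num = conn.execute(
    "SELECT MAX(order_total) FROM orders_num"
).fetchone()[0]

# Arithmetic on the text column needs an explicit conversion per row.
sum_from_text = conn.execute(
    "SELECT SUM(CAST(order_total AS REAL)) FROM orders_text"
).fetchone()[0]

print(max_text, max_num, sum_from_text)
```

The text column reports '9.50' as the largest total, while the numeric column correctly reports 100.0. The CAST in the last query is the per-row conversion work that a numeric column avoids.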

4. Avoid subqueries

Subqueries can slow down query performance, especially correlated subqueries in the WHERE or HAVING clauses, which some databases evaluate once for every outer row. Where a subquery turns out to be slow, consider replacing it with a JOIN or another technique.

For example, consider a query that finds all customers who have placed an order in the last 30 days. The following query uses a subquery to find the IDs of all customers with an order in the last 30 days:

SELECT * FROM customers WHERE customer_id IN (SELECT customer_id FROM orders WHERE order_date >= DATEADD(day, -30, GETDATE()));

This query will work, but on some databases it may be slower than a query that uses a JOIN to find the relevant data. The following query uses a JOIN to find all customers who have placed an order in the last 30 days:

SELECT DISTINCT c.* FROM customers c JOIN orders o ON c.customer_id = o.customer_id WHERE o.order_date >= DATEADD(day, -30, GETDATE());

This query joins the customers table with the orders table and retrieves all customer information for those who have placed an order in the last 30 days. The DISTINCT keyword is needed so that a customer with several recent orders appears only once. On databases that do not optimize the subquery well, this query may be faster than the previous one.
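Before swapping a subquery for a JOIN, it is worth confirming that both forms return the same rows. The following sketch assumes SQLite via Python's sqlite3 module, so date('now', '-30 days') replaces the DATEADD/GETDATE expression used above; the sample customers and orders are made up.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: in-memory SQLite database with ISO-formatted date strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)

recent = (date.today() - timedelta(days=5)).isoformat()
old = (date.today() - timedelta(days=90)).isoformat()
conn.executemany(
    "INSERT INTO orders (customer_id, order_date) VALUES (?, ?)",
    [(1, recent), (1, recent), (2, old)],  # customer 1 has two recent orders
)

cutoff = "date('now', '-30 days')"
in_sql = (
    "SELECT customer_id FROM customers WHERE customer_id IN "
    f"(SELECT customer_id FROM orders WHERE order_date >= {cutoff}) "
    "ORDER BY customer_id"
)
join_body = (
    "c.customer_id FROM customers c "
    "JOIN orders o ON c.customer_id = o.customer_id "
    f"WHERE o.order_date >= {cutoff} ORDER BY c.customer_id"
)

in_ids = [row[0] for row in conn.execute(in_sql)]
join_ids = [row[0] for row in conn.execute("SELECT DISTINCT " + join_body)]
# Without DISTINCT the JOIN yields one row per matching order.
join_ids_raw = [row[0] for row in conn.execute("SELECT " + join_body)]

print(in_ids, join_ids, join_ids_raw)
```

Both forms return customer 1 exactly once, while the JOIN without DISTINCT returns that customer twice, once per recent order.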

5. Use LIMIT or TOP to limit the number of rows returned

The LIMIT clause (or TOP in SQL Server) can be used to restrict the number of rows returned by a SQL query. As a result, there is less data to process and return.

For example, consider a query to find all customers who have placed an order in the last 27 days. If there are a large number of customers who have placed orders in the last 27 days, the query could return a large number of rows. This can be optimized using LIMIT or TOP. The following query limits the number of rows returned to 10:

SELECT TOP 10 * FROM customers WHERE customer_id IN (SELECT customer_id FROM orders WHERE order_date >= DATEADD(day, -27, GETDATE()));

This query returns only 10 of the rows that match the criteria, which can improve query performance. Without an ORDER BY clause, which 10 rows are returned is not guaranteed.
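As a quick illustration of row limiting, the sketch below assumes SQLite through Python's sqlite3 module, which uses LIMIT rather than SQL Server's TOP. The 100 generated customers are placeholder data, and an ORDER BY is included so that the 10 returned rows are deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite database with 100 placeholder customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"Customer{i}") for i in range(1, 101)],
)

# LIMIT caps the result set; ORDER BY makes the chosen rows predictable.
first_ten = conn.execute(
    "SELECT customer_id, last_name FROM customers "
    "ORDER BY customer_id LIMIT 10"
).fetchall()

first_ten_ids = [row[0] for row in first_ten]
print(first_ten_ids)
```

Only 10 rows cross from the database to the application, regardless of how many rows the table holds.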

6. Avoid using SELECT *

Using the SELECT * statement can slow down query performance because it returns all columns in a table, including those that are not needed for the query. To optimize SQL queries, it is important to only select the columns that are needed for the query.

For example, consider a query to find all customers who have placed an order in the last 30 days. The following query selects all columns from the customers table:

SELECT * FROM customers WHERE customer_id IN (SELECT customer_id FROM orders WHERE order_date >= DATEADD(day, -30, GETDATE()));

To optimize the query, the SELECT statement can be modified to only select the columns that are needed:

SELECT customer_id, first_name, last_name FROM customers WHERE customer_id IN (SELECT customer_id FROM orders WHERE order_date >= DATEADD(day, -30, GETDATE()));

This query will only select the customer ID, first name, and last name columns, which will improve query performance.
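One concrete benefit of naming columns is that the database may be able to answer the query from an index alone. The sketch below assumes SQLite via Python's sqlite3 module; the email column, the index name, and the sample rows are hypothetical, and other databases report such "covering" index use in their own plan formats.

```python
import sqlite3

# Assumption: in-memory SQLite database with a simplified customers table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, "
    "last_name TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO customers (first_name, last_name, email) VALUES (?, ?, ?)",
    [("Ana", "Perez", "ana@example.com"), ("Bo", "Quinn", "bo@example.com")],
)
conn.execute(
    "CREATE INDEX idx_customers_name ON customers (last_name, first_name)"
)


def plan_detail(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# SELECT * needs the email column, so the table itself must be read.
wide_plan = plan_detail(
    "SELECT * FROM customers WHERE last_name = 'Perez'"
)
# Every requested column is stored in the index, so the table is skipped.
narrow_plan = plan_detail(
    "SELECT customer_id, first_name, last_name "
    "FROM customers WHERE last_name = 'Perez'"
)

print("wide  :", wide_plan)
print("narrow:", narrow_plan)
```

In SQLite the narrow query's plan mentions a COVERING INDEX, meaning the rows are served from the index without touching the table, whereas the SELECT * version has to visit the table for the remaining column.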

7. Use EXISTS instead of IN

The IN operator compares a value with a list of values returned by a subquery. On some databases, IN can slow down query performance because the full result of the subquery is produced before the comparison takes place. In such cases, you can use the EXISTS operator instead, which can stop as soon as one matching row is found.

For example, consider a query to find all customers who have placed an order in the last 30 days:

SELECT * FROM customers WHERE customer_id IN (SELECT customer_id FROM orders WHERE order_date >= DATEADD(day, -30, GETDATE()));

This query uses IN to compare the customer ID with the list of customer IDs returned by the subquery. To optimize the query, you can use EXISTS instead of IN:

SELECT * FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id AND o.order_date >= DATEADD(day, -30, GETDATE()));

This query uses EXISTS to check whether a matching row exists in the orders table instead of using IN. Because EXISTS needs only one matching row per customer, this can improve query performance on databases that do not already optimize IN in the same way.
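The two forms are interchangeable here, which can be checked on a small data set before measuring either one on real data. This is a minimal sketch assuming SQLite via Python's sqlite3 module, with date('now', '-30 days') standing in for DATEADD/GETDATE; the rows are made up.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: in-memory SQLite database with ISO-formatted date strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?)", [(1,), (2,), (3,)])

recent = (date.today() - timedelta(days=3)).isoformat()
old = (date.today() - timedelta(days=120)).isoformat()
conn.executemany(
    "INSERT INTO orders (customer_id, order_date) VALUES (?, ?)",
    [(1, recent), (2, old), (3, recent), (3, recent)],
)

in_sql = (
    "SELECT customer_id FROM customers WHERE customer_id IN "
    "(SELECT customer_id FROM orders "
    " WHERE order_date >= date('now', '-30 days')) "
    "ORDER BY customer_id"
)
exists_sql = (
    "SELECT c.customer_id FROM customers c WHERE EXISTS "
    "(SELECT 1 FROM orders o "
    " WHERE o.customer_id = c.customer_id "
    " AND o.order_date >= date('now', '-30 days')) "
    "ORDER BY c.customer_id"
)

in_result = [row[0] for row in conn.execute(in_sql)]
exists_result = [row[0] for row in conn.execute(exists_sql)]
print(in_result, exists_result)
```

Both queries return customers 1 and 3. Which one runs faster depends on the database and its optimizer, so it is worth timing both against realistic data.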

8. Use GROUP BY to group data

The GROUP BY clause groups rows based on one or more columns. This can be useful for summarizing data or performing aggregate functions on groups of data. However, using GROUP BY can slow down query performance if it is used unnecessarily. To optimize SQL queries, you should only use GROUP BY when it is necessary.

For example, consider a query to find the total number of orders placed by each customer:

SELECT customer_id, COUNT(*) as order_count FROM orders GROUP BY customer_id;

This query uses GROUP BY to group the rows by customer ID and count the number of orders placed by each customer. When you also need customer details, you can compute the counts in a subquery and join the result with the customers table, rather than joining first and grouping by every customer column:

SELECT c.customer_id, c.first_name, c.last_name, o.order_count FROM customers c JOIN (SELECT customer_id, COUNT(*) as order_count FROM orders GROUP BY customer_id) o ON c.customer_id = o.customer_id;

This query uses a subquery to calculate the number of orders placed by each customer and then joins the result with the customers table to retrieve the customer information. This keeps the GROUP BY limited to the customer_id column of the orders table and can improve query performance.
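The aggregate-then-join pattern can be tried out on a few rows. The sketch below assumes SQLite via Python's sqlite3 module and uses made-up customers and orders; an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Assumption: in-memory SQLite database with placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Bob", "Ray"), (3, "Cy", "Poe")],
)
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)", [(1,), (1,), (2,)]
)

# Count orders per customer first, then attach the customer details.
order_counts = conn.execute(
    "SELECT c.customer_id, c.first_name, c.last_name, o.order_count "
    "FROM customers c "
    "JOIN (SELECT customer_id, COUNT(*) AS order_count "
    "      FROM orders GROUP BY customer_id) o "
    "ON c.customer_id = o.customer_id "
    "ORDER BY c.customer_id"
).fetchall()

for row in order_counts:
    print(row)
```

Customer 3 has no orders and is therefore absent from the result, because the inner JOIN keeps only customers that appear in the aggregated subquery.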

9. Use stored procedures

Stored procedures are named sets of SQL statements that are stored in the database; many databases compile them or cache their execution plans. They can be called from an application or directly from a SQL query. Using stored procedures can improve query performance by reducing the amount of data that is sent between the database and the application, and by reducing the amount of time required to compile and execute the SQL statements.

10. Optimize the database design

Optimizing the database design can also improve query performance. This includes ensuring that tables are properly normalized and that indexes are used effectively. In addition, it is important to ensure that the database is properly tuned for the expected workload and that it is configured for the appropriate level of concurrency.

11. Use query optimization tools

There are a variety of query optimization tools available that can help identify performance issues in SQL queries. These tools can provide recommendations for improving query performance, such as creating indexes, rewriting queries, or optimizing the database design. Examples include the execution plan viewer and Database Engine Tuning Advisor for Microsoft SQL Server, Oracle SQL Developer, and the EXPLAIN statement in MySQL.

12. Monitor query performance

Monitoring query performance is an important step in optimizing SQL queries. By monitoring query performance, it is possible to identify performance issues and make appropriate adjustments. This can include optimizing indexes, rewriting queries, or adjusting the database design. For tracking query performance, a number of tools are available, including SQL Server Profiler, Oracle Enterprise Manager, and MySQL Enterprise Monitor.
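Dedicated monitoring tools aside, a first look at query timings can come from the application itself. The following is a rough sketch assuming SQLite via Python's sqlite3 module; the timed_query helper and its slow_threshold_s parameter are hypothetical names introduced here, not part of any monitoring product.

```python
import sqlite3
import time


def timed_query(conn, sql, params=(), slow_threshold_s=0.5):
    """Run a query, return (rows, elapsed_seconds), and flag slow ones.

    Hypothetical helper: the threshold and the print-based logging are
    placeholders for whatever a real application would use.
    """
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed > slow_threshold_s:
        print(f"SLOW QUERY ({elapsed:.3f}s): {sql}")
    return rows, elapsed


# Assumption: in-memory SQLite database with placeholder orders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_number INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer_number) VALUES (?)",
    [(2154,), (2154,), (3001,)],
)

rows, elapsed = timed_query(
    conn,
    "SELECT COUNT(*) FROM orders WHERE customer_number = ?",
    (2154,),
)
print(rows, f"{elapsed:.6f}s")
```

Timings collected this way include the time to fetch the rows into the application, so they are best compared against earlier runs of the same query rather than read as absolute database cost.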

Conclusion

Optimizing SQL queries for faster performance is an important step in ensuring that database applications run efficiently. The key points of this article are:

  1. Indexing is one of the most effective techniques for improving the performance of SQL queries, but carefully consider the trade-offs between read performance and write performance when deciding which columns to index and which types of indexes to use.
  2. Optimizing SQL queries is an ongoing process and requires regular monitoring and adjustment to ensure continued performance improvements.
  3. Minimize unnecessary use of expensive operations such as JOIN, GROUP BY, IN, and subqueries to improve performance.
  4. Test queries on realistic data sets to ensure that optimizations are having the desired effect.

