How to Improve MySQL / MariaDB Database Performance

Filipe Pires
Jan 20 · 5 min read
Indexing — Guide to improve performance, image by João Pires

As a follow-up to my other article, How to Improve Microsoft SQL Server (MSSQL) Database Performance, I will now do the same for MySQL / MariaDB databases.

When we develop a custom web application, we must test all types of scenarios, and how the database scales is normally a key point for quality software. In my current job, I inherited a couple of applications that suffer from poor performance because the previous developer did not do a good job at the database level. In the web ecosystem, every millisecond gained in database operations is an essential step toward a better user experience in the software we make.

Let’s start our journey!

The starting point is choosing the best storage engine, MyISAM or InnoDB, which is a pivotal step. In this article I don’t want to focus on the differences between these two engine types, but I leave a link here if you want to know more about the subject.

To summarize my take on storage engines: I lean towards InnoDB for all my projects and only use MyISAM if a table is rarely written to. You can also have a database with a mix of InnoDB and MyISAM tables (I don’t use this too often, but it’s a possibility).
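
As a quick illustration, with a hypothetical table name, this is how you set or change the engine and check what a table currently uses:

-- Choose the engine when creating a table
CREATE TABLE demo_table (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100)
) ENGINE=InnoDB;

-- Convert an existing, rarely-written table to MyISAM
ALTER TABLE demo_table ENGINE=MyISAM;

-- Check which engine a table is currently using
SHOW TABLE STATUS WHERE Name = 'demo_table';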

After choosing the storage engine, it’s time to talk about indexing the fields of the tables. Of course, when you create a primary key an index is created automatically… and I see a lot of projects that rely only on these indexes. The problem is that when the database grows to a large number of rows, the time it takes to execute SQL queries that filter data is heavily affected by the lack of index planning.

Let’s use a test database I created as a proof of concept, with two tables (authors and posts) full of dummy data. Each table uses the InnoDB storage engine and has a primary key only on the id field, with auto_increment. Below is an overview of the tables:

Table authors: nearly 430,000 rows, with a size of 34.6 MB
Table posts: nearly 276,000 rows, with a size of 113.6 MB
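
A rough sketch of the table definitions, just to make the examples concrete (the column names follow the query used later, but the exact types and sizes are assumptions on my part):

CREATE TABLE authors (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  email VARCHAR(100),
  birthdate DATE,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE posts (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  author_id INT UNSIGNED NOT NULL,
  title VARCHAR(255),
  description VARCHAR(500),
  content TEXT,
  date DATE,
  PRIMARY KEY (id)
) ENGINE=InnoDB;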

Now let’s write a simple, plausibly real-world SELECT, joining the two tables and filtering by these criteria:

  • Join the posts and authors tables by the author id
  • Filter by the posts date, between the years 1979 and 2021
  • Filter by the authors last_name starting with ‘St’
  • Filter by the authors age, between 18 and 40 years old

The SQL test query and its output:

SELECT p.id, a.first_name, a.last_name, a.email, a.birthdate, p.title, p.description, p.content, p.date
FROM posts p
INNER JOIN authors a
ON p.author_id = a.id
WHERE YEAR(p.date) BETWEEN 1979 AND 2021
AND a.last_name LIKE 'St%'
AND (YEAR(CURDATE()) - YEAR(a.birthdate)) BETWEEN 18 AND 40;

These results were obtained using only the primary key index on both tables.

It took 126 ms to output 2952 rows of data.

Now we have a baseline to compare against the optimizations applied in the rest of the article.

Analyzing the query, let’s focus on the WHERE clause: there are 3 fields being filtered (p.date, a.last_name and a.birthdate).
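
If you want to confirm which indexes MySQL / MariaDB actually considers for this query, prefix it with EXPLAIN; the exact plan output will depend on your data and server version:

EXPLAIN
SELECT p.id, a.first_name, a.last_name, p.date
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
WHERE YEAR(p.date) BETWEEN 1979 AND 2021
AND a.last_name LIKE 'St%'
AND (YEAR(CURDATE()) - YEAR(a.birthdate)) BETWEEN 18 AND 40;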

Note: In some projects that I inherit and work on, I still see foreign keys without indexes. This is bad practice in database relations, causing a decrease in the performance of the JOINs between tables.
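
If a joining column has no index (InnoDB only creates one automatically when a FOREIGN KEY constraint is declared), adding one is a one-liner; here is an example for our test schema, with the index name being just a suggestion:

CREATE INDEX index_posts_author_id ON posts (author_id);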

My rule of thumb is to analyze these fields and choose the best candidates to have an index applied to them. Don’t over-index fields (especially string types), because adding a lot of indexes can degrade write performance; besides primary and foreign keys, I try to use 1 to 3 indexes per table at the most, and only if needed.

The best candidates are the fields that are likely to contain repeated data. For example, the posts date field and the authors birthdate field could have many rows sharing the same date year, e.g. ‘1999-MM-DD’. The authors last_name field could also have repeated values because of the nature of its content, but its type is a string, and I avoid adding indexes to string fields except as a last resort.
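
Before adding anything, it is worth checking which indexes the tables already have:

SHOW INDEX FROM authors;
SHOW INDEX FROM posts;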

So let’s add indexes to the posts date and authors birthdate fields.

CREATE INDEX index_posts_date ON posts (date);
CREATE INDEX index_authors_birthdate ON authors (birthdate);

After creating the indexes, run a simple SQL command to optimize the tables: OPTIMIZE TABLE table_name. You can also optimize more than one table in a single statement by separating the table names with commas.

OPTIMIZE TABLE authors, posts;

If you want to run this command on all your database tables without manually typing every table name, use the following SQL to generate an OPTIMIZE TABLE statement for each table of the desired database; just replace 'your_database' with the name of your database.

SELECT CONCAT('OPTIMIZE TABLE ',TABLE_NAME, ';') 
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema='your_database';
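
The query above only generates the statements, so you still need to copy its output and run it. If you prefer a single step, here is a sketch using a prepared statement (it assumes OPTIMIZE TABLE is allowed inside prepared statements, which recent MySQL / MariaDB versions permit):

SET SESSION group_concat_max_len = 100000;
SET @sql = (SELECT CONCAT('OPTIMIZE TABLE ', GROUP_CONCAT(TABLE_NAME SEPARATOR ', '))
            FROM INFORMATION_SCHEMA.TABLES
            WHERE TABLE_SCHEMA = 'your_database');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;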

After running the optimize commands, let’s now re-run our initial query and measure the results:

Huge increase in performance.

With this modification, our query took only 12.8 ms versus the initial 126 ms!

That’s roughly a 90% improvement in speed compared with the first query, achieved just by adding indexes on the fields used in the WHERE clause.

Now the downside of this method: the increased size of the tables!

Let’s compare the posts table size before adding indexes, 113.6 MB, with its size after adding them: 132.6 MB.
That’s an increase of roughly 17% in disk space, so be aware that indexes come with a price: extra size on the table (the exact growth depends on the number of records and indexes and on the field types/content).
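
If you want to check these numbers on your own database, the data and index sizes are exposed in INFORMATION_SCHEMA (replace 'your_database' with your database name):

SELECT TABLE_NAME,
       ROUND(DATA_LENGTH / 1024 / 1024, 1) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 1) AS index_mb
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'your_database';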

In my experience, I would rather gain performance and accept that the database needs more disk space…

Note: all of the SQL commands shown here are perfectly safe for your data, and rolling back to the previous state is easy: just drop the added indexes and re-run the OPTIMIZE TABLE commands, and the table size goes back to normal. No data is lost… only speed.
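
For the indexes added in this article, the rollback would look like this:

DROP INDEX index_posts_date ON posts;
DROP INDEX index_authors_birthdate ON authors;
OPTIMIZE TABLE authors, posts;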

Well, I hope this was a good read, and I will be happy if it helps someone.
