Percona Toolkit | Altering large MySQL tables with no locks Ultimate Guide 2021 Update
Background
Imagine if you want to change the table schema in MySQL, the first thing that comes to your mind is the ALTER TABLE
statement. Well, you may not think it is going to be much work because you are just changing the type of one of the fields. So you confidently press the enter key. However, the table has an extremely large volume of data. Unsurprisingly, what follows is usually big trouble, especially in the production environment.
Even if you are using a database server with high performance, your master writer will be locked and unable to write for hours or even days depending on the number of records, field type, index of the table and so on.
This is undoubtedly fatal for any application in the production environment. Even if you have replication slaves to handle the reading task. Once the application needs to write, it will not be able to do so.
Then you suddenly realise what happened and are forced to kill the process but what’s worse, the query may have entered the replication log, then you will have to deal with it on every slave instance.
So what can be done to save you from your predicament? The following tool should be your first choice.
Introduction
Percona Toolkit is a set of advanced command-line tools for managing MySQL and system tasks, including:
- Verify MySQL replication integrity by checking source and replica data consistency
- Efficiently archive rows
- Find duplicate indexes
- Summarize MySQL and MongoDB servers
- Analyze MySQL queries from logs and tcpdump
- Analyze MongoDB query profiler
- Collect vital system information when problems occur
- And much, much more!
Cite: percona-toolkit
Today we are going to focus on its ability to ALTER tables without locking them.
For MySQL, Data Definition Language (DDL) is obviously a feature that makes all MySQL DBAs complain. Because in MySQL, when performing DDL on a table, it will get locked. When the table is small, say less than 10K records, it will have little impact on the application. When a table encounters ten million records, it will stop the application from writing the table.
Before MySQL 5.1, altering a table was very time-consuming. With the advent of InnoDB plugins, even though it makes altering indexes faster, altering table fields is still a problem. With online DDL in MySQL 5.6, ALTER TABLE
queries will no longer block writes to a table that is being changed.
Currently, the InnoDB engine does ALTER TABLE
through the following steps:
- Create a new invisible temporary table (tmp_table) with the table schema and DDL statement of the original table (original_table).
- Apply a write-lock on the original table. Block all INSERT, DELETE, UPDATE operations.
- Runs
insert into tmp_table select * from original_table
- Rename
original_table
andtmp_table
then droporiginal_table
- Release write lock.
As we can see that when executing DDL with InnoDB, the original table can only be read but not written. To solve this, Percona has introduced a tool, pt-online-schema-change
, which ensures that no read and write blocking occurs during table altering.
How pt-online-schema-change works?
Here’s how it changes the table schema:
- If a foreign key exists, it checks the table associated with the foreign key (according to the value of the
--alter-foreign-keys-method
option), and set the table accordingly. The tool does not execute without specifying a specific value with--alter-foreign-keys-method
. - Create a new empty table named
_+the original table name+_new
. - Update the schema of the new table based on the ALTER statement.
- Create triggers to record any writing operations of the original table during the altering period, and perform these operations after that to ensure no data will be lost. (The tool will not work if triggers are already defined in the table.)
- Copy data from the source table to the new table.
- Modify the table associated with the foreign key.
- Rename the source table, then rename the new table to the name of original table. The tables’ renaming is atomic which happen simultaneously by
RENAME TABLE dbname.table_name TO dbname._table_name_old, dbname._table_name_new TO dbname.table_name)
- Drop the old table and the trigger.
Examples of how to use pt-online-schema-change
The schema of the table:
CREATE TABLE `test_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Adding a field:
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "ADD COLUMN content text" D=test_db,t=test_table --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute
Removing a field:
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "DROP COLUMN content " D=test_db,t=test_table --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute
Altering a field:
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "MODIFY COLUMN name varchar(255) NOT NULL DEFAULT 0" D=test_db,t=test_table --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute
Renaming a column:
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "CHANGE COLUMN name title varchar(255)" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execut
Adding an index:
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "ADD INDEX idx_title(title)" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute
Removing an index:
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "DROP INDEX idx_title" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute
Some handy options to mention:
--no-drop-old-table
:
By adding this option, it will keep the original table after but with a new name_original_table_name_old
if the original table's name isoriginal_table_name
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1 --alter "ADD COLUMN new_field text" D=test_db,t=test_table --no-check-replication-filters --no-drop-old-table --print --execute
--alter-foreign-keys-method
:
This option defines how to modify foreign keys, so they reference the new table. Foreign keys that reference the table to be altered must be treated specially to ensure that they continue to reference the correct table. When the tool renames the original table to let the new one take its place, the foreign keys “follow” the renamed table, and must be changed to reference the new table instead.
For reliability reasons, use therebuild_constraints
method whenever possible. Or if there is no reliability requirement,auto
can be used.--max-load
:
If you are working on the production environment, and you don’t want to affect the production application, you can run the tool with the--max-load
argument.
The default is THREADS_RUNNING=25, which means that if more than 25 threads are running, percona will pause copying the data and continue when the running threads decrease below 25.
Example:
pt-online-schema-change --user=root --password=123456 --host=127.0.0.1 --alter "add INDEX idx_title(title)" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --max-load=Threads_running=20 --statistics --print --execute# When the task is paused, it will display:
# Pausing because Threads_running=20
--critical-load
:
This is an important option.
Percona will kill itself if there’s much load. Percona runs aSHOW GLOBAL STATUS
query in the DB after every chunk which checks the property Threads_running. The critical load Threads_running is 50 by default, so if you don't explicitly set--max-load
or--critical-load
, when the DB instance Threads_running reaches50
, the percona process will be cancelled immediately.
In addition to the options above, there are some more. Please refer to the official documentation for further information.
Some other tips:
- The premise for using this tool is that the table you are working with needs to have a primary key or a unique index.
- Altering a production table always comes with risks, it is recommended to back up the data before any modification.
- You would better use
--dry-run --print
before doing--execute
to make sure you are doing the right thing. - When dealing with tables with foreign keys, you need to set the
--alter-foreign-keys-method
option according to the situation. - Percona will wait for a time when no data is written to perform table renaming as the last step before finishing the task. Therefore, you should try your best to choose a time when the business is relatively idle, at least when the target table has low writing operations.