Percona Toolkit | Altering large MySQL tables with no locks Ultimate Guide 2021 Update

David Shen
CreditorWatch
Published in
6 min readMar 1, 2021

Background

Imagine if you want to change the table schema in MySQL, the first thing that comes to your mind is the ALTER TABLE statement. Well, you may not think it is going to be much work because you are just changing the type of one of the fields. So you confidently press the enter key. However, the table has an extremely large volume of data. Unsurprisingly, what follows is usually big trouble, especially in the production environment.

Even if you are using a database server with high performance, your master writer will be locked and unable to write for hours or even days depending on the number of records, field type, index of the table and so on.

This is undoubtedly fatal for any application in the production environment. Even if you have replication slaves to handle the reading task. Once the application needs to write, it will not be able to do so.

Then you suddenly realise what happened and are forced to kill the process but what’s worse, the query may have entered the replication log, then you will have to deal with it on every slave instance.

So what can be done to save you from your predicament? The following tool should be your first choice.

Introduction

Percona Toolkit is a set of advanced command-line tools for managing MySQL and system tasks, including:

  • Verify MySQL replication integrity by checking source and replica data consistency
  • Efficiently archive rows
  • Find duplicate indexes
  • Summarize MySQL and MongoDB servers
  • Analyze MySQL queries from logs and tcpdump
  • Analyze MongoDB query profiler
  • Collect vital system information when problems occur
  • And much, much more!

Cite: percona-toolkit

Today we are going to focus on its ability to ALTER tables without locking them.

For MySQL, Data Definition Language (DDL) is obviously a feature that makes all MySQL DBAs complain. Because in MySQL, when performing DDL on a table, it will get locked. When the table is small, say less than 10K records, it will have little impact on the application. When a table encounters ten million records, it will stop the application from writing the table.

Before MySQL 5.1, altering a table was very time-consuming. With the advent of InnoDB plugins, even though it makes altering indexes faster, altering table fields is still a problem. With online DDL in MySQL 5.6, ALTER TABLE queries will no longer block writes to a table that is being changed.

Currently, the InnoDB engine does ALTER TABLE through the following steps:

  1. Create a new invisible temporary table (tmp_table) with the table schema and DDL statement of the original table (original_table).
  2. Apply a write-lock on the original table. Block all INSERT, DELETE, UPDATE operations.
  3. Runs insert into tmp_table select * from original_table
  4. Rename original_table and tmp_table then drop original_table
  5. Release write lock.

As we can see that when executing DDL with InnoDB, the original table can only be read but not written. To solve this, Percona has introduced a tool, pt-online-schema-change, which ensures that no read and write blocking occurs during table altering.

How pt-online-schema-change works?

Here’s how it changes the table schema:

  1. If a foreign key exists, it checks the table associated with the foreign key (according to the value of the --alter-foreign-keys-method option), and set the table accordingly. The tool does not execute without specifying a specific value with --alter-foreign-keys-method.
  2. Create a new empty table named _+the original table name+_new.
  3. Update the schema of the new table based on the ALTER statement.
  4. Create triggers to record any writing operations of the original table during the altering period, and perform these operations after that to ensure no data will be lost. (The tool will not work if triggers are already defined in the table.)
  5. Copy data from the source table to the new table.
  6. Modify the table associated with the foreign key.
  7. Rename the source table, then rename the new table to the name of original table. The tables’ renaming is atomic which happen simultaneously by RENAME TABLE dbname.table_name TO dbname._table_name_old, dbname._table_name_new TO dbname.table_name)
  8. Drop the old table and the trigger.

Examples of how to use pt-online-schema-change

The schema of the table:

CREATE TABLE `test_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Adding a field:

pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "ADD COLUMN content text" D=test_db,t=test_table --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute

Removing a field:

pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "DROP COLUMN content " D=test_db,t=test_table --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute

Altering a field:

pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "MODIFY COLUMN name varchar(255) NOT NULL DEFAULT 0" D=test_db,t=test_table --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute

Renaming a column:

pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "CHANGE COLUMN name title varchar(255)" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execut

Adding an index:

pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "ADD INDEX idx_title(title)" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute

Removing an index:

pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "DROP INDEX idx_title" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute

Some handy options to mention:

  • --no-drop-old-table:
    By adding this option, it will keep the original table after but with a new name _original_table_name_old if the original table's name is original_table_name
pt-online-schema-change --user=YourDBUser --password=YourPassword --host=127.0.0.1  --alter "ADD COLUMN new_field text" D=test_db,t=test_table --no-check-replication-filters --no-drop-old-table --print --execute
  • --alter-foreign-keys-method:
    This option defines how to modify foreign keys, so they reference the new table. Foreign keys that reference the table to be altered must be treated specially to ensure that they continue to reference the correct table. When the tool renames the original table to let the new one take its place, the foreign keys “follow” the renamed table, and must be changed to reference the new table instead.
    For reliability reasons, use the rebuild_constraints method whenever possible. Or if there is no reliability requirement, auto can be used.
  • --max-load:
    If you are working on the production environment, and you don’t want to affect the production application, you can run the tool with the --max-load argument.
    The default is THREADS_RUNNING=25, which means that if more than 25 threads are running, percona will pause copying the data and continue when the running threads decrease below 25.
    Example:
pt-online-schema-change --user=root --password=123456 --host=127.0.0.1  --alter "add INDEX idx_title(title)" D=test_db,t=test_table --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --max-load=Threads_running=20 --statistics --print --execute# When the task is paused, it will display:
# Pausing because Threads_running=20
  • --critical-load:
    This is an important option.
    Percona will kill itself if there’s much load. Percona runs a SHOW GLOBAL STATUS query in the DB after every chunk which checks the property Threads_running. The critical load Threads_running is 50 by default, so if you don't explicitly set --max-load or --critical-load, when the DB instance Threads_running reaches 50, the percona process will be cancelled immediately.
    In addition to the options above, there are some more. Please refer to the official documentation for further information.

Some other tips:

  • The premise for using this tool is that the table you are working with needs to have a primary key or a unique index.
  • Altering a production table always comes with risks, it is recommended to back up the data before any modification.
  • You would better use --dry-run --print before doing --execute to make sure you are doing the right thing.
  • When dealing with tables with foreign keys, you need to set the --alter-foreign-keys-method option according to the situation.
  • Percona will wait for a time when no data is written to perform table renaming as the last step before finishing the task. Therefore, you should try your best to choose a time when the business is relatively idle, at least when the target table has low writing operations.

--

--

CreditorWatch
CreditorWatch

Published in CreditorWatch

We’re Australia’s most innovative commercial credit reporting bureau, helping over 50,000 businesses avoid bad debt. Since we began, CreditorWatch has been helping customers who range from sole traders through to ASX-listed companies, to better understand who their debtors are.

Responses (1)