Thomas Chavakis

How to optimally alter MySQL tables using Percona Toolkit

Introduction

You have to be very careful when running an ALTER TABLE on a production database. MySQL’s ALTER TABLE can become a serious problem, especially with very large tables. Many people have experience with ALTER TABLE operations that have taken hours or days to complete.

From the MySQL documentation for ALTER TABLE:

ALTER TABLE changes the structure of a table. The operations are processed using one of the following algorithms:

  • COPY: Operations are performed on a copy of the original table, and table data is copied from the original table to the new table row by row. Concurrent DML is not permitted.
  • INPLACE: Operations avoid copying table data but may rebuild the table in place. An exclusive metadata lock on the table may be taken briefly during preparation and execution phases of the operation. Typically, concurrent DML is supported.
  • INSTANT: Operations only modify metadata in the data dictionary. No exclusive metadata locks are taken on the table during preparation and execution, and table data is unaffected, making operations instantaneous. Concurrent DML is permitted. (Introduced in MySQL 8.0.12)

The ALGORITHM clause is optional. If the ALGORITHM clause is omitted, MySQL uses ALGORITHM=INSTANT for storage engines and ALTER TABLE clauses that support it. Otherwise, ALGORITHM=INPLACE is used. If ALGORITHM=INPLACE is not supported, ALGORITHM=COPY is used.
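You can also request an algorithm explicitly; MySQL then raises an error instead of silently falling back to a more expensive algorithm. A sketch, using a hypothetical orders table:

-- Metadata-only change (MySQL 8.0.12+); errors out if INSTANT is not supported:
ALTER TABLE orders ADD COLUMN note VARCHAR(64), ALGORITHM=INSTANT;

-- No table copy; errors out if a copy would be required:
ALTER TABLE orders ADD INDEX idx_note (note), ALGORITHM=INPLACE;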

ALTER TABLE operations that use the COPY algorithm wait for other operations that are modifying the table to complete. After alterations are applied to the table copy, data is copied over, the original table is deleted, and the table copy is renamed to the name of the original table. While the ALTER TABLE operation executes, the original table is readable by other sessions (with the exception noted shortly). Updates and writes to the table started after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table. The temporary copy of the table is created in the database directory of the original table unless it is a RENAME TO operation that moves the table to a database that resides in a different directory.

The exception referred to earlier is that ALTER TABLE blocks reads (not just writes) at the point where it is ready to clear outdated table structures from the table and table definition caches. At this point, it must acquire an exclusive lock. To do so, it waits for current readers to finish, and blocks new reads and writes.

How to run ALTER TABLE on your largest tables without downtime?

1. MySQL DDL Feature

MySQL 5.6 added online DDL support to InnoDB, which provides in-place table alterations and concurrent DML. Benefits of this feature include:

  • Improved responsiveness and availability in busy production environments, where making a table unavailable for minutes or hours is not practical.
  • The ability to adjust the balance between performance and concurrency during DDL operations using the LOCK clause. See The LOCK clause.
  • Less disk space usage and I/O overhead than the table-copy method.

You can control aspects of a DDL operation using the ALGORITHM and LOCK clauses of the ALTER TABLE statement. These clauses are placed at the end of the statement, separated from the table and column specifications by commas. For example:

ALTER TABLE tbl_name ADD PRIMARY KEY (column), ALGORITHM=INPLACE, LOCK=NONE;

The LOCK clause is useful for fine-tuning the degree of concurrent access to the table. The ALGORITHM clause is primarily intended for performance comparisons and as a fallback to the older table-copying behavior in case you encounter any issues. For example:

  • To avoid accidentally making the table unavailable for reads, writes, or both, specify a clause on the ALTER TABLE statement such as LOCK=NONE (permit reads and writes) or LOCK=SHARED (permit reads). The operation halts immediately if the requested level of concurrency is not available.
  • To compare performance, run a statement with ALGORITHM=INPLACE and ALGORITHM=COPY. Alternatively, run a statement with the old_alter_table configuration option disabled and enabled.
  • To avoid tying up the server with an ALTER TABLE operation that copies the table, include ALGORITHM=INPLACE. The statement halts immediately if it cannot use the in-place mechanism.
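For instance, to add a secondary index while guaranteeing the table stays fully available for both reads and writes (table and column names here are hypothetical):

ALTER TABLE orders ADD INDEX idx_created_at (created_at), ALGORITHM=INPLACE, LOCK=NONE;

If InnoDB cannot satisfy LOCK=NONE for this operation, the statement fails immediately rather than quietly blocking your application.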

For more details, take a look at the documentation:

https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.html

https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html

https://medium.com/@soomiq/why-you-should-not-use-mysql-5-6-online-ddl-on-aws-aurora-40985d5e90f5

2. PERCONA TOOLS

Download the latest Percona Toolkit, or get the latest release from the command line:

wget percona.com/get/percona-toolkit.tar.gz

wget percona.com/get/percona-toolkit.rpm

wget percona.com/get/percona-toolkit.deb
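The tarball is self-contained Perl, so after extracting it you can either install the toolkit system-wide or run the tools straight from the bin directory (the directory name below is an assumption; it varies per release):

tar -xzf percona-toolkit.tar.gz
cd percona-toolkit-*/
perl Makefile.PL && make && sudo make install
# or, without installing:
./bin/pt-online-schema-change --version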

pt-online-schema-change - ALTER tables without locking them.

pt-online-schema-change emulates the way that MySQL alters tables internally, but it works on a copy of the table you wish to alter. This means that the original table is not locked, and clients may continue to read and change data in it.

pt-online-schema-change works by:

  1. creating an empty copy of the table to alter and modifying it as desired,
  2. copying rows from the original table into the new table,
  3. moving away the original table and replacing it with the new one once the copy is complete,
  4. and, by default, dropping the original table.

Any modifications to data in the original tables during the copy will be reflected in the new table, because the tool creates triggers on the original table to update the corresponding rows in the new table. The use of triggers means that the tool will not work if any triggers are already defined on the table.
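Conceptually, the triggers look something like the sketch below. The trigger and column names are hypothetical, and the real trigger bodies generated by pt-online-schema-change are more involved, but the idea is the same: replay every write onto the new table.

-- simplified sketch only; not the tool's actual generated DDL
CREATE TRIGGER pt_osc_ins AFTER INSERT ON app_events_relevant FOR EACH ROW
  REPLACE INTO _app_events_relevant_new (id, payload) VALUES (NEW.id, NEW.payload);

CREATE TRIGGER pt_osc_upd AFTER UPDATE ON app_events_relevant FOR EACH ROW
  REPLACE INTO _app_events_relevant_new (id, payload) VALUES (NEW.id, NEW.payload);

CREATE TRIGGER pt_osc_del AFTER DELETE ON app_events_relevant FOR EACH ROW
  DELETE IGNORE FROM _app_events_relevant_new WHERE id = OLD.id;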

When the tool finishes copying data into the new table, it uses an atomic RENAME TABLE operation to simultaneously rename the original and new tables. After this is complete, the tool drops the original table.

Foreign keys complicate the tool’s operation and introduce additional risk. The technique of atomically renaming the original and new tables does not work when foreign keys refer to the table. The tool must update foreign keys to refer to the new table after the schema change is complete. The tool supports two methods for accomplishing this. You can read more about this in the documentation for –alter-foreign-keys-method.

Foreign keys also cause some side effects. The final table will have the same foreign keys and indexes as the original table (unless you specify differently in your ALTER statement), but the names of the objects may be changed slightly to avoid object name collisions in MySQL and InnoDB.

  • In most cases the tool will refuse to operate unless a PRIMARY KEY or UNIQUE INDEX is present in the table. See --alter for details.
  • The tool refuses to operate if it detects replication filters. See --[no]check-replication-filters for details.
  • The tool pauses the data copy operation if it observes any replicas that are delayed in replication. See --max-lag for details.
  • The tool pauses or aborts its operation if it detects too much load on the server. See --max-load and --critical-load for details.
  • The tool sets innodb_lock_wait_timeout=1 and (for MySQL 5.5 and newer) lock_wait_timeout=60 so that it is more likely to be the victim of any lock contention, and less likely to disrupt other transactions. These values can be changed by specifying --set-vars.
  • The tool refuses to alter the table if foreign key constraints reference it, unless you specify --alter-foreign-keys-method.

Requirements:

Database Permissions:

  • PROCESS
  • REPLICATION SLAVE

Dry Run:
=========================

PTDEBUG=1 pt-online-schema-change --progress=percentage,10 --alter='CHANGE COLUMN `trigger` `trigger` ENUM("a","b","c","d","e")' --alter-foreign-keys-method=auto h=127.0.0.1,P=3306,D=imdb,t=app_events_relevant,u=root --dry-run --ask-pass --no-check-foreign-keys --critical-load=Threads_running=200 --chunk-size=200

=========================
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
No foreign keys reference `imdb`.`app_events_relevant`; ignoring --alter-foreign-keys-method.
Starting a dry run.  `imdb`.`app_events_relevant` will not be altered.  Specify --execute instead of --dry-run to alter the table.
Creating new table...
Created new table imdb._app_events_relevant_new OK.
Altering new table...
Altered `imdb`.`_app_events_relevant_new` OK.
Not creating triggers because this is a dry run.
Not copying rows because this is a dry run.
Not swapping tables because this is a dry run.
Not dropping old table because this is a dry run.
Not dropping triggers because this is a dry run.
2020-11-14T13:56:52 Dropping new table...
2020-11-14T13:56:52 Dropped new table OK.
Dry run complete.  `imdb`.`app_events_relevant` was not altered.

FLAGS:

If everything is OK, replace --dry-run with --execute and the process will start.

--alter-foreign-keys-method=rebuild_constraints

This is the preferred approach because it maintains the consistency of the schema and its relations. In this approach, before dropping the old table, the tool runs ALTER TABLE on all the child tables to drop their existing foreign keys and re-add new FK constraints that point to the columns of the new table (with the schema change in place).
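In SQL terms, each child table gets an ALTER roughly like the following sketch (child table, column, and constraint names are hypothetical; the tool may also prefix constraint names with an underscore to avoid name collisions):

ALTER TABLE child_table
  DROP FOREIGN KEY fk_parent,
  ADD CONSTRAINT _fk_parent FOREIGN KEY (parent_id)
    REFERENCES app_events_relevant (id);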

--max-load default: Threads_running=25

Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than their thresholds. The option accepts a comma-separated list of MySQL status variables. An optional =MAX_VALUE (or :MAX_VALUE) can follow each variable. If not given, the tool determines a threshold by examining the current value and increasing it by 20%.

For example, if you want the tool to pause when Threads_connected gets too high, you can specify “Threads_connected”, and the tool will check the current value when it starts working and add 20% to that value. If the current value is 100, then the tool will pause when Threads_connected exceeds 120, and resume working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use either “Threads_connected:110” or “Threads_connected=110”.

The purpose of this option is to prevent the tool from adding too much load to the server. If the data-copy queries are intrusive, or if they cause lock waits, then other queries on the server will tend to block and queue. This will typically cause Threads_running to increase, and the tool can detect that by running SHOW GLOBAL STATUS immediately after each query finishes. If you specify a threshold for this variable, then you can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however; it will only give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time.
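For example, to pause copying whenever more than 50 threads are running, and abort outright above 200 (the thresholds and the ALTER below are illustrative):

pt-online-schema-change \
  --max-load=Threads_running=50 \
  --critical-load=Threads_running=200 \
  --alter='ADD INDEX idx_created_at (created_at)' \
  h=127.0.0.1,P=3306,D=imdb,t=app_events_relevant,u=root \
  --dry-run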

Best Practices:

  • You must have a stable network connection in order to run the tool safely.
  • If you use Kubernetes or Docker, it is recommended to create a new Linux container with the latest Percona Toolkit installed.
  • Use the tmux terminal multiplexer to create a terminal session; if you get disconnected, the process keeps running and you can pick the session up exactly where you left it by re-attaching.
  • If you have a replica database, check the replication lag and adjust the chunk size accordingly.

Using TMUX:

tmux
pt-online-schema-change --progress=percentage,10 -- .......
# later, from another terminal:
tmux ls
tmux attach-session -t 0

https://www.youtube.com/watch?v=Z8QsKBoRnP0 52:47 min

pt-online-schema-change

3. gh-ost

GitHub’s online schema migration for MySQL

gh-ost uses the binary log stream to capture table changes and asynchronously applies them onto the ghost table. gh-ost takes upon itself some tasks that other tools leave for the database to perform. As a result, gh-ost has greater control over the migration process: it can truly suspend it, and it can truly decouple the migration’s write load from the master’s workload.

Using gh-ost for online schema change

gh-ost -allow-on-master -assume-rbr -exact-rowcount \
    -critical-load Threads_running=400 -critical-load-hibernate-seconds 60 \
    -database ghost -max-load Threads_running=100 -nice-ratio 0.1 \
    -chunk-size 5000 -ask-pass -table sbtest1 -user percona \
    -host revin-aurora.can0nprz8rtd.us-east-1.rds.amazonaws.com \
    -postpone-cut-over-flag-file /home/ubuntu/gh-ost-sentinel \
    -throttle-additional-flag-file /home/ubuntu/gh-ost-throttle_ghost.sbtest1 \
    -alter 'ADD COLUMN ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP' \
    -verbose -execute 2>&1 | tee gh-ost.log

Example:

./gh-ost --allow-on-master -assume-rbr -exact-rowcount --chunk-size=5000 --host=10.1.0.132:3316 --database=sandbox_vcsl --table=notification_task --user=root -ask-pass --alter='CHANGE COLUMN `trigger` `trigger` ENUM("a","b","c","d","e","hq")' --verbose

--assume-rbr parameter is required to prevent gh-ost from trying to modify the binlog_format. It is not possible to do so directly in RDS/Aurora, as SUPER privilege is not available to end users.

-allow-on-master simply indicates to the tool to run everything against the master.

-database , -host , -user , -ask-pass and -table are the target instance, credentials and table to make the modifications on.

-critical-load , -max-load tell gh-ost to watch for these statuses and either abort the process when critical thresholds are reached, or throttle itself when max values are reached.

-postpone-cut-over-flag-file prevents the tool from switching to the new table as soon as the copy process is complete. This gives the operator better control of when the switch occurs, e.g. during off-peak hours. The tool will create this file when the option is specified.

-throttle-additional-flag-file: when this file exists, gh-ost pauses both applying binary logs and copying rows; be careful if you have short binary log retention periods.

-nice-ratio allows the tool to artificially throttle after every chunk of rows copied. You can interpret it as a percentage relative to the last chunk copy time: with -nice-ratio 0.1, if the last chunk took 10 seconds to copy, the tool will sleep about 1 second afterwards. Depending on the traffic on the table, you can be as aggressive as you see fit.
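Both flag files are ordinary files that you create and remove by hand; using the paths from the example command above:

# pause both binlog apply and row copy:
touch /home/ubuntu/gh-ost-throttle_ghost.sbtest1
# resume:
rm /home/ubuntu/gh-ost-throttle_ghost.sbtest1
# when ready for the cut-over (e.g. off-peak hours), remove the postpone flag:
rm /home/ubuntu/gh-ost-sentinel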

How to debug the Alter Process

1. SHOW PROCESSLIST

The process list is the list of connections, or threads, that are currently connected to MySQL. SHOW PROCESSLIST lists the threads, with information about each thread’s status:

SHOW FULL PROCESSLIST;
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != 'Sleep';

2. INFORMATION_SCHEMA

The INFORMATION_SCHEMA database is a set of system views defined in the SQL standard.

The beauty of the INFORMATION_SCHEMA views is that you can query them with standard SQL. This gives you much more flexibility than the SHOW commands, which produce result sets that you can’t aggregate, join, or otherwise manipulate with standard SQL. Having all this data available in system views makes it possible to write interesting and useful queries.

-- INNODB Metrics
SELECT * FROM INFORMATION_SCHEMA.INNODB_METRICS;

-- Shows all queries running for 5 seconds or more:
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != 'Sleep' AND TIME >= 5;

-- Show all running UPDATEs:
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != 'Sleep' AND INFO LIKE '%UPDATE %';

3. PERFORMANCE_SCHEMA

SELECT * FROM `performance_schema`.`threads`;

ALTER TABLE progress can be monitored by querying the Performance Schema events_stages_current table:

SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM performance_schema.events_stages_current;

  • The stage event shown differs depending on which ALTER TABLE phase is currently in progress.
  • The WORK_COMPLETED column shows the work completed, and the WORK_ESTIMATED column provides an estimate of the remaining work.
  • The events_stages_current table returns an empty set once the ALTER TABLE operation has completed. In that case, check the events_stages_history table to view event data for the completed operation:

SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM performance_schema.events_stages_history;
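Note that this only works if the relevant Performance Schema instruments and consumers are enabled; on versions where the stage/innodb/alter instruments are off by default, enable them first:

UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES'
WHERE NAME LIKE 'stage/innodb/alter%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_stages_%';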

Useful Commands for this process

1. Find the Size of Databases

SELECT table_schema AS "Database", 
ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS "Size (MB)" 
FROM information_schema.TABLES 
GROUP BY table_schema;

2. Find the Size of Tables

SELECT
  TABLE_NAME AS `Table`,
  ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024) AS `Size (MB)`
FROM
  information_schema.TABLES
WHERE
  TABLE_SCHEMA = 'TalkingData'
ORDER BY
  (DATA_LENGTH + INDEX_LENGTH)
DESC;

3. Export

mysqldump --add-drop-table -h SERVER -u USERNAME -p DB TABLE > talkingData.sql

4. Import

mysql -u root -p --host=127.0.0.1 --port=3306 imdb < talkingData.sql

Big dataset for Testing

https://archive.org/details/stackexchange

https://www.percona.com/blog/2011/02/01/sample-datasets-for-benchmarking-and-testing/

References:

High Performance MySQL, Second Edition, by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy D. Zawodny, Arjen Lentz, and Derek J. Balling

https://www.percona.com/blog/2018/06/07/using-gh-ost-with-amazon-aurora-for-mysql/

https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-change.html

https://github.com/github/gh-ost/

https://github.com/github/gh-ost/blob/master/doc/rds.md

https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.html

https://dev.mysql.com/doc/refman/5.7/en/innodb-online-ddl-operations.html