oak-online-alter-table

NAME

oak-online-alter-table: Perform a non-blocking ALTER TABLE operation

SYNOPSIS

Add an index on the Population column, City table, world database:
oak-online-alter-table --database=world --table=City --alter="ADD KEY(Population)"
Same as above, specify database with table name:
oak-online-alter-table --table=world.City --alter="ADD KEY(Population)"
Perform multiple alters:
oak-online-alter-table --table=world.City --alter="ADD KEY(Population), MODIFY Name VARCHAR(64) CHARSET utf8 NOT NULL"
Add column, drop column, modify AUTO_INCREMENT:
oak-online-alter-table --table=world.City --alter="ADD COLUMN Mayor VARCHAR(64) CHARSET utf8 DEFAULT '', DROP COLUMN district, MODIFY id SMALLINT AUTO_INCREMENT"
Do not ALTER, just rebuild City:
oak-online-alter-table --database=world --table=City
Create a ghost table called City_mirror, which is updated online from City table:
oak-online-alter-table --database=world --table=City --alter="ADD KEY(Population)" --ghost=City_mirror
Perform ALTER in chunks of 5000 rows:
oak-online-alter-table --table=world.City --alter="ADD KEY(Population)" --chunk-size=5000
Perform ALTER in chunks of 5000 rows, sleep for 20 milliseconds between chunks:
oak-online-alter-table --table=world.City --alter="ADD KEY(Population)" --chunk-size=5000 --sleep=20
Perform ALTER, use sleep ratio of 2; sleep 2 seconds for every second spent working:
oak-online-alter-table --table=world.City --alter="ADD KEY(Population)" --sleep-ratio=2
Perform a cleanup for an aborted run:
oak-online-alter-table --database=world --table=City --cleanup
Provide connection parameters. Prompt for password:
oak-online-alter-table --database=world --table=City --alter="ADD KEY(Population)" --user=root --ask-pass --socket=/tmp/mysql.sock
Use a defaults file for connection parameters:
oak-online-alter-table --database=world --table=City --alter="ADD KEY(Population)" --defaults-file=/home/myuser/.my-oak.cnf

DESCRIPTION

oak-online-alter-table allows for non blocking ALTER TABLE operations, table rebuilds and creating a table's ghost.

To run this utility, the following requirements must be met:

  • The table has at least one single-column UNIQUE KEY
  • Altered table shares a single-column UNIQUE KEY with the original table
  • No ‘AFTER’ triggers are defined on the table (the utility creates its own triggers for the duration of the operation)
  • The table has no FOREIGN KEYs
  • Table name is no longer than 57 characters
The utility provides with three basic functionalities:
  • A non blocking ALTER TABLE operation: adding columns or keys, dropping columns or keys, modifying columns, converting table character set etc., are all supported, under the following restrictions:
    • ADD COLUMN (new column must have a default value)
    • DROP COLUMN (as long as there’s still a shared single-column UNIQUE KEY between the old table and the altered one)
    • MODIFY COLUMN (change type, including UNIQUE KEY columns)
    • ADD KEY (normal, unique, fulltext etc.)
    • DROP KEY (as long as there’s still a shared single-column UNIQUE KEY between the old table and the altered one)
    • Change ENGINE: works, but great care should be taken when dealing with non-transactional engines
    • Adding FOREIGN KEY constraints is possible
    • More… Not all ALTERS have been tested, e.g. CONVERT, partition handling…
  • An empty ALTER, which rebuilds a table: potentially releasing disk space and organizing rows in a more optimized way on disk.
  • (May not be supported in future versions): Creating a live 'ghost' table, which is synchronized with the original table, as long as the following do not occur:
    • An ALTER TABLE on the original table
    • A TRUNCATE on the original table
    • LOAD DATA INFILE into the original table
    • OPTIMIZE on the original table
While the utility runs, it is allowed to invoke INSERT, UPDATE, DELETE, REPLACE on the original table. It is not allowed to TRUNCATE, ALTER, REPAIR, OPTIMIZE or otherwise perform a table-wide operation on the original table.

The utility works with InnoDB tables, MyISAM tables, or otherwise any table-level locking storage engine (MEMORY, ARCHIVE).

The utility works by creating a ghost table for the duration of the runtime, which is slowly synchronized with the original table. When the synchronization is complete, it takes place of the original table. To do this, the utility must create AFTER INSERT, AFTER UPDATE, AFTER DELETE triggers on the original table.

Synchronizing the ghost table with the original tables takes place in several steps. In one of those steps, data is copied from the original table to the ghost table. This is done in chunks of rows, the number of which is configurable using the chunk-size option. While a chunk is being copied, there is a read lock on the table (MyISAM, ARCHIVE, MEMORY) or on the rows contained in the chunk (InnoDB). The smaller the chunk - the faster the locks are removed, and the more concurrency is allowed; but also the longer it will take for the entire operation to complete.

For write intensive application, it may be advisable to allow for pauses between chunks, so as to make as little impact as possible. This can be configured with the sleep or sleep-ratio options. While sleeping between chunks, no locks are being placed.

Even so, a performance impact is noticed while the application runs, and this is due to the triggers added on the table, and the propagation of DML statements to the ghost table. It is required to have enough disk space to accommodate the altered table (as in a normal ALTER TABLE). Only when the operation completes can there be a disk space regaining (depending on your storage engine and configuration).

OPTIONS

-a ALTER_STATEMENT, --alter=ALTER_STATEMENT

Comma delimited ALTER statement details, excluding the 'ALTER TABLE t' itself. When this options is not used, an 'empty' ALTER is performed, essentially rebuilding the table.

--ask-pass

Prompt for password.

-c CHUNK_SIZE, --chunk-size=CHUNK_SIZE

Number of rows to act on in chunks. Default: 1000. The lower the number, the shorter any locks are held, but the more operations required and the more total running time. Do not use very low values when the PRIMARY KEY, or otherwise the only UNIQUE KEY are on textual columns, as values from such keys are reused when working the chunks. If you're not sure - stick with the defaults.

--cleanup

Remove custom triggers, ghost table from possible previous runs. In case a previous run was abruptly terminated, this option removes all custom data this utility may have created. It is not necessary to run with this option after a normal completion.

-d DATABASE, --database=DATABASE

Database name (required unless table is fully qualified)

--defaults-file=DEFAULTS_FILE

Read from MySQL configuration file. Overrides --user, --password, --socket, --port.

Configuration needs to be in the following format:

[client]
user=my_user
password=my_pass
socket=/tmp/mysql.sock
port=3306

-g GHOST, --ghost=GHOST

Table name to serve as ghost. When this option is used, a table by this name is created and synchronized with the original table. The original table is thereafter unaltered, and the three AFTER INSERT, AFTER UPDATE and AFTER DELETE triggers are maintained. To perform an ALTER TABLE, do not use this option. [May be removed in future versions]

-H HOST, --host=HOST

MySQL host (default: localhost)

-l, --lock-chunks

Use LOCK TABLES for each chunk. This option enforces a higher locking mechanism, and is at current available as preparation to be able to work with unsupported engines. It is not required nor advisable to use this option with MyISAM or InnoDB engines. [May be removed in future versions]. -N, --skip-binlog

Disable binary logging; operation to only execute on master and not to propagate to slaves. By default this is disabled and ALTER oprations are propagated to slaves.

-r MAX_LOCK_RETRIES, --max-lock-retries=MAX_LOCK_RETRIES

Maximum times to retry a chunk in case of a deadlock or lock_wait_timeout. (default: 10; 0 is unlimited)

--skip-delete-pass

Do not execute the DELETE data pass. With InnoDB/MyISAM there is apparently no need for the DELETE pass; setting this flag reduces total runtime significantly. In the future this will be set by default.

-p PASSWORD, --password=PASSWORD

MySQL password

-P PORT, --port=PORT

TCP/IP port (default: 3306)

-q, --quiet

Quiet mode, do not verbose. Verbose is on by default, use this option to turn it off.

--sleep=SLEEP_MILLIS

Number of milliseconds to sleep between chunks. Default: 0. Use a higher value if your system suffers from high load average. The higher the value, the more time the system is allowed to 'rest', but also the longer the runtime, and the more overhead (over time) from the triggers effect. You may also wish to set a higher number if slaves are finding it hard otherwise to catch up.

--sleep-ratio=SLEEP_RATIO

Ratio of sleep time to execution time. Sleep time will be proportional to execution time per chunk, as opposed of being constant with --sleep. Default: 0 (no sleep)

-S SOCKET, --socket=SOCKET

MySQL socket file. Only applies when host is localhost

-t TABLE, --table=TABLE

Table with AUTO_INCREMENT column to alter (optionally fully qualified as database_name.table_name, in which case --database is not required)

-u USER, --user=USER

MySQL user

-v, --verbose

Print user friendly messages. Enabled by default.

ENVIRONMENT

Requires MySQL 5.0 or newer, python 2.3 or newer. python-mysqldb must be installed in order to use this tool. You can
apt-get install python-mysqldb
or
yum install mysql-python

SEE ALSO

LICENSE

This tool is released under the BSD license.
Copyright (c) 2008 - 2010, Shlomi Noach
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the organization nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR 
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

AUTHOR

Shlomi Noach
 
openark kit documentation