oak-hook-general-log

NAME

oak-hook-general-log: hook up and filter general log entries based on entry type or execution plan criteria.

SYNOPSIS

Hook up the general log and dump everything:
oak-hook-general-log --user=root --socket=/tmp/mysql.sock
Only dump connect/disconnect events:
oak-hook-general-log --user=root --socket=/tmp/mysql.sock --filter-connection
Same as above, use defaults file:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-connection
Only dump queries which make for a full table scan and use the City table:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-fullscan --filter-explain-table=City
Only dump queries which make for a filesort:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-filesort
Only dump queries which make for an implicit temporary table creation:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-temporary
Only dump queries where the execution plan expects at least 100,000 rows in any of the tables:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-rows-exceed=100000
Only dump queries where the execution plan expects at least 1,000,000 rows combined:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-total-rows-exceed=1000000
Only dump queries where the execution plan uses given index (`first_name` index on `authors` table):
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-key=actor.first_name
Only dump queries where the execution plan contains given text anywhere (e.g. in key name, ref, etc.) This serves as a grep on the execution plan.
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-contains=my_column_name

DESCRIPTION

oak-hook-general-log hooks up to a MySQL >= 5.1 server, and dumps general log to standard output, allowing for sophisticated filtering.

Query filtering relies not only on query text (for which grep or awk are good tools) but rather on query type and query execution plan. It is possible to only dump connect/disconnect queries; queries which make for a table scan; queries which scan more than 100,000 rows; or queries which use a specific index.

The tool activates the server's general log, and instructs it to write to log table (mysql.general_log). The general_log table is periodically polled for new entries and rotated. Tool's output corresponds to the general_log table schema:

mysql-5.1.51> DESC mysql.general_log;
+--------------+------------------+------+-----+-------------------+-----------------------------+
| Field        | Type             | Null | Key | Default           | Extra                       |
+--------------+------------------+------+-----+-------------------+-----------------------------+
| event_time   | timestamp        | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| user_host    | mediumtext       | NO   |     | NULL              |                             |
| thread_id    | int(11)          | NO   |     | NULL              |                             |
| server_id    | int(10) unsigned | NO   |     | NULL              |                             |
| command_type | varchar(64)      | NO   |     | NULL              |                             |
| argument     | mediumtext       | NO   |     | NULL              |                             |
+--------------+------------------+------+-----+-------------------+-----------------------------+

A sample output may look like this:

2010-12-07 11:30:23     root[root] @ localhost []       8       1       Connect root@localhost on
2010-12-07 11:30:23     root[root] @ localhost []       8       1       Query   select 1
2010-12-07 11:30:39     root[root] @ localhost []       8       1       Query   show processlist
2010-12-07 11:30:45     root[root] @ localhost []       8       1       Query   desc mysql.general_log

Activation of the general_log, and in particular writing to the general_log table makes for considerable overhead. The tool is configured to automatically terminate after one minute from the time of execution, or as configured by --timeout-minutes. Upon termination the tool restores original general_log settings.

Interrupting the tool via Ctrl-C makes for a graceful termination, and original settings are restored. Any other means of termination (e.g. via kill -9 results with the general_log remaining active. Make sure to disable it!

The reason the general_log table is used is that the general_log file is in inconsistent format, and lacks some information such as user & host, which must be cross referenced to previous entries. This makes it impossible to diagnose existing connections when no log entries are present.

Even so, the general_log table lacks the current database for the given connection, information which is critical to understanding the context of the query (this information can be found in the slow_log table). To overcome this limitation, oak-hook-general-log cross-references the general log entries with PROCESSLIST entries. There may still be mismatches, but on most situations this should work well.

When asked to filter by execution plan criteria, oak-hook-general-log invokes an EXPLAIN query for each SELECT query encountered. This may further slow down your overall performance if execution plans are complicated. However, you should note it does not delay the execution of the query itself. In fact, the execution plan is calculated after the query is invoked.

It is possible to specify multiple filter criteria. For a query to be logged, it must answer for all specified filters.

OPTIONS

--ask-pass

Prompt for password.

--debug

Print stack trace on error.

--defaults-file=DEFAULTS_FILE

Read from MySQL configuration file. Overrides --user, --password, --socket, --port.

Configuration needs to be in the following format:

[client]
user=my_user
password=my_pass
socket=/tmp/mysql.sock
port=3306

--filter-connection

Only output connect/disconnect entries

--filter-explain-contains=FILTER_EXPLAIN_CONTAINS

Only output queries whose execution plan contains given text

--filter-explain-filesort

Only output queries where execution plan indicates filesort

--filter-explain-fulljoin

Only output queries where execution plan indicates full join

--filter-explain-fullscan

Only output queries where execution plan indicates

--filter-explain-indexscan

Only output queries where execution plan indicates full index scan

--filter-explain-key=FILTER_EXPLAIN_KEY

Only output queries where given key is used (specify key_name or table_name.key_name)

--filter-explain-rows-exceed=FILTER_EXPLAIN_ROWS_EXCEED

Only output queries where some path in the execution plan expects more than given number of rows scanned

--filter-explain-table=FILTER_EXPLAIN_TABLE

Only output queries where given table is used

--filter-explain-temporary

Only output queries where execution plan indicates use of temporary tables

--filter-explain-total-rows-exceed=FILTER_EXPLAIN_TOTAL_ROWS_EXCEED

Only output queries where execution plan expects at least as given total number of rows scanned

--filter-query

Only output queries

--filter-query-contains=FILTER_QUERY_CONTAINS

Only consider queries containing given text

-H HOST, --host=HOST

MySQL host (default: localhost)

--include-existing

Include possibly pre-existing entries in the general log table (default: disabled)

-p PASSWORD, --password=PASSWORD

MySQL password

-P PORT, --port=PORT

TCP/IP port (default: 3306)

-s SLEEP_TIME, --sleep-time=SLEEP_TIME

Number of seconds between log polling (default: 1)

-S SOCKET, --socket=SOCKET

MySQL socket file. Only applies when host is localhost

-t TIMEOUT_MINUTES, --timeout-minutes=TIMEOUT_MINUTES

Auto disconnect after given number of minutes (default: 1)

-u USER, --user=USER

MySQL user

-v, --verbose

Print user friendly messages

ENVIRONMENT

Requires MySQL 5.1 or newer, python 2.3 or newer. python-mysqldb must be installed in order to use this tool. You can
apt-get install python-mysqldb
or
yum install mysql-python

Please note that in MySQL versions < 5.1.46 you may run into this bug, in which MySQL allows for invalid general_log CSV tables. You may wish to either upgrade your MySQL version or issue:

mysql> ALTER TABLE mysql.general_log ENGINE=MyISAM;

SEE ALSO

LICENSE

This tool is released under the BSD license.
Copyright (c) 2008 - 2010, Shlomi Noach
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the organization nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR 
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

AUTHOR

Shlomi Noach
 
openark kit documentation