oak-hook-general-log
NAME
oak-hook-general-log: hook up and filter general log entries based on entry type or execution plan criteria.SYNOPSIS
Hook up the general log and dump everything:oak-hook-general-log --user=root --socket=/tmp/mysql.sockOnly dump connect/disconnect events:
oak-hook-general-log --user=root --socket=/tmp/mysql.sock --filter-connectionSame as above, use defaults file:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-connectionOnly dump queries which make for a full table scan and use the City table:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-fullscan --filter-explain-table=CityOnly dump queries which make for a filesort:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-filesortOnly dump queries which make for an implicit temporary table creation:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-temporaryOnly dump queries where the execution plan expects at least 100,000 rows in any of the tables:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-rows-exceed=100000Only dump queries where the execution plan expects at least 1,000,000 rows combined:
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-total-rows-exceed=1000000Only dump queries where the execution plan uses given index (`first_name` index on `authors` table):
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-key=actor.first_nameOnly dump queries where the execution plan contains given text anywhere (e.g. in key name, ref, etc.) This serves as a grep on the execution plan.
oak-hook-general-log --defaults-file=/home/myuser/.my-oak.cnf --filter-explain-contains=my_column_name
DESCRIPTION
oak-hook-general-log hooks up to a MySQL >= 5.1 server, and dumps general log to standard output, allowing for sophisticated filtering.
Query filtering relies not only on query text (for which grep or awk are good tools) but rather on query type and query execution plan. It is possible to only dump connect/disconnect queries; queries which make for a table scan; queries which scan more than 100,000 rows; or queries which use a specific index.
The tool activates the server's general log, and instructs it to write to log table (mysql.general_log). The general_log table is periodically polled for new entries and rotated. Tool's output corresponds to the general_log table schema:
mysql-5.1.51> DESC mysql.general_log; +--------------+------------------+------+-----+-------------------+-----------------------------+ | Field | Type | Null | Key | Default | Extra | +--------------+------------------+------+-----+-------------------+-----------------------------+ | event_time | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP | | user_host | mediumtext | NO | | NULL | | | thread_id | int(11) | NO | | NULL | | | server_id | int(10) unsigned | NO | | NULL | | | command_type | varchar(64) | NO | | NULL | | | argument | mediumtext | NO | | NULL | | +--------------+------------------+------+-----+-------------------+-----------------------------+
A sample output may look like this:
2010-12-07 11:30:23 root[root] @ localhost [] 8 1 Connect root@localhost on 2010-12-07 11:30:23 root[root] @ localhost [] 8 1 Query select 1 2010-12-07 11:30:39 root[root] @ localhost [] 8 1 Query show processlist 2010-12-07 11:30:45 root[root] @ localhost [] 8 1 Query desc mysql.general_log
Activation of the general_log, and in particular writing to the general_log table makes for considerable overhead. The tool is configured to automatically terminate after one minute from the time of execution, or as configured by --timeout-minutes. Upon termination the tool restores original general_log settings.
Interrupting the tool via Ctrl-C makes for a graceful termination, and original settings are restored. Any other means of termination (e.g. via kill -9 results with the general_log remaining active. Make sure to disable it!
The reason the general_log table is used is that the general_log file is in inconsistent format, and lacks some information such as user & host, which must be cross referenced to previous entries. This makes it impossible to diagnose existing connections when no log entries are present.
Even so, the general_log table lacks the current database for the given connection, information which is critical to understanding the context of the query (this information can be found in the slow_log table). To overcome this limitation, oak-hook-general-log cross-references the general log entries with PROCESSLIST entries. There may still be mismatches, but on most situations this should work well.
When asked to filter by execution plan criteria, oak-hook-general-log invokes an EXPLAIN query for each SELECT query encountered. This may further slow down your overall performance if execution plans are complicated. However, you should note it does not delay the execution of the query itself. In fact, the execution plan is calculated after the query is invoked.
It is possible to specify multiple filter criteria. For a query to be logged, it must answer for all specified filters.
OPTIONS
--ask-passPrompt for password.
--debugPrint stack trace on error.
--defaults-file=DEFAULTS_FILERead from MySQL configuration file. Overrides --user, --password, --socket, --port.
Configuration needs to be in the following format:
[client]
user=my_user
password=my_pass
socket=/tmp/mysql.sock
port=3306
Only output connect/disconnect entries
--filter-explain-contains=FILTER_EXPLAIN_CONTAINSOnly output queries whose execution plan contains given text
--filter-explain-filesortOnly output queries where execution plan indicates filesort
--filter-explain-fulljoinOnly output queries where execution plan indicates full join
--filter-explain-fullscanOnly output queries where execution plan indicates
--filter-explain-indexscanOnly output queries where execution plan indicates full index scan
--filter-explain-key=FILTER_EXPLAIN_KEYOnly output queries where given key is used (specify key_name or table_name.key_name)
--filter-explain-rows-exceed=FILTER_EXPLAIN_ROWS_EXCEEDOnly output queries where some path in the execution plan expects more than given number of rows scanned
--filter-explain-table=FILTER_EXPLAIN_TABLEOnly output queries where given table is used
--filter-explain-temporaryOnly output queries where execution plan indicates use of temporary tables
--filter-explain-total-rows-exceed=FILTER_EXPLAIN_TOTAL_ROWS_EXCEEDOnly output queries where execution plan expects at least as given total number of rows scanned
--filter-queryOnly output queries
--filter-query-contains=FILTER_QUERY_CONTAINSOnly consider queries containing given text
-H HOST, --host=HOSTMySQL host (default: localhost)
--include-existingInclude possibly pre-existing entries in the general log table (default: disabled)
-p PASSWORD, --password=PASSWORDMySQL password
-P PORT, --port=PORTTCP/IP port (default: 3306)
-s SLEEP_TIME, --sleep-time=SLEEP_TIMENumber of seconds between log polling (default: 1)
-S SOCKET, --socket=SOCKETMySQL socket file. Only applies when host is localhost
-t TIMEOUT_MINUTES, --timeout-minutes=TIMEOUT_MINUTESAuto disconnect after given number of minutes (default: 1)
-u USER, --user=USERMySQL user
-v, --verbosePrint user friendly messages
ENVIRONMENT
Requires MySQL 5.1 or newer, python 2.3 or newer. python-mysqldb must be installed in order to use this tool. You canapt-get install python-mysqldbor
yum install mysql-python
Please note that in MySQL versions < 5.1.46 you may run into this bug, in which MySQL allows for invalid general_log CSV tables. You may wish to either upgrade your MySQL version or issue:
mysql> ALTER TABLE mysql.general_log ENGINE=MyISAM;
SEE ALSO
LICENSE
This tool is released under the BSD license.Copyright (c) 2008 - 2010, Shlomi Noach All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the organization nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.