redundant_keys

NAME

redundant_keys: List indexes which are made redundant (or duplicate) by other (dominant) keys.

TYPE

View

DESCRIPTION

redundant_keys finds and lists keys which are redundant; such that other existing keys can take their functionality, or that provide no better selectivity/cardinality than other existing keys.

It is in essence similar to Maatkit's mk-duplicate-key-checker, but provides information by utulizing a query instead of external scripts.

Exactly what a redundant or duplicate key is sometimes a matter of perspective. Listed below are sample cases that make or do not make for redundant keys. The trivial example where a key is redundant is when two identical keys are created. For example: KEY idx_1 (a), KEY idx_2 (a).

There is no argument that one of the keys above is redundant. However, the following case is somewhat different: KEY idx_1 (a), KEY idx_2 (a, b).

In the above, idx_1 is "covered" by idx_2. Anything idx_1 can do can also be done with idx_2, which indexes column 'a' and then some.

However, this is not the complete picture. While mathematically being redundant, we may actually desire to explicitly have idx_1 for performance reasons. Since idx_2 covers more columns, it is more bloated. For queries only searching through 'a', idx_1 may yield with better performance, since scanning through the index required less I/O (less pages to be scanned).

Moreover, for queries where idx_1 makes for a covering index (e.g. SELECT a FROM t WHERE a BETWEEN 200 AND 500, the difference may be even more significant, since we now only scan the index idx_1 and do not require access to the table. In the previous example, accessing the table added such overhead that made the difference between the two indexes smaller in comparison to total work.

The recommendations provided by redundant_keys only refer to the mathematical definition, and leave performance to the discretion of the user.

Terms:

  • KEY and INDEX are synonyms
  • A redundant or duplicate index is an index which is not mathematically required.
  • A dominant index is a key which makes another index redundant.

Important notes:

  • The view provides with a sql_drop_index column, making for SQL statement which drop the redundant keys. Do NOT take them for granted, or automate them back into MySQL. User is advised to double check all recommendations.
  • This view only considers B-Trees. This includes normal InnoDB & MyISAM indexes, and excludes FULLTEXT, HASH, GIS keys.
  • Subpart keys (indexing a prefix of a text column, e.g. KEY `name_idx` (`name` (10))) are not supported by this view. It provides with the subpart_exists column, to notify that subpart indexing is in use, but does not validate if and how redundancy is affected. User is advised to double & triple check recommendations. See examples below.
  • Circular listing of redundant keys may be possible and has not been thoroughly tested. That is, there may be groups of >= 3 keys, each making the other redundant, such that the view recommends to drop them all.

Sample cases where a key is redundant:

  • KEY idx_1 (a), KEY idx_2 (a): the trivial case with two identical keys. Either one can be dropped.
  • UNIQUE KEY idx_1 (a), KEY idx_2 (a): both index same column, but idx_1 also provides uniqueness. idx_2 is redundant.
  • UNIQUE KEY idx_1 (a, b), KEY idx_2 (a, b): same as above.
  • KEY idx_1 (a), KEY idx_2 (a, b): any query utilizing idx_1 can also utilize idx_2. idx_2 can answer for all 'a' issues. This makes idx_1 redundant. See preliminary discussion on explicit redundant keys for more on this.
  • KEY idx_1 (a), UNIQUE KEY idx_2 (a, b): same as above.
  • UNIQUE KEY idx_1 (a), KEY idx_2 (a, b): interestingly, idx_2 is redundant. To see why, note that idx_1 is UNIQUE. Since a UNIQUE key poses a constraint, which is not provided otherwise, idx_1 cannot be redundant. However, since any 'a' is unique, so is any ('a', 'b') combination. For any 'a' there can only be one 'b', since there can only be one 'a' at most. This means there is no point in further indexing column 'b' (unless for covering index purposes). There is no added value in terms of cardinality or selectivity.
  • UNIQUE KEY idx_1 (a), UNIQUE KEY idx_2 (a, b): continuing the above case, there is no need to declare that ('a', 'b') is UNIQUE, since 'a' is known to be unique. It follows that KEY suffices for idx_2. Thus it also follows that idx_2 is redundant.
  • KEY idx_1 (a), KEY idx_2 (a(10)): idx_2 only indexes 1st 10 characters of 'a'. idx_1 indexes all characters. idx_2 is redundant.
  • KEY idx_1 (a), UNIQUE KEY idx_2 (a(10)): idx_2 forces 1st 10 characters of 'a' to be UNIQUE. There is no added value to idx_1 (see preliminary discussion). idx_1 is redundant.
Sample cases where no key is redundant:
  • KEY idx_1 (a), KEY idx_2 (b): obviously, there's nothing in common between the two keys.
  • KEY idx_1 (a), KEY idx_2 (b, a): since order of column within an index is important, idx_2 cannot answer for 'a'-only queries (except by perhaps providing full index scan, outside our interest, see preliminary discussion). idx_1 therefore answers for queries not solvable by idx_2.
  • KEY idx_1 (a, b), KEY idx_2 (b, a): both indexes answer for different cases. On some access types, there is some form of waste in definition: in a many-to-many connecting table, where all queries use equality filtering (i.e. 'WHERE a = ?' as opposed to 'WHERE a > ?'), idx_1 may suffice with indexing 'a' only. However, with range condition this changes and both keys may be required.
  • UNIQUE KEY idx_1 (a, b), KEY idx_2 (b, a): same as above.
  • UNIQUE KEY idx_1 (a, b), UNIQUE KEY idx_2 (b, a): the UNIQUE constraint on either key is not strictly required. However this does not make the index itself redundant. As a side note, UNIQUE constraints are extremely helpful for MySQL query optimizer.
  • KEY idx_1 (a), KEY idx_2 (a(10), b): 'a' is text column, and idx_2 only indexes 1st 10 characters (subpart index). If any row contains more than 10 characters for 'a', idx_1 provides with indexing not supported by idx_2, and is therefore not redundant.
  • UNIQUE KEY idx_1 (a), KEY idx_2 (a(10), b): even stricter than the above.
  • KEY idx_1 (a), UNIQUE KEY idx_2 (a(10), b): idx_2 does NOT force any uniqueness on 'a' itself. It indexes less characters than idx_1. idx_1 is not redundant; nor is idx_2.

STRUCTURE

mysql> DESC common_schema.redundant_keys;
+----------------------------+--------------+------+-----+---------+-------+
| Field                      | Type         | Null | Key | Default | Extra |
+----------------------------+--------------+------+-----+---------+-------+
| table_schema               | varchar(64)  | NO   |     |         |       |
| table_name                 | varchar(64)  | NO   |     |         |       |
| redundant_index_name       | varchar(64)  | NO   |     |         |       |
| redundant_index_columns    | longtext     | YES  |     | NULL    |       |
| redundant_index_non_unique | bigint(1)    | YES  |     | NULL    |       |
| dominant_index_name        | varchar(64)  | NO   |     |         |       |
| dominant_index_columns     | longtext     | YES  |     | NULL    |       |
| dominant_index_non_unique  | bigint(1)    | YES  |     | NULL    |       |
| subpart_exists             | int(1)       | NO   |     | 0       |       |
| sql_drop_index             | varchar(223) | YES  |     | NULL    |       |
+----------------------------+--------------+------+-----+---------+-------+

SYNOPSIS

Columns of this view:

  • table_schema: schema of table with redundant index
  • table_name: table with redundant index
  • redundant_index_name: name of index suspected as redundant
  • redundant_index_columns: column covered by redundant index, comma separated, by order of definitions
  • redundant_index_non_unique: 0 if redundant index is UNIQUE; 1 if non-unique
  • dominant_index_name: name of index "covering for" the redundant index. This index is responsible for the redundant index to be redundant.
  • dominant_index_columns: column covered by dominant index, comma separated, by order of definitions
  • dominant_index_non_unique: 0 if dominant index is UNIQUE; 1 if non-unique
  • subpart_exists: 1 if either redundant or dominant keys use string subpart (indexing a prefix of a textual column); this calls for triple-check on the nature of both keys.
  • sql_drop_index: SQL statement to drop redundant index.
    Use with eval() to apply query.

EXAMPLES

Detect duplicate keys on sakila.actor:

mysql> ALTER TABLE `sakila`.`actor` ADD INDEX `actor_id_idx` (`actor_id`);

mysql> ALTER TABLE `sakila`.`actor` ADD INDEX `last_and_first_names_idx` (`last_name`, `first_name`);

mysql> ALTER TABLE `sakila`.`film_actor` ADD UNIQUE KEY `film_and_actor_ids_idx` (`film_id`, `actor_id`);

mysql> SELECT * FROM common_schema.redundant_keys \G
*************************** 1. row ***************************
              table_schema: sakila
                table_name: actor
      redundant_index_name: idx_actor_last_name
   redundant_index_columns: last_name
redundant_index_non_unique: 1
       dominant_index_name: last_and_first_names_idx
    dominant_index_columns: last_name,first_name
 dominant_index_non_unique: 1
            subpart_exists: 0
            sql_drop_index: ALTER TABLE `sakila`.`actor` DROP INDEX `idx_actor_last_name`
*************************** 2. row ***************************
              table_schema: sakila
                table_name: actor
      redundant_index_name: actor_id_idx
   redundant_index_columns: actor_id
redundant_index_non_unique: 1
       dominant_index_name: PRIMARY
    dominant_index_columns: actor_id
 dominant_index_non_unique: 0
            subpart_exists: 0
            sql_drop_index: ALTER TABLE `sakila`.`actor` DROP INDEX `actor_id_idx`
*************************** 3. row ***************************
              table_schema: sakila
                table_name: film_actor
      redundant_index_name: idx_fk_film_id
   redundant_index_columns: film_id
redundant_index_non_unique: 1
       dominant_index_name: film_and_actor_ids_idx
    dominant_index_columns: film_id,actor_id
 dominant_index_non_unique: 0
            subpart_exists: 0
            sql_drop_index: ALTER TABLE `sakila`.`film_actor` DROP INDEX `idx_fk_film_id`

ENVIRONMENT

MySQL 5.1 or newer

SEE ALSO

candidate_keys, no_pk_innodb_tables, sql_foreign_keys

AUTHOR

Shlomi Noach
 
common_schema documentation