MySQL InnoDB Data Dictionary – How to Limit Size

Tags: innodb, mysql

I would like to downgrade the AWS instance to save costs; however, due to the number of tables and the workload (logical backups), the InnoDB data dictionary grows extremely large.

mysql> SELECT COUNT(*) FROM information_schema.INNODB_SYS_TABLES;
+----------+
| COUNT(*) |
+----------+
|  1020034 |
+----------+
1 row in set (4.61 sec)

mysql> SELECT COUNT(*) FROM information_schema.INNODB_SYS_INDEXES;
+----------+
| COUNT(*) |
+----------+
|  2628181 |
+----------+
1 row in set (4.99 sec)

mysql> SHOW GLOBAL VARIABLES LIKE 'table_%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| table_definition_cache     | 400   |
| table_open_cache           | 1     |
| table_open_cache_instances | 1     |
+----------------------------+-------+
3 rows in set (0.00 sec)

mysql> SHOW GLOBAL STATUS LIKE 'Open%table%';
+--------------------------+----------+
| Variable_name            | Value    |
+--------------------------+----------+
| Open_table_definitions   | 400      |
| Open_tables              | 1        |
| Opened_table_definitions | 2312885  |
| Opened_tables            | 22403462 |
+--------------------------+----------+
4 rows in set (0.00 sec)

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_mem_dictionary';
+-----------------------+------------+
| Variable_name         | Value      |
+-----------------------+------------+
| Innodb_mem_dictionary | 4517841711 |
+-----------------------+------------+
1 row in set (0.00 sec)

table_definition_cache

For InnoDB, table_definition_cache acts as a soft limit for the number
of open table instances in the InnoDB data dictionary cache. If the
number of open table instances exceeds the table_definition_cache
setting, the LRU mechanism begins to mark table instances for eviction
and eventually removes them from the data dictionary cache.

Any idea why this is not happening? I was wondering if enabling innodb_file_per_table would help in this case.
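
For reference, this is how I would check the option I am considering, and re-sample the dictionary size over time (I have not changed anything yet):

SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';

-- sampled periodically to see whether the dictionary keeps growing
SHOW GLOBAL STATUS LIKE 'Innodb_mem_dictionary';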

Some extra info:

mysql> SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 8589934592 |
+-------------------------+------------+
1 row in set (0.00 sec)

mysql> SELECT @@version, @@version_comment;
+--------------------+---------------------------------------------------------------------------------------------------+
| @@version          | @@version_comment                                                                                 |
+--------------------+---------------------------------------------------------------------------------------------------+
| 5.6.30-76.3-56-log | Percona XtraDB Cluster (GPL), Release rel76.3, Revision aa929cb, WSREP version 25.16, wsrep_25.16 |
+--------------------+---------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

# free -m
              total        used        free      shared  buff/cache   available
Mem:          14881       14239         326           6         315         378
Swap:          4095        1237        2858

# ps -o %mem,rss,vsize -C mysqld
%MEM   RSS    VSZ
94.2 14364148 16262404

Best Answer

1 million tables?! That, alone, is likely to choke MySQL. What are you doing?

  • Creating a table for each 'user' or 'company' or whatever? That has to stop.
  • Creating lots of partitions in tables? That is inefficient.

Run this; it might give a clue as to whether it is tables, partitions, or what-not:

( SELECT * FROM information_schema.INNODB_SYS_TABLES
      ORDER BY RAND() LIMIT 33 ) ORDER BY name;

Patterns of database or table names may be obvious. Names ending like #p#p200805 are partition names. If it's lots of tables or databases, let's discuss how to consolidate them. Please provide some hints about what your tables contain. Pending that, I will offer some generic suggestions.
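
If it helps, here is a rough way to see which databases dominate and whether the entries are mostly partitions (this assumes the 5.6 db/table#p#partition naming in INNODB_SYS_TABLES):

SELECT SUBSTRING_INDEX(name, '/', 1)  AS db,
       SUM(name LIKE '%#p#%')         AS partition_entries,
       SUM(name NOT LIKE '%#p#%')     AS plain_table_entries,
       COUNT(*)                       AS total_entries
FROM information_schema.INNODB_SYS_TABLES
GROUP BY db
ORDER BY total_entries DESC
LIMIT 20;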

Instead of having one-table-per-entity, have one table. In it have an id that indicates which entity is involved. ("Entity" is "user", "company", or whatever you are creating new tables based on.)

Then, to get away from one-table-per, you will need a lot of disk space and downtime to copy the million tables into a handful, adding the id as you go.
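
As a rough illustration only (table and column names here are invented; your real schema will differ):

CREATE TABLE combined (
    entity_id INT UNSIGNED NOT NULL,    -- which user/company the row belongs to
    id        BIGINT UNSIGNED NOT NULL, -- the old per-entity primary key
    payload   TEXT,                     -- stand-in for your real columns
    PRIMARY KEY (entity_id, id)
) ENGINE=InnoDB;

-- Scripted over every old table: copy it in with its entity id, then DROP it.
INSERT INTO combined (entity_id, id, payload)
SELECT 42, id, payload FROM entity_00042;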

If it's partitions, let's discuss that. More than about 50 partitions in a single table leads to inefficiencies. Tables with fewer than a million rows should not (in general) be partitioned. Please provide SHOW CREATE TABLE, since there are many possible issues. I give some answers here.
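
To check whether partitions are the bulk of it, something like this (standard INFORMATION_SCHEMA, so it should work on your 5.6 build) lists the most heavily partitioned tables:

SELECT table_schema, table_name, COUNT(*) AS partition_count
FROM information_schema.PARTITIONS
WHERE partition_name IS NOT NULL
GROUP BY table_schema, table_name
ORDER BY partition_count DESC
LIMIT 20;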

Caveat: Much of what I have said will be invalid with version 8.0 and its integrated Data Dictionary; there, lots of databases, tables, or partitions is not much of a problem.