MySQL – How to measure how much of a MySQL 8 InnoDB table is used to store BLOBs

amazon-rds, disk-space, innodb, MySQL, mysql-8.0

I'm trying to understand a discrepancy in the file size on disk of my tables and the size of the information stored in them.

I'm using MySQL 8 hosted on RDS (InnoDB, one file per table), so getting direct access to the disk is unfortunately impossible.

One table in particular demonstrates the problem well:

mysql> describe prod_insti.workItemContentVersion;
+-----------------------------+--------------+------+-----+---------+-------+
| Field                       | Type         | Null | Key | Default | Extra |
+-----------------------------+--------------+------+-----+---------+-------+
| parentId                    | int(11)      | NO   | PRI | NULL    |       |
| serialNumber                | int(11)      | NO   | PRI | NULL    |       |
| previousVersionSerialNumber | int(11)      | YES  |     | NULL    |       |
| authorPrincipalName         | varchar(256) | YES  |     | NULL    |       |
| textAnswer                  | longtext     | YES  |     | NULL    |       |
| textAnswerContentType       | varchar(10)  | YES  |     | NULL    |       |
| blobAnswer                  | longblob     | YES  |     | NULL    |       |
| contentSavedOn              | datetime     | YES  |     | NULL    |       |
+-----------------------------+--------------+------+-----+---------+-------+

This answer indicates that the correct way to get the size of an InnoDB table is like so:

mysql> SELECT
    ->     table_name AS `Table`,
    ->     round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
    -> FROM information_schema.TABLES
    -> WHERE table_schema = 'prod_insti'
    -> AND table_name = 'workItemContentVersion';
+------------------------+------------+
| Table                  | Size in MB |
+------------------------+------------+
| workItemContentVersion | 1398809.50 |
+------------------------+------------+

However, my actual disk usage reported through RDS is way higher than this. I went trawling through the information_schema tables and eventually found the following information, which aligns much more closely with the observed disk usage:

mysql> SELECT
    ->     name,
    ->     row_format,
    ->     round(file_size / 1024 / 1024) AS 'File size in MB',
    ->     round(allocated_size / 1024 / 1024) AS 'Allocated size in MB'
    -> FROM information_schema.INNODB_TABLESPACES
    -> WHERE name = 'prod_insti/workItemContentVersion';
+-----------------------------------+------------+-----------------+----------------------+
| name                              | row_format | File size in MB | Allocated size in MB |
+-----------------------------------+------------+-----------------+----------------------+
| prod_insti/workItemContentVersion | Compressed |         2779672 |              2779677 |
+-----------------------------------+------------+-----------------+----------------------+

Scrutinizing the reference documentation here, it seems like the DATA_LENGTH attribute is actually the rounded-to-the-nearest-page length of the clustered index for an InnoDB table.

Now, it further seems (blog article, official docs) that when the ROW_FORMAT of an InnoDB table is DYNAMIC or COMPRESSED, BLOB and LONGTEXT columns are mostly not stored in the clustered index, but rather in their own 'overflow' pages (or something like that? I'm not sure exactly what this really means).

So, my hypothesis is that DATA_LENGTH simply doesn't report the space used by the BLOB or TEXT columns in this table.

My question:

I'm looking for confirmation that this makes sense; and if so, can I sensibly assume that the disk size used by BLOB/TEXT columns in this table is equal to
allocated_size from information_schema.INNODB_TABLESPACES minus (data_length + index_length) from information_schema.TABLES? Or… are there other things that could be using this space?

If so, I should be able to get a lot of space back by replacing the blobs with pointers to files in S3.
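
(If that's right, the difference should be computable in one query. The join on CONCAT(table_schema, '/', table_name) is my guess at the tablespace naming, based on the name I saw above:)

mysql> SELECT
    ->     t.table_name AS `Table`,
    ->     round((ts.allocated_size - (t.data_length + t.index_length)) / 1024 / 1024)
    ->         AS 'Unaccounted-for MB (blobs, if my hypothesis is right)'
    -> FROM information_schema.TABLES t
    -> JOIN information_schema.INNODB_TABLESPACES ts
    ->     ON ts.name = CONCAT(t.table_schema, '/', t.table_name)
    -> WHERE t.table_schema = 'prod_insti'
    -> AND t.table_name = 'workItemContentVersion';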


I've also considered that the discrepancy might be due to unallocated space, e.g. if there were many deletes in the past. However, I think that would show up as DATA_FREE here, and it looks like it's negligible to me:

mysql> SELECT
    ->     table_name AS `Table`,
    ->     round(data_free / 1024 / 1024) AS 'Data free in MB',
    ->     round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
    -> FROM information_schema.TABLES
    -> WHERE table_schema = 'prod_insti'
    -> AND table_name = 'workItemContentVersion';
+------------------------+-----------------+------------+
| Table                  | Data free in MB | Size in MB |
+------------------------+-----------------+------------+
| workItemContentVersion |               2 | 1398809.50 |
+------------------------+-----------------+------------+

This makes sense to me as it's a rapidly-growing table and I wasn't aware of any deletes happening. Having a look for high DATA_FREE values in other tables I haven't found anything in the same order of magnitude as the DATA_LENGTH values, so I don't think this is my problem.

I also thought that maybe I need to run OPTIMIZE TABLE on this table, but… I can't figure out what information would indicate that it would be necessary or helpful. Is an unusually high DATA_FREE value the indicator that you need to run OPTIMIZE TABLE?

Best Answer

 SHOW TABLE STATUS LIKE 'workItemContentVersion';

will probably give you the 1.4TB that you got from the I_S query. But give it a try.

What is the value of innodb_page_size? Normally it is 16384, but given that the table is COMPRESSED and your two numbers differ by almost exactly a factor of 2, I wonder.
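
Quick to check:

 SELECT @@innodb_page_size;  -- normally 16384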

"Data_free" give only one of several "free" space in a table. So, alas, there is no decent metric to decide when to OPTIMIZE TABLE. Only in rare case, such as lots of deletes that won't be replaced, is OPTIMIZE worth doing. And your 1.4TB (or 2.8TB) table will take a long time.

What is the setting of innodb_file_per_table? It had better be 1, else the optimize will just make things worse. Even so, you will need a few TB of free space to run it.
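
Also quick to check:

 SELECT @@innodb_file_per_table;  -- should be 1 (ON)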

2MB for Data_free is actually abnormally small. In virtually all tables I see, the number is 4MB-7MB (or more).

The data and the PRIMARY KEY coexist in one BTree that is ordered by that PK. Each secondary index lives in a separate BTree, where the leaf 'rows' contain the PK.

Big blobs (and big texts, and sometimes big varchars) do not live in the above-mentioned BTrees; they are off in some other blocks. Some call this "off-record storage".
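
If you want to see how big each of those BTrees is, the persistent statistics have a per-index page count. A sketch (the 'size' stat is in pages; as far as I know the off-record blob pages are not counted here, and for a COMPRESSED table the on-disk pages may be smaller than @@innodb_page_size, so treat the MB figures as approximate):

 SELECT
     index_name,
     round(stat_value * @@innodb_page_size / 1024 / 1024) AS 'Approx size in MB'
 FROM mysql.innodb_index_stats
 WHERE database_name = 'prod_insti'
   AND table_name = 'workItemContentVersion'
   AND stat_name = 'size';

The PRIMARY row is the clustered index, that is, the data.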

COMPRESSED is typically benchmarked at only 2x shrinkage. (Unlike most compression routines, which shrink text/xml/code/etc by 3x.)

OPTIMIZE TABLE essentially copies the table over and rebuilds the indexes. For the data BTree, this is relatively straightforward, and it squeezes out most unused space from that BTree. The indexes may or may not shrink; I think they are rebuilt, not copied over.

If you need to do any maintenance on the table (add columns, change datatypes, etc), I suggest you simply do the ALTER. In some situations, it does the optimize work. Better yet, using ALGORITHM=COPY would force it to do so.
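
For reference, the two rebuild forms on your table (both take a long time and need free disk space):

 OPTIMIZE TABLE prod_insti.workItemContentVersion;
 -- InnoDB implements this as a recreate + analyze

 ALTER TABLE prod_insti.workItemContentVersion
     ENGINE=InnoDB, ALGORITHM=COPY;
 -- forces the full table copy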

I would hope that Data_length includes the blobs, but I don't have proof.

Deletes do not necessarily show up in Data_free. That metric only counts entirely free blocks. A BTree is a bunch of blocks. InnoDB makes some attempt to combine adjacent blocks after deletes. But, in the best of worlds, each BTree block is between 50% and 100% full, for an average of 69%. That partial fill is not reflected in Data_free.

So, for example, if you delete every third row in a table, Data_length and Data_free are not likely to change. Deleting 2 out of each 3 is likely to move some blocks from used to free.

Compression is at the block level. So, the buffer_pool tends to have two copies of many blocks -- both a compressed and an uncompressed copy. That is, disk space is saved, but the buffer_pool is impacted. I prefer to compress in the client rather than burdening the server. This offloads the CPU and achieves 3x compression in more cases.

For rows with short text/blob, that data will be stored on-record. That is, the off-record storage is reserved for big columns. (This varies with the ROW_FORMAT, which I assume is COMPRESSED for your table.)
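
And if you want a ground-truth number for the blob/text payload itself, independent of how InnoDB lays out the pages, a full scan gives it directly. On a table this size it will read terabytes, so run it off-hours:

 SELECT
     COUNT(*) AS 'Rows',
     round(SUM(OCTET_LENGTH(blobAnswer)) / 1024 / 1024) AS 'blobAnswer MB',
     round(SUM(OCTET_LENGTH(textAnswer)) / 1024 / 1024) AS 'textAnswer MB'
 FROM prod_insti.workItemContentVersion;

Comparing that total against allocated_size should tell you roughly how much the move to S3 would actually reclaim.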