Mysql – Where does extra bytes come from

disk-spaceMySQL

Here is my table:

+-------------+---------------+------+-----+---------+----------------+
| Field       | Type          | Null | Key | Default | Extra          |
+-------------+---------------+------+-----+---------+----------------+
| id          | int(11)       | NO   | PRI | NULL    | auto_increment |
| tckno       | bigint(20)    | YES  |     | NULL    |                |
| adid        | mediumint(9)  | NO   |     | NULL    |                |
| soyadid     | mediumint(9)  | NO   |     | NULL    |                |
| anaadid     | mediumint(9)  | NO   |     | NULL    |                |
| babaadid    | mediumint(9)  | NO   |     | NULL    |                |
| cinsiyet    | enum('E','K') | NO   |     | NULL    |                |
| dogumyeri   | mediumint(9)  | NO   |     | NULL    |                |
| dogumtarihi | date          | NO   |     | NULL    |                |
| nufusil     | tinyint(4)    | NO   |     | NULL    |                |
| nufusilce   | smallint(9)   | NO   |     | NULL    |                |
+-------------+---------------+------+-----+---------+----------------+

Now, according to https://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html my row size is

id(4) + tckno(8) + next 4 rows (4*3) + cinsiyet(1) + dogumyeri(3) + dogumtarihi(3) + nufusil(1) + nufusilce(2) = 34 bytes

I should have 48166478 (row count) * 34 = 1637660252 bytes database. But when I navigate to mysql/data/<database name> folder, I see that <table name>.idb file is 2826960896 bytes on disk. There is a discrepency of 2826960896 bytes - 1637660252 bytes ≈ 1.1GB. What is this extra data?

Here is what I get from exporting said table

CREATE TABLE `optimumtablo` (
  `id` int(11) NOT NULL,
  `tckno` bigint(20) DEFAULT NULL,
  `adid` mediumint(9) NOT NULL,
  `soyadid` mediumint(9) NOT NULL,
  `anaadid` mediumint(9) NOT NULL,
  `babaadid` mediumint(9) NOT NULL,
  `cinsiyet` enum('E','K') NOT NULL,
  `dogumyeri` mediumint(9) NOT NULL,
  `dogumtarihi` date NOT NULL,
  `nufusil` tinyint(4) NOT NULL,
  `nufusilce` smallint(9) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

ALTER TABLE `optimumtablo` ADD PRIMARY KEY (`id`);

ALTER TABLE `optimumtablo` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;

This is show table status output for said table;

+--------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| Name         | Engine | Version | Row_format | Rows     | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation         | Checksum | Create_options | Comment |
+--------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+
| optimumtablo | InnoDB |      10 | Compact    | 47621825 |             57 |  2759852032 |               0 |            0 |   4194304 |       48166479 | 2015-09-11 20:53:45 | NULL                | NULL                | utf8_general_ci   |     NULL |                |         |
+--------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-------------------+----------+----------------+---------+

Best Answer

In InnoDB, each row has, at least, the following overhead:

  • A 5 byte row header,

  • A 6 byte transaction ID,

  • A 7 byte roll pointer, which points to the segment in the redo log that contains the necessary information to reconstruct an earlier version of the row

  • A variable-length bit flag vector of size CEIL(nullable_column_count/8) which indicates whether each nullable column for that row is null (in this table, this would require 1 byte per row).

https://dev.mysql.com/doc/refman/5.6/en/innodb-physical-record.html

These are going to account for much of your storage space, since your rows themselves are very short. On this particular table, the overhead is ~19 bytes per row, which is >50% of the size of the actual stored data. If you had more columns or longer columns the overhead would not grow much if at all, and would be much smaller (proportionally) when compared to the actual data.

InnoDB also tends to leave gaps at the end of each data page, reducing (ideally) the need for page splits, later. The size of the gap varies by whether rows are inserted in primary key me order, or not. With an AUTO_INCREMENT primary key, pages should fill to somewhere around 15/16ths.

https://dev.mysql.com/doc/refman/5.6/en/innodb-physical-structure.html