MySQL – Understanding the Relation Between Tablespaces, .frm & .ibd Files, and Database Pages

data-pagesinnodbMySQLtablespaces

I looked in the documentation but there was not a clear answer regarding their relation, can someone give an real life example of their relation in MySQL?

Database pages are smallest read write unit in databases and is usually 8KB, but:

  • What is the relation between MySQL pages and .frm & .ibd files?
  • Is .frm files consists of many pages coming together?
  • If so then does the .frm files have any type of info on the pages that are inside them?
  • Like the location of each page and stuff like that?
  • Where can i read more about this?
  • And what about tablespaces?
  • Is every .frm file a tablespace?

I read their documentation for hours but it got me even more confused

(I'm talking about the InnoDB storage engine BTW)

Best Answer

(@jynus has good answers; hopefully, I am adding to his answer, not repeating too much, and not disagreeing.)

Part of the confusion comes from the evolution of InnoDB.

Keep in mind that 90+% of MySQL users can totally ignore "tablespaces" and do just fine. Understanding them is either an academic exercise or, in a few cases, important for disk management.

In the beginning, on 32-bit machines, there was one tablespace, ibdata1. However, if that was not big enough (remember, some machines had a limit of 2GB for a file), there was ibdata2, etc. Technically, this was a set of "tablespaces", but it was not touted as such.

In any case, the concept of ibdata1 as the main place for InnoDB stuff lives on.

Initially, all blocks of all tables and indexes lived in ibdata1. A problem was, and still is, that space freed up was left as free blocks in ibdata1, and could not be given back to the OS.

To allow for putting a table on a different disk, and for allowing you to DROP or ALTER a table and give space back to the Operating System, innodb_file_per_table. So, at this point, ibdata1 contained miscellany stuff, plus (optionally) some tables. Various .ibd files each contained one table (including indexes), or one PARTITION of a table.

PARTITIONing allows for breaking a table into 'subtables', each acts almost entirely like a table.

This still does not allow for good disk management. A thousand tiny tables, stored as file_per_table, wastes a lot of disk space. Putting a different database on a different filesystem is a hassle. Etc. So, in comes a more generic "tablespace".

About the same time, the handling of "tmp" and "undo" space became important. In the past, those were handled somewhat monolithically in ibdata1. But there were mutex slowdowns, etc, so they got spread around.

So what to do with "tablespaces" in 5.7 and later?

Plan A: Ignore them. The defaults will probably be good enough.

Plan B: One tablespace per database. Unfortunately, there is not 'automatic' way to do this.

Plan C: Follow the 80-20 rule. For the 20% of the tables that will consume 80% of the disk space, use file_per_table. For the rest, have them go to ibdata1 (or tablespace-per-databse).

The history of the meta info is simpler:

Pre-8.0: The .frm file contained the schema, no data; ibdata1 contained some more info.

8.0+: The new "Data Dictionary" is in InnoDB tables. (The bootstrapping is scary.) It contains the schema and lots of other stuff. There is a file on disk in the unlikely event that the DD gets hosed and you need to reconstruct the schema.

Blocks:

Each tablespace (ibdata*, .ibd) has always been broken into blocks. Initially, there was one choice: 16KB. You may hear of "extents" of 1MB(?) or 8MB(?) -- this is merely pre-allocating a bunch of blocks. This was slightly beneficial for HDDs, but probably useless for SSDs.

While it is possible to have 4/8/16/32/64KB InnoDB blocks, 99+% of users stick with the default of 16KB. They are organized in a B+Tree based on some index (the PRIMARY KEY for the data, or the secondary key for secondary indexes). A convenient Rule of Thumb: 100 items are stored in a BTree block. Correlary: a million-row table (or index) will be about 3 levels deep.