Index fragmentation is much overrated. Do not worry about it.
Two adjacent, somewhat-empty blocks are merged together by InnoDB as part of its normal processing.
Random actions on a BTree cause it to naturally gravitate toward an average of 69% full. Sure, this is not 100%, but the overhead of "fixing" it is not worth it.
SHOW TABLE STATUS gives you some metrics, but they are flawed -- "Data_free" includes certain "free" space, but not other "free" space.
There is unused space within each block; entirely free 16KB blocks; free "extents" (multi-MB chunks); MVCC rows waiting to be reaped; non-leaf nodes with their own fragmentation; etc.
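To see what metrics are (and are not) exposed, you can look at Data_free directly; the schema and table names here are placeholders:

```sql
-- Per-table status, including the (incomplete) Data_free figure:
SHOW TABLE STATUS LIKE 'mytable'\G

-- Or query information_schema directly:
SELECT table_name, data_length, index_length, data_free
    FROM information_schema.TABLES
    WHERE table_schema = 'mydb'
      AND table_name = 'mytable';
```

Keep in mind that Data_free reflects only some of the kinds of "free" space listed above.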
Percona and Oracle have different ways of measuring how big (in number of blocks) an index is. I find neither of them useful because of the limited definition of "free". Blocks (16KB each) are allocated in chunks (several MB), thereby leading one to believe that there is all sorts of fragmentation. In reality, it is usually just most of one of these multi-MB chunks. And OPTIMIZE TABLE does not necessarily recoup any of the space.
If SQL Server is using BTrees, then it is lying to say that there is "no fragmentation". Think of what happens on a "block split". Or think of the overhead of continually defragmenting. Either way you lose.
Further note that a table and an index are essentially identical structures:
- B+Tree, based on some index
- The "data" is based on the PRIMARY KEY; each secondary index is a B+Tree based on its index.
- The leaf node of the "data" contains all the columns of the table.
- The leaf node of a secondary index contains the columns of that secondary index, plus the columns of the PRIMARY KEY.
If you have innodb_file_per_table = ON, you can clearly see the shrinkage (if any) after OPTIMIZE TABLE by looking at the .ibd file's size. With it OFF, the info is buried in ibdata1, but SHOW TABLE STATUS may be reasonably accurate since all "free" space belongs to every table. Well, except for the pre-allocated chunks.
You may notice that a freshly optimized file-per-table table has exactly 4M, 5M, 6M, or 7M of Data_free. Again, this is the pre-allocation, and the failure to give you the minute details.
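A sketch of the before/after check (table names are placeholders; the exact Data_free value will vary with the pre-allocation described above):

```sql
-- Confirm the setting:
SELECT @@innodb_file_per_table;

-- Rebuild; for InnoDB this is reported as "recreate + analyze":
OPTIMIZE TABLE mydb.mytable;

-- Compare the .ibd file size on disk, or check Data_free again:
SELECT data_free
    FROM information_schema.TABLES
    WHERE table_schema = 'mydb'
      AND table_name = 'mytable';
```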
I have worked with InnoDB for over a decade; I have worked with thousands of different tables, large and small. I say that only one table in a thousand really needs OPTIMIZE TABLE. Using it on other tables is a waste.
105 columns is a lot, but perhaps not too many.
Do you have 55 indexes on one table? That is bad. That is 55 updates per INSERT. Let's discuss that further. Keep in mind that INDEX(a) is useless if you also have INDEX(a,b). And INDEX(flag) is useless because of low cardinality. (But INDEX(flag, foo) may be useful.)
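As an illustration of those rules (table and index names are hypothetical):

```sql
CREATE TABLE t (
    id   INT PRIMARY KEY,
    a    INT,
    b    INT,
    flag TINYINT,
    foo  INT,
    INDEX a_only   (a),         -- redundant: a_b below is a superset
    INDEX a_b      (a, b),
    INDEX flag_only (flag),     -- useless: low cardinality
    INDEX flag_foo  (flag, foo) -- may be useful
);

-- Cleanup of the redundant/useless ones:
ALTER TABLE t
    DROP INDEX a_only,
    DROP INDEX flag_only;
```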
Q1: There is no good way to check for all forms of fragmentation in either the data or the secondary indexes.
Q2, Q3: OPTIMIZE TABLE rebuilds the table by CREATEing a new table, INSERTing all the rows, then RENAMEing and DROPping. The re-inserting of the data in PK order assures that the data is well-defragmented. The indexes are another matter.
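Those steps are roughly equivalent to the following sketch (names are placeholders; OPTIMIZE TABLE handles locking and edge cases that this naive version does not):

```sql
CREATE TABLE t_new LIKE t;
INSERT INTO t_new SELECT * FROM t;        -- rows arrive in PK order, defragmenting the data
RENAME TABLE t TO t_old, t_new TO t;      -- atomic swap
DROP TABLE t_old;
```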
Q4: You could DROP and re-CREATE each index to clean it up. But this is an extremely slow process. 5.6 has some speedups, but I don't know if they help with defragmentation. It is also possible to ALTER TABLE ... DISABLE KEYS, then ENABLE them. This may do a more efficient rebuild of all the secondary indexes at once.
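Sketches of both approaches (index and table names are placeholders; note that per the MySQL documentation, DISABLE KEYS affects only MyISAM nonunique indexes -- InnoDB ignores it with a warning):

```sql
-- One index at a time (slow):
ALTER TABLE t DROP INDEX idx_a, ADD INDEX idx_a (a);

-- The DISABLE/ENABLE pair:
ALTER TABLE t DISABLE KEYS;
ALTER TABLE t ENABLE KEYS;
```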
There's a big difference between your query and the query the UI sends (which you can spy on using a server-side trace, Extended Events, etc.). In yours, the last argument to sys.dm_db_index_physical_stats is DETAILED:
... sys.dm_db_index_physical_stats( ..., 'DETAILED') ...
Whereas the UI is on a budget and sends SAMPLED instead:
... sys.dm_db_index_physical_stats( ..., 'SAMPLED') ...
DETAILED scans all of the pages, while SAMPLED only scans 1 percent. That's right, 1%. You obviously pay for the extra accuracy, but relying on info about every 100th page to make any judgments about the state of the index is, well, not very useful. You could always change your manual query to SAMPLED if your goal is just to match the uselessness of the UI. :-)
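A fuller T-SQL sketch of the manual query (database and table names are placeholders):

```sql
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       ips.index_id,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID('MyDb'), OBJECT_ID('dbo.MyTable'),
         NULL, NULL, 'DETAILED') AS ips;
-- Swap 'DETAILED' for 'SAMPLED' (1% sample on larger indexes)
-- or 'LIMITED' (parent-level pages only) to trade accuracy for speed.
```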
(In fairness, this topic about the UI properties page should explicitly document the method that is used to determine fragmentation. I have a confirmation from Microsoft's Pedro Lopes that he will keep it in mind for an upcoming doc revision.)
Just never use or trust the UI. Never mind that it's a pain in the rear to look at more than one index at a time, it's also cumbersome to figure out what it's even doing, and you always have to worry about whether it is still doing what you used to think it did... it's just not worth the occasional expediency in looking up one thing rather than having a verifiable, repeatable query to run...
Best Answer
Imagine you have a heap and a record needs to be updated in a way that causes it to grow. The new record no longer fits on its DB page. As a result, a 'forwarding' record is created, pointing to the DB page where the real record now lives. This is fragmentation in your heap.
Thus, when you do a bookmark lookup on a heap with indexes on it, you will still hit the forwarders, which causes extra IO, and your indexes need to store the long RID. Rebuilding the heap to remove that fragmentation will help reclaim space, but the internal storage engine almost always prefers to work with a clustered index anyway.
Background
Just for background, there are a few kinds of fragmentation. The heap table itself might not be ordered, but it might have nonclustered indexes on it, which then need to reference the heap through a physical record ID (RID): a combination of file ID, page ID, and slot number -- basically, where the record physically sits in the database file. This is not a logical locator. With a clustered index, the locator is the clustering key (plus a "uniqueifier" if the clustered index is not unique), and the engine uses that logical "ID" to find the record.
This is one reason a heap's index rows are sometimes wider than those of a table with a unique clustered index: that locator info is stored per record in your indexes.
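To see how many forwarders a heap has accumulated, you can query the same DMV described above (names are placeholders; forwarded_record_count is populated only in DETAILED mode, and only for heaps):

```sql
SELECT ips.forwarded_record_count,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID('MyDb'), OBJECT_ID('dbo.MyHeap'),
         0, NULL, 'DETAILED') AS ips;   -- index_id 0 = the heap itself

-- Rebuilding the heap removes the forwarders:
ALTER TABLE dbo.MyHeap REBUILD;
```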