Mysql – In MySQL if column X has unique values what’s the difference between UNIQUE index and B-Tree index

btreeindexMySQLunique-constraint

Let's assume that I have table data:

CREATE  TABLE `test`.`data` (
`row_id` INT NOT NULL ,
`data_id` INT NOT NULL ,
PRIMARY KEY (`row_id`) );

Now, let's assume that data_id has unique values. Is there any difference(space, performance, memory) between having B-Tree index on data_id and UNIQUE index on it? (besides the case when InnoDB is used as storage engine and UNIQUE index can be used as clustering key so row_id is uneccesary)

Best Answer

According to the Book

Understanding MySQL Internals

Page 198 says the following:

  • Paragraph 6 : A MyISAM B-tree consists of leaf and nonleaf nodes, or pages.
  • Paragraph 7 : Both leaf and nonleaf nodes key values and pointers to the record positions in the datafile. Nonleaf nodes additionally contains pointers to child nodes.

Given this description

  • a unique index would have more nonleaf nodes. This would lend itself mode towards ordering and searching.
  • nonunique index would requires less nonleaf nodes. This would lend itself mode towards doing range scans (tables and index)
  • A covering index (which would contain all needed columns for specific SELECT queries) would combine the best and worst of both. This would allow for a range scans that would need more nonleaf nodes and provide for ordering/searching. The additional benefit would be bypassing the need for reading table data if all needed columns are present and accounted for in the index.
  • I mentioned other aspects in my past post Benefits of BTREE in MySQL

SPACE

  • Unique Indexes would take up more space for nonleaf nodes.
  • Covering indexes would need more space than a nonunqiue index but would have a higher multi-columns index cardinality that can approach the actual row count of the table.

PERFORMANCE (Heavy-Write Environment)

  • SELECTs would be very fast for exact keys
  • SELECTs that only touch a Covering Index would be extremely fast
  • Random INSERTs can be a nightmare due to rotations/rebalancing of nonleaf nodes
  • UPDATEs that change key values can be a nightmare due to rotations/rebalancing of nonleaf nodes
  • The higher the cardinality, the more nonleaf nodes, the better the performance
  • The lower the cardinality, the fewer the nonleaf nodes, the worse the performance
  • The more columns in the index increases the effects of the cardinality

MEMORY

Depends on Storage Engine