MySQL – Range Optimizer in MySQL: What is an index dive during query optimization?

index, innodb, MySQL, optimization, statistics

While following Jørgen Løland's post on range optimization, I came across the concept of an "index dive", which is performed during MySQL query planning (i.e., in the MySQL optimizer).

Quoting his post:

For as long as there has been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and using these to count the number of rows between them. This technique is accurate and is, therefore, a good basis for making the best possible query execution plan (QEP) for the query.

I have tested this myself, and the estimates were accurate in my tests too.
I have a few questions,

  1. What exactly does "to find the start and end of the range" mean? Does it mean
    getting the start page and end page of the range in the B-tree?
  2. If the answer to my first question is yes, then how is this technique accurate?
    That is, what about the pages in the middle of the range?

Best Answer

To "dive" is to drill down the B-tree from the 'root' down to the leaf node for a particular key.

Let's say you have a non-unique INDEX(x), and that there are 100 rows with x=123. For WHERE x = 123, the two dives would be to "find the first index entry for x=123" and to "find the last such entry".
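The two dives can be sketched in a few lines of Python, using a sorted array as a stand-in for the index's ordered leaf entries (this is illustrative only, not InnoDB's actual B-tree code):

```python
import bisect

# Stand-in for a non-unique index on x: the entries, kept in key order.
# In InnoDB these live in B-tree leaf pages; a sorted list models the ordering.
index_entries = sorted([7] * 40 + [123] * 100 + [456] * 25)

def estimate_range_rows(entries, key):
    """Two 'dives': locate the first and last entry equal to `key`,
    then count the rows between those two positions."""
    first = bisect.bisect_left(entries, key)   # dive 1: start of the range
    last = bisect.bisect_right(entries, key)   # dive 2: end of the range
    return last - first

print(estimate_range_rows(index_entries, 123))  # 100 rows with x = 123
```

With the positions of both endpoints known, the row count for this key falls out of simple position arithmetic, which is why equality ranges resolved by dives are so accurate.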

His first use of "accurate" (in the article) was to say that the dives are more accurate than using statistics that may be stale.

I suspect his use of "accurate" is an exaggeration in the other uses. I say this because counting the rows between the first and last dive can be no more than an estimation. Getting the exact number would require fetching every block in between -- this is too costly.

Note that the 100 rows might be in the same block, or might (if the rows are wide) be spread across multiple blocks. For the purposes of EXPLAIN and query planning, an estimate is considered "good enough". The exact ("accurate") number can be counted quickly if the start and end are in the same block; but, if they are spread across multiple blocks, I think it would only estimate the row counts for the intermediate blocks.
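The estimation idea just described can be sketched as follows. Everything here is hypothetical (page/slot layout, the average-records-per-page figure) and is meant only to illustrate the reasoning, not InnoDB's actual algorithm:

```python
# Hypothetical figure; InnoDB maintains its own per-index statistics.
AVG_RECORDS_PER_PAGE = 50

def estimate_rows(start_page, start_slot, end_page, end_slot,
                  records_in_start_page, records_in_end_page):
    """Estimate rows in a range whose endpoints were found by two dives.

    If both endpoints fall in the same leaf page, the count is exact.
    Otherwise, the first and last pages are counted exactly, and each
    intermediate page is assumed to hold the average number of records.
    """
    if start_page == end_page:
        return end_slot - start_slot  # exact: both dives hit one page
    exact_head = records_in_start_page - start_slot  # rows in the first page
    exact_tail = end_slot                            # rows in the last page
    middle_pages = end_page - start_page - 1
    return exact_head + exact_tail + middle_pages * AVG_RECORDS_PER_PAGE

# Same leaf page: exact count.
print(estimate_rows(10, 5, 10, 45, 50, 50))   # 40
# Range spanning pages 10..13: the two middle pages are only estimated.
print(estimate_rows(10, 30, 13, 20, 50, 50))  # 20 + 20 + 2*50 = 140
```

This is why the estimate is exact for narrow ranges but degrades to an approximation once the range spans pages that are never actually read.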

The "dives", even if they involve a disk-read, are usually not wasted. This is because you will (presumably) then proceed to perform the query and need those block(s) anyway.

Side note: Since Jørgen Løland wrote that article, the Optimizer has moved to a cost-based algorithm. However, the "dives" are still important in it.