Mysql – InnoDB: secondary key that’s ‘parallel’ to primary–do joins wherein the resulting primary keys are contiguous perform better

indexMySQLprimary-key

Say I have a table

(
person_id INT AUTO_INCREMENT, 
birthday_timestamp BIGINT, 
country, 
state, 
..., 
PRIMARY KEY (person_id), 
KEY (birthday_timestamp)
)

where people are added the instant they're born. In other words, the order of person_id is the same as that of birthday_timestamp. More formally, if I had two rows, A, and B,

A.person_id < B.person_id --> A.birthday_timestamp <= B.birthday_timestamp

Would it be reasonable to assume that performance on a WHERE by birthday_timestamp range should not be significantly worse than the equivalent WHERE on person_id range?

My rudimentary understanding of how this would be executed is the engine would fetch the list of person_id in the secondary index on birthday_timestamp and then fetch those results from the clustered index. Since these person_ids will be 100% contiguous, this should lead to minimal fetching of clustered index pages, right?

In contrast, if instead of birthday_timestamp, I had something whose order was completely irrelevant…say, favorite_number, this wouldn't be the case. In general, the person_id's corresponding to favorite numbers between 15 and 1,000,000 would be completely spread out and non-contiguous.

Back to the former case. The one alternative I can think of that wouldn't involve a join at all is the following. Say I want person_id, country, etc., for people born between 0 and 12345678. I could do

select * from people where birthday_timestamp >= 0 and birthday_timestamp <= 12345678

Which is what I described above and I assume would necessitate a join since MySQL doesn't actually know the keys are parellel, OR I could do two (almost)-point queries into the secondary index to grab the primary key range myself, i.e.

start = $(select person_id from people where birthday_timestamp >= 0 order by birthday_timestamp asc limit 1)

end = $(select person_id from people where birthday_timestamp <= 12345678 order by birthday_timestamp desc limit 1)

select * from people where person_id >= start and person_id <= end

Which seems like it could be a bit more optimal, though not by an order of magnitude. The downside is it's less concise.

Any thoughts?

Thanks

Best Answer

a) .. engine would fetch the list of person_id in the secondary index on birthday_timestamp and then fetch those results from the clustered index .. - Only InnoDB uses clustering by PK, MyISAM uses HEAP tables (no clustering) and the primary key is just another index with all indexes using pointer/offset to the heap to find the right row.

b) sequential vs random order of fetches only matters when you read data pages from disk, but when your query by birthday fetches first row from new page, innodb loads the entire page anyway and fetching next row of following born is "free" - no read from disk is needed because the page has already been loaded - it does not matter that mysql does not know the primary and secondary indexes are "parallel", it will have the right effect even without this knowledge - but the effect might be diminished if you have long rows so not many of them in one page. And MySQL won't use such index if it leads to reading more than ~20% of the table because it will prefer the clustered index for full scan in such case.

c) clustering by primary key does not actually mean that all the rows are stored totaly sequentialy in some file. It means that the table is organized as a B-Tree ordered by the PK. That B-Tree is stored as separate pages (nodes), each one containing some number of rows (depending on row size). So one page contains rows whose PK values are close together. But there is no guarantee that the logically next page will be stored physically next to the previous one. It may be "anywhere" in the table file (or in the big ibdata file if tables are not stored separately). When you only append to the table and never delete, you may expect that the new page will usually be allocated at the end so after the previous one, but there are possibilities of page splits even in that case and more so if the records are updated and deleted. That leads to "fragmentation" and non-continuous storage of the B-Tree pages. Afaik there is no direct defragmentation the way MyISAM offers (but rebuild might do that) for this but from what I gathered in the most extreme cases it may slow down table scans significantly. This is separate from file-level fragmentation but is very similar to that.

Mostly I agree with your assesment of the situation, the 3-query version looks promising and might be worth benchmarking for some cases. But you asked for some thoughts and this is what I got.