Sql-server – Why would adding an index on a MySQL table slow it down significantly but ok on SQL Server and PostgreSQL

MySQLperformancequery-performancesql server

Version MySQL 5.7.2

I'm working with dummy data with int columns.

time: 1-10mm

products: 70,000 random ints with another 30,000 that are dupes from the 70,000

volume: ranges from 500 – 1000

price: ranges from 10 – 50 though the price stays within bounds of 1-5 diff for each product row data.

from the above, 10mm rows are created by randomly selecting a product and generating the required row data

running a range query like…

select * from productdata where product >= 1500 and product <= 2000

takes about 4 seconds.

When I add an index on product using…

create index productindex on productdata(product)

The query now takes about 30 seconds. Time is the only unique column in the table but setting that as the primary key does not help either.

On SQL Server and PostgreSQL I don't see the same issues with the same data and query using a non clustered index in each. I only really have experience with writing queries for SQL Server so a bit perplexed by this. I tried PostgreSQL as well to have another db to compare to.

All databases are the latest stable versions available.

Action Output (I had to decrease the range as the original query was taking too long)…

Action Output

Without Index..

Without Index

With Index..

With index

Table status…

table status

Buffer…

enter image description here

Explain Select…

enter image description here

Show create table…

CREATE TABLE `products` (
  `time` int(11) DEFAULT NULL,
  `product` int(11) DEFAULT NULL,
  `quantity` int(11) DEFAULT NULL,
  `price` int(11) DEFAULT NULL,
  KEY `productindex` (`product`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Trying with time and product as key…

enter image description here

Best Answer

MyISAM? InnoDB? What is the cache size? (SHOW VARIABLES LIKE '%buffer%';) Did you run the query a second time (to cancel out the affect of caching)? Do you have TEXT or BLOB columns? (Please provide SHOW CREATE TABLE.) How much RAM? How big (GB) is the table (SHOW TABLE STATUS).

Probably... Definitely... The cache for the data was not large enough to hold the products in question. And, since the products are mostly "random", there is a lot of I/O.

Here's what happens in any(?) database vendor for that query:

  1. Locate the 'row' for 1500 in the index.
  2. Scan forward through the index until 2000. (Due to BTree organization in MySQL, this is quite efficient.)
  3. For each row, reach over into the "data" to get all the columns (SELECT *). (This is where vendors vary. And "Engines" in MySQL will vary.) Since the data rows are in one order, but the index rows are in a different order, MySQL will be effectively doing 'random I/O' to get the rows. (MySQL fetches the rows as needed. Other vendors may sort the row-addresses first -- perhaps a benefit, perhaps a cost.)
  4. For MySQL with InnoDB and large TEXT/BLOB columns, those columns may be stored elsewhere, thereby necessitating an extra I/O. (Hence the admonition not to use *, but to spell out only the necessary columns.)

But...

If "too much" of the table needs to be fetched -- in particular more than about 20% of the table has product in the desired range -- MySQL will do a table scan instead of using the index. (I don't know about non-MySQL vendors.) This optimization is usually beneficial, but sometimes is a mistake. The EXPLAIN SELECT .. would tell us which it did.

InnoDB "reaches into the data" via the PRIMARY KEY which is "clustered" with the data. So it is a BTree-probe to find each row. MyISAM has a byte address into the data file, so it is a fseek. Other vendors work differently.

So... Is the comparison "fair"? Do you have the PK "clustered" with the data on all vendor tests? (Etc)

And I did not get into the caching details. This is probably a major component of the slowdown.

I'll fill in more details after you provide more details.

After details

You are using InnoDB, not MyISAM (good). innodb_buffer_pool_size = 8M is much too small. For 16GB of RAM, recommend 11G. But even 1G would show a significant speedup since the table is smaller than that.

You say you are running MySQL 5.7, yet the 8M default contradicts that. Did you override the default? Please provide the version (SELECT @@version;).

It seems that there are no big TEXT or BLOB columns, so my comments on such do not apply.

Bottom line: To get a 'fair' comparison, increase the buffer_pool setting for MySQL. This is the most important tunable for performance. It is not automatically set because it depends on the amount of available RAM.