MYSQL: How to optimize a date sorted table with a triple primary key index

dateindexMySQLoptimizationorder-by

I have been maintaining a MYSQL database for a few years, dating from back when i took database managing classes. Its biggest table has since grown (15 million entries for 1,8 GB) and this query began to take minutes to load (pkey1 is a DATETIME, pkey2 and pkey3 are int(11)) :

SELECT * FROM table 
WHERE pkey2 = ... 
  AND pkey3= ... 
  AND HOUR(pkey1) = 0 
ORDER BY pkey1 DESC

The index of this table is (pkey1,pkey2,pkey3) BTREE, and the column count is 20.

The ORDER BY used to be ASC and it was running quite slowly, I then changed it to DESC (since i will change this request for a pagination, if this cannot be optimized), and migrated from 5.7x to 8 to be able to recreate this table with a DESC index. It's now even slower.

The questions are :

  • Would changing the DATETIME to a DATE (and thus remove HOUR(0), no longer needed) resolve the problem in a satisfying enough way ? I don't need it anymore, especially if it causes problems.
  • If not, what can you advise me to do ?

To go further :

Allegedly because of this table, my whole system (weak: 2Gb RAM, Single core, 20GB SSD) is slowed down, and even a simple database connection is delayed by several seconds. If this table is not the case for the slow down (even though the triggering of this request seems to be the problem), here are the alarming variables i got from PHPmyadmin :

enter image description here

As you can see, things are probably wrong.

Thank you for the read !

Best Answer

Your query is not sargeable. Using Hour() function on the first column (pk1) of the composite primary key is inhibiting the usage of the Primary Key for index lookup; and thus (in absence of any other proper index), it is most likely doing full table scan. The general rule of thumb to follow while defining a working index is:

  1. First priority should be given to all the columns, which are inside the WHERE clause, and are connected by AND clause, and are compared to a constant value using either =, IS NULL, or <=>. So, in your query, there are two columns following this: Data_Type and Status.

  2. Second priority can be given to following three scenarios:

    • Column having Range condition.
    • Columns in a specific order in the GROUP BY clause (if existing).
    • Columns in a specific order in the ORDER BY clause (if existing)

Now, coming to your questions:

Would changing the DATETIME to a DATE (and thus remove HOUR(0), no longer needed) resolve the problem in a satisfying enough way ? I don't need it anymore, especially if it causes problems.

Yes, you should do that; if the time part is irrelevant to you. In MySQL 8+, the size of datetime is 5 bytes, while the size of date is 3 bytes. So you are effectively going to save considerable amount of space as well.

With this change, your new query would simply be:

SELECT * FROM table 
WHERE pkey2 = ... 
  AND pkey3= ... 
  -- removed AND HOUR(pk1) = 0 
ORDER BY pkey1 DESC

In this scenario, if you refer the thumbrule discussed above, you need to define a new Composite Index (pkey2, pkey3, pkey1) (the ordering is changed; you will need to keep this index besides the already defined Primary Key).

ALTER TABLE table_name ADD INDEX (pkey2, pkey3, pkey1);

This should be able to help you considerably.