MySQL – How to Improve Historical Data Query Performance

mariadb, performance, query-performance

I have a MySQL table that stores exchange rates between two currencies at a specific date and time. An external script updates the table every 15 seconds. The table is already rather large and will keep growing: since September 2017 we have gathered ~8 million rows, even though the early updates ran on a 2-minute interval.

The table's structure looks as follows:

 |---------------------------------------------------------|
 | RateId | RateTime | BaseCurrency | QuoteCurrency | Rate |
 |---------------------------------------------------------|

There are the following indexes in place on this table:

Keyname         Type    Unique  Packed  Column         Cardinality  Collation  Null
PRIMARY         BTREE   Yes     No      RateId         7534199      A          No
BaseCurrency    BTREE   No      No      BaseCurrency   5199         A          No
                                        QuoteCurrency  17938        A          No
QuoteCurrency   BTREE   No      No      QuoteCurrency  14           A          No
RateTime        BTREE   No      No      RateTime       753419       A          No

When I query current data, for example:

SELECT  *
    FROM  ExchangeRates
    WHERE  BaseCurrency = 'EUR'
      AND  QuoteCurrency = 'USD'
    ORDER BY  RateTime DESC
    LIMIT  0,1 

the query is very fast, around 0.0029 seconds.

When I try to query historical data, like:

SELECT  *
    FROM  `ExchangeTable`
    WHERE  CurrencyFrom = 'EUR'
      AND  CurrencyTo = 'USD'
      AND  ExchangeTime <  '2018-01-29 12:00:00'
    ORDER BY  ExchangeTime DESC
    LIMIT  0,1 

it is still fast, around 0.0008 seconds.

The problem starts when I query data that we no longer refresh but for which I still need historical information. For example, we stopped gathering the EUR->RON rate 6 months ago. If I run the following query:

SELECT  *
    FROM  `ExchangeTable`
    WHERE  CurrencyFrom = 'EUR'
      AND  CurrencyTo = 'RON'
      AND  ExchangeTime <  '2018-01-29 12:00:00'
    ORDER BY  ExchangeTime DESC
    LIMIT  0,1 

it takes ~9.5897 seconds to execute, and the row it finds is dated 2017-08-25 13:07:00.

Why is this query so slow while the other queries are fast? What could I do to improve this kind of query?

We are using MariaDB, with InnoDB storage engine.

Best Answer

WHERE  BaseCurrency = 'EUR'
  AND  QuoteCurrency = 'USD'
ORDER BY  RateTime DESC

needs

INDEX(BaseCurrency, QuoteCurrency, RateTime)

and

WHERE  CurrencyFrom = 'EUR'
  AND  CurrencyTo = 'USD'
  AND  ExchangeTime <  '2018-01-29 12:00:00'
ORDER BY  ExchangeTime DESC

needs

INDEX(CurrencyFrom, CurrencyTo, ExchangeTime)

Ditto for the third.
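
As a minimal sketch, the index could be added like this. It assumes the table really is named ExchangeRates with the BaseCurrency / QuoteCurrency / RateTime columns from the first query (substitute ExchangeTable / CurrencyFrom / CurrencyTo / ExchangeTime if those are the real names). The index name idx_pair_time is made up for illustration.

```sql
-- Composite index: the two equality columns first, then the range/sort column.
-- idx_pair_time is a hypothetical name; any name will do.
ALTER TABLE ExchangeRates
    ADD INDEX idx_pair_time (BaseCurrency, QuoteCurrency, RateTime);

-- Optional: the existing two-column index named BaseCurrency is a
-- left prefix of the new one, so it is probably redundant afterwards.
-- ALTER TABLE ExchangeRates DROP INDEX BaseCurrency;
```

With the equality columns first, all index entries for one currency pair sit together, ordered by RateTime, so ORDER BY RateTime DESC with LIMIT 1 can stop at the first entry below the cutoff.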

More discussion of how I came up with those indexes is in my Cookbook.

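The reason for the sudden slowdown is buried in some combination of the size of the table, the value of innodb_buffer_pool_size, the number of rows that your too-short index touched, etc. None of the existing indexes covers both currency columns and the time column, so for a pair such as EUR->RON, whose newest row is months older than the cutoff, the server probably has to read a great many rows before it finds a match.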
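
To see which of these factors applies, it may help to look at the plan for the slow query and at the buffer pool setting. This is only a sketch, using the table and column names from the slow query in the question:

```sql
-- Show which index the optimizer picks and its estimate of rows examined
EXPLAIN
SELECT  *
    FROM  `ExchangeTable`
    WHERE  CurrencyFrom = 'EUR'
      AND  CurrencyTo = 'RON'
      AND  ExchangeTime <  '2018-01-29 12:00:00'
    ORDER BY  ExchangeTime DESC
    LIMIT  0,1;

-- Current buffer pool size, in bytes
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```

In the EXPLAIN output, the key column names the index that was chosen and rows is the optimizer's estimate of how many rows it will examine. "Using filesort" in the Extra column means the ORDER BY is not being served by the index.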

My indexes will be fast, regardless of any of those issues (for those queries).