Mysql – Why is this MySQL query taking so long

MySQLmysql-8.0

I'm looking to do a relatively simply filter from one table called prices.

prices is a very large table (~2G records). Sample data, query and indexes included below. I realize the table being queried is very large, but for a query this simple I would have expected better performance (currently ~5 mins and running). I did notice InnoDB Buffer Usage appears to be at 100% per MySQL Workbench, with ~6K InnoDB reads per second. Guidance on adjustments to make to correct this.

prices

dataDate     ticker  optionSymbol    expDate     type    price   strike  last    bid     ask     volume  OI
2002-02-08   AAPL    AAQ020216C00005000 2002-02-16   call   24.03   5   0   18.8    19.1    0   0
2002-02-08   AAPL    AAQ020216P00005000 2002-02-16   put    24.03   5   0   0   0.05    0   0
2002-02-08   AAPL    AAQ020216C00007500 2002-02-16   call   24.03   7.5 0   16.3    16.6    0   0
2002-02-08   AAPL    AAQ020216P00007500 2002-02-16   put    24.03   7.5 0   0   0.05    0   0
2002-02-08   AAPL    AAQ020216C00010000 2002-02-16   call   24.03   10  12.2    13.9    14.2    0   1
2002-02-08   AAPL    AAQ020216P00010000 2002-02-16   put    24.03   10  0   0   0.05    0   0
2002-02-08   AAPL    AAQ020216C00012500 2002-02-16   call   24.03   12.5    13.5    11.4    11.7    0   8
2002-02-08   AAPL    AAQ020216P00012500 2002-02-16   put    24.03   12.5    0.05    0   0.05    0   50
2002-02-08   AAPL    AAQ020216C00015000 2002-02-16   call   24.03   15  7.1 8.9 9.1 0   10
2002-02-08   AAPL    AAQ020216P00015000 2002-02-16   put    24.03   15  0.1 0   0.05    0   30
2002-02-08   AAPL    AAQ020216C00017500 2002-02-16   call   24.03   17.5    5.5 6.4 6.7 0   371
2002-02-08   AAPL    AAQ020216P00017500 2002-02-16   put    24.03   17.5    0.05    0   0.05    0   147
2002-02-08   AAPL    AAQ020216C00020000 2002-02-16   call   24.03   20  3.9 3.9 4.1 7   1064
2002-02-08   AAPL    AAQ020216P00020000 2002-02-16   put    24.03   20  0.1 0   0.1 5   1448
2002-02-08   AAPL    AAQ020216C00022500 2002-02-16   call   24.03   22.5    1.7 1.7 1.75    1551    7069
2002-02-08   AAPL    AAQ020216P00022500 2002-02-16   put    24.03   22.5    0.2 0.15    0.25    136 3234
2002-02-08   AAPL    AAQ020216C00025000 2002-02-16   call   24.03   25  0.3 0.1 0.35    105 4237
2002-02-08   AAPL    AAQ020216P00025000 2002-02-16   put    24.03   25  1.25    1.2 1.35    629 589
2002-02-08   AAPL    AAQ020216C00027500 2002-02-16   call   24.03   27.5    0.05    0   0.1 0   1097

Query

select *
from op.prices op
where ticker = 'AAPL'
and '2020-04-30' between date_add(expDate, INTERVAL 3 MONTH) and expDate
and '2020-04-30' = date_add(op.dataDate, INTERVAL 14 DAY);

Indexes
enter image description here

Best Answer

Instead of

'2020-04-30' = date_add(op.dataDate, INTERVAL 14 DAY);

Use

op.dataDate = date_sub('2020-04-30', INTERVAL 14 DAY);

Your first statement will be interpreted as "add 14 days to all dataDate and return when that is 2020-04-30." This will require a full scan of the table.

The second statement will evaluate to: "return records where the dataDate is 2020-04-16." This allows the engine to perform a seek on your index that begins with dataDate.

Do whatever weird stuff you want to do to expDate since that won't factor much into how the query engine will optimize.