Mariadb – 2 fast SQL queries get slow when bunched as subquery on a very large table

mariadbsubquery

I'm trying to use a subquery to speed up a query on a very large table right now 30 million row it most likely going to end up close to 1 billion rows.

My table lnmntr is as follow

  • Lnmntr_ndx: auto increment index,
  • Val: Numeric,
  • T_stamp: Time_Stamp indexed,
  • Row_id: Numeric

So my main query looks like this. This basically execute in no time. It select the 500 row following an autoincrement index

SELECT * 
FROM CLD_DB.lnmntr 
WHERE lnmntr_ndx >= 18411001 
ORDER BY lnmntr_ndx ASC 
LIMIT 500;

My subquery is used to find 1 row with the correct val. I've added a between timestamp here because I know around what time I'll find the row I need and since it's indexed it speed up thing. This query executes in about .4 seconds

SELECT Lnmntr_ndx 
FROM CLD_DB.lnmntr 
WHERE (t_Stamp BETWEEN '2016-06-05 16:22:04' AND '2016-06-05 16:23:04') 
  AND (Row_id IN (0,500,1000,1500,2000,2500,3000,3500,4000,4500,5000,
                  5500,6000,6500,7000,7500,8000,8500,9000,9500) 
  AND Val = 49092400); 

but when I bunch the 2 queries together, I end up with a 20 second query

SELECT * 
FROM CLD_DB.lnmntr 
WHERE lnmntr_ndx >= (SELECT Lnmntr_ndx 
                     FROM CLD_DB.lnmntr
                     WHERE (t_Stamp BETWEEN '2016-06-05 16:22:04' AND '2016-06-05 16:23:04') 
                       AND (Row_id IN (0,500,1000,1500,2000,2500,3000,3500,4000,4500,5000,
                        5500,6000,6500,7000,7500,8000,8500,9000,9500) 
                       AND Val = 49092400)
                     )
ORDER BY lnmntr_ndx ASC 
LIMIT 500;

I could run the 2 queries back to back to speed up thing but it would be much easier to run a single query. And id really like to know why they get this slow when bunched together

Best Answer

I'm not an expert on mysql/mariadb, but I suspect that it is running the sub-query one for every row in CLD_DB.lnmntr, or at least every row it needs to until it finds 500 that match the condition (which might just be 500 or might be far more depending on your data). Some SQL engines are bright enough to see that the inner query is invariant within the context of the current statement and only run it once, but IIRC mariadb's query planner doesn't do that sort of thing.

Assuming you have an index on CLD_DB.lnmntr your first query will result in a seek to the first point where lnmntr_ndx >= 18411001 then an ordered partial scan until 500 rows are found (or it hits the end). Even if the inner query were pretty much instantaneous so there would be very little difference between running it once and running it some hundreds or thousands of times, if it needs to be run for each row then you are forcing it to scan down the index from the beginning until 500 rows after the point where lnmntr_ndx is greater than the value.

So it could be either or both of these factors. To confirm which use EXPLAIN to look at the query plan, as suggested by Nick, and edit your question to add the resulting information. It would also be useful to provide the relevant table definitions (columns and their types, and the indexes & keys).