Table students has about 10 million records; ID is indexed.
Student grades has 20 million records – student_id is indexed.
I am querying about 20,000 students by their ids with a join between students and grades:
select * from students s left join grades g on s.id=g.student_id
where s.id IN (s1, s2... s1000)
or s.id IN (s1001, s1002... s2000)
or s.id IN (s2001, s2002... s3000)
-- ... and so on, until s20000
I need to split the IN lists into multiple batches because a single IN list can hold at most 1000 values.
The query takes about 5 minutes to return. Is there any way I can optimize it?
Thanks!
Best Answer
Data Comes from External Source
The IN clause is limited to 1000 values. Don't use it for such searches.
Workaround: load the ids into a global temporary table (GTT), then use a SELECT statement to JOIN against the GTT.
Example
Create the GTT
Insert values into the GTT
and JOIN it in the SELECT statement
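The three steps above could look roughly like the sketch below. It assumes Oracle (the 1000-value IN limit and GTTs suggest it) and a numeric id column. The table name student_ids_gtt and the bind variable :student_id are hypothetical, chosen only for illustration.

```sql
-- 1. Create the GTT once (hypothetical name; NUMBER id type is an assumption).
--    Rows are private to the session and are cleared at COMMIT.
CREATE GLOBAL TEMPORARY TABLE student_ids_gtt (
  id NUMBER PRIMARY KEY
) ON COMMIT DELETE ROWS;

-- 2. Insert the external ids, ideally with array/batch binding from the client
--    so that 20,000 ids do not cost 20,000 round trips.
INSERT INTO student_ids_gtt (id) VALUES (:student_id);

-- 3. Join against the GTT instead of using long IN lists.
--    Run this in the same transaction as the inserts (before COMMIT).
SELECT s.*, g.*
FROM student_ids_gtt t
JOIN students s ON s.id = t.id
LEFT JOIN grades g ON g.student_id = s.id;
```

Because the table is declared ON COMMIT DELETE ROWS, committing between the insert and the select would leave the GTT empty; ON COMMIT PRESERVE ROWS is the alternative if the ids need to survive for the whole session.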
Notes
Data from another SQL Statement?
Just include that SQL within the SELECT statement.
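As a minimal sketch of that case: the 1000-value limit applies to literal lists, not to a subquery inside IN. The enrollments table, its columns, and the :term bind variable are hypothetical stand-ins for whatever statement produces the ids.

```sql
-- The subquery (hypothetical source table) supplies the ids directly,
-- so no batching into 1000-value IN lists is needed.
SELECT s.*, g.*
FROM students s
LEFT JOIN grades g ON g.student_id = s.id
WHERE s.id IN (
  SELECT e.student_id
  FROM enrollments e
  WHERE e.term = :term
);
```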