I have two tables to JOIN based on BETWEEN condition.
Table 1
is small is small around 1500 records and Table 2
is of 40 millions records. Table1 is having only one column with the datatype bigint
and Table2
with 8 columns. I need to do join between these two tables on BETWEEN condition.
I tried following but its getting slow for just 1 record in Table1
and 40 millions in Table2
.
Query:
SELECT t1.cola AS [InputValue],t2.cola,t2.colb,t2.colc,t2.cold,t2.code
FROM table2 t2
INNER JOIN table1 t1 ON t1.cola BETWEEN t2.cola AND t2.colb ;
Indexing:
CREATE NONCLUSTERED INDEX NCIX_Table1_Cola ON table1(cola)
CREATE NONCLUSTERED INDEX NCIX_Table2_Col_a_b ON table2(cola,colb)
Above query took 30 seconds for just 1 record in table1
and 40 millions in table2
. As said I will be getting more than 1500 records in table1
will getting more slower. Any alternative of between or proper indexing need to be done?
Edit: Added sample data.
Table1:
cola
---------------
12
145
34
90
88990
987611
55
...
..
......1500 rows
Table2:
cola colb colc cold cole
-------------------------------------
0 10 c1 d1 e1
11 20 c2 d2 e2
21 40 c3 d3 e3
41 60 c4 d4 e4
61 100 c5 d5 e5
101 1000 c6 d6 e6
1001 10000 c7 d7 e7
10001 200000 c8 d8 e8
......
......40 millions records
Expected result:
InputValue cola colb colc cold cole
--------------------------------------------------
12 11 20 c2 d2 e2
145 101 1000 c6 d6 e6
34 21 40 c3 d3 e3
.....
SQL Fiddle : Link
Best Answer
I've faced similar issue - the thing is sql doesn't "know" that cola and colb is range and cola of next row will always be bigger than current row colb, so between won't help too much: when it will find first cola that matches between, it will continue checking others also. I would suggest query that uses cross apply to find
max(table2.cola) <= table1.cola
and then to verify addwhere table2.colb >= table1.cola
Something like this