Performance: order of tables in joined statement

join;performancesqlite

I have the following SQL statement, running on a SQLite database on a Windows mobile device.

SELECT 
    table1.uniqueidentifier1, table1.int1, table1.varchar1, 
    table1.decimal1, table1.decimal2 
FROM table1
INNER JOIN table2 On table1.PK = table2.FK
WHERE table2.uniqueidentifier2 IN (uniqueidentifier1,uniqueidentifier2,....)
ORDER BY table1.varchar1

As there are several hundred thousand records in each table and the device isn't really new this takes some time.

Would the performance be better, if I switched the tables, somehow like this:

SELECT 
    table1.uniqueidentifier1, table1.int1, table1.varchar1, 
    table1.decimal1, table1.decimal2 
FROM table2
INNER JOIN table1 On table1.PK = table2.FK
WHERE table2.uniqueidentifier2 IN (uniqueidentifier1,uniqueidentifier2,....)
ORDER BY table1.varchar1

Please note: in the first statement I select from table 1` and join table 2, in the second, it is switched.

Why or why not is it faster?

Best Answer

SQLite automatically chooses the estimated optimal join order; the table order in the query has no effect.

There are two ways to optimize this query. If the filter on uniqueidentifier2 removes most table1 records from the result, then it would be fastest to look up table2 records with matching uniqueidentifier2 values first, then to look up the corresponding table1 records, and then to sort the result. This would require the following indexes:

CREATE INDEX t2_uid2_idx ON table2(uniqueidentifier2);
-- CREATE INDEX t1_pk ON table1(PK);  -- primary keys have this automatically

If most table1 records will show up in the result, then it would be more efficient to go through table1 in the proper order and look up correspondig table2 records. This would require the following indexes:

CREATE INDEX t1_vc1_idx ON table1(varchar1);
CREATE INDEX t2_FK_uid2_idx ON table2(FK, uniqueidentifier2);

(Having uniqueidentifier2 in the second index optimizes for this particular query, but might be not worth the storage and update overhead if you have many other queries.)


To check how queries are actually implemented, execute EXPLAIN QUERY PLAN.