Why does a SELECT with no WHERE cause a full table scan (or does it)?

optimization, sqlite

I have the following table with ~2,000,000 rows:

CREATE TABLE TableA (idA INTEGER PRIMARY KEY, idB TEXT UNIQUE)

From time to time, I need to process every row; the order does not matter.

So I process 200 rows at a time (inside a transaction):

SELECT idA, idB FROM TableA ORDER BY ROWID ASC LIMIT 200

But when I checked the query with EXPLAIN QUERY PLAN, I noticed that it was doing a table scan:

0    0    0    SCAN TABLE TableA USING INTEGER PRIMARY KEY (~1000000 rows)

I am especially curious about the ~1000000 rows estimate.

It does not matter whether I select 1 row or 1000 rows; I still get a table scan over ~1 million rows.

Is this not the most efficient way of querying the table for 'any' data?

As I said, the order does not matter; all the data has to be processed.

Edit: If I run the query with no ORDER BY,

SELECT * FROM TableA LIMIT 200

I get a fairly similar result:

SCAN TABLE TableA USING COVERING INDEX sqlite_autoindex_TableA_1 (~1000000 rows)

Best Answer

EXPLAIN QUERY PLAN shows only an estimate, and the LIMIT clause is ignored when doing this estimation. (And this number is so misleading that recent versions of SQLite no longer show it.)
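For reference, newer SQLite versions print the plan as a short tree and omit the row estimate entirely; the exact wording varies by version, but the same query produces output roughly like this:

QUERY PLAN
`--SCAN TableA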

A plain SELECT * FROM TableA LIMIT x is the fastest way to get x rows from a table. SQLite computes result rows on demand, so it does not read the entire table; it stops scanning after x rows.
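If the goal is to work through the whole table in batches of 200, a common pattern is to seek past the previous batch instead of restarting the scan each time. This is a sketch, assuming the processing code remembers the largest idA of the previous batch; the parameter name :lastSeenIdA is hypothetical:

-- Start with :lastSeenIdA = 0 (or anything below the smallest idA);
-- after each batch, feed its largest idA back in as the new :lastSeenIdA.
-- Since idA is the INTEGER PRIMARY KEY (an alias for ROWID), the
-- WHERE clause is a B-tree seek, not a scan from the start of the table.
SELECT idA, idB FROM TableA WHERE idA > :lastSeenIdA ORDER BY idA LIMIT 200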

Table scans are likely to be inefficient when you are searching for a single row or a few rows, or when you have a filter that greatly reduces the number of rows to be returned. But when the query is going to return all rows anyway, it is not possible to reduce the number of rows that actually need to be read from disk, so the best way to reduce disk I/O is to avoid touching anything but the table itself. (In this case, a covering index can replace a table.)
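That is exactly what the second plan in the question shows: the automatic UNIQUE index on idB stores (idB, rowid), i.e. (idB, idA), so for this two-column table the index covers every selected column and SQLite never has to touch the table itself:

-- Both columns can be read from the index alone, so the plan reports
-- a covering-index scan (output from the question's SQLite version):
EXPLAIN QUERY PLAN SELECT idA, idB FROM TableA
-- SCAN TABLE TableA USING COVERING INDEX sqlite_autoindex_TableA_1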