SQLite3 not using covering index with json_extract expression

execution-planindexoptimizationsqlite

I am attempting to create an index in SQLite3 (3.18) using json_extract expressions. My aim is to execute queries that only require the index to yield results. The reason for this is that json_extract is an expensive operation which would hinder performance when operating on larger data sets and/or values. I concluded I need a covering index to suit my needs.

Step 1 – Testing the theory using a normal table structure

CREATE TABLE Player (
    Id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    FirstName TEXT NOT NULL,
    MiddleName TEXT,
    LastName TEXT NOT NULL
);

CREATE INDEX Player_FirstName ON Player (
    FirstName ASC,
    LastName ASC
);

EXPLAIN QUERY PLAN SELECT
    FirstName, LastName
FROM
    Player
WHERE
    LENGTH(LastName) > 10
ORDER BY
    FirstName
LIMIT
    10
OFFSET
    0

Yields

SCAN TABLE Player USING COVERING INDEX Player_FirstName

This is exactly what I expect. The query planner figured that the Player_FirstName index is appropriate due to the ORDER BY clause, and since the WHERE statement operates only on a value that is also in that index, it doesn't need to read the table. Finally, the SELECT statement includes only the indexed columns therefore resulting in a query that doesn't touch the table at all.

Step 2 – Testing the theory with an extract expression

CREATE TABLE PlayerJ (
    Id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    Data TEXT NOT NULL
);

CREATE INDEX PlayerJ_FirstName ON PlayerJ (
    JSON_EXTRACT(Data, '$.FirstName') ASC,
    JSON_EXTRACT(Data, '$.LastName') ASC
);

EXPLAIN QUERY PLAN SELECT
    JSON_EXTRACT(Data, '$.FirstName') AS FirstName,
    JSON_EXTRACT(Data, '$.LastName') AS LastName
FROM
    PlayerJ
WHERE
    LENGTH(LastName) > 10
ORDER BY
    FirstName
LIMIT
    10
OFFSET
    0

Yields

SCAN TABLE PlayerJ USING INDEX PlayerJ_FirstName

This is not what I expected. The query planner seems to have figured out that the ORDER BY clause is on JSON_EXTRACT(Data, '$.FirstName'), and thus seems to have selected the appropriate index. But that is where my reasoning ends abruptly. What is going on here? I would have expected the query planner to figure out that this is the same as the previous test, and the index would be used as a covering index. But it doesn't.

Why not? And how can this second test be modified so that it runs only against the index?

Best Answer

The documentation says:

The SQLite query planner will consider using an index on an expression when the expression that is indexed appears in the WHERE clause or in the ORDER BY clause of a query.

So expressions in the SELECT clause will not use the expression index.

Using a covering index is not as much an improvement over using a normal index for searching/sorting as using a normal index would be over using no index at all, so this optimization has not (yet?) been implemented for expression indexes.