Suppose we create a partial index in Postgres that avoids useless NULL data:
CREATE INDEX my_ix ON my (col1) WHERE col1 IS NOT NULL;
Should I write the SELECT query as:
SELECT * FROM my WHERE col1 = 'abc';
or as:
SELECT * FROM my WHERE col1 = 'abc' AND col1 IS NOT NULL;
In case of JDBC connection libraries it may look like this:
SELECT * FROM my WHERE col1 = ?;
and I am not sure if Postgres can infer index applicability without:
AND col1 IS NOT NULL
Also it is interesting if Postgres can infer transitivity, for index:
CREATE INDEX my_ix ON my (col1) WHERE col1 > 0;
I expect I don't need to write:
SELECT * FROM my WHERE col1 > ? AND col1 > 0;
if parameters are 10, 23, etc (> 0).
Best Answer
You can easily try it out with the following script:
Here is an SQL fiddle that tuns the code.
Explanations:
With a prepared statement, PostgreSQL can cache query plans for the database session. Using such a generic plan that stays the same regardless of the parameter values has the advantage that plan time is saved.
But PostgreSQL doesn't always use a generic plan, because sometimes generating a custom plan that respects the parameter value can be better.
To decide what to do, PostgreSQL uses the following heuristic:
The first five executions will always use a custom plan.
If the estimated cost of the custom plans is not cheaper than the estimated cost of the generic plan, PostgreSQL will use the generic plan from the sixth execution on.
From PostgreSQL v12 on, you can configure the behavior with the
plan_cache_mode
configuration parameter.Why does the first statement use a generic plan?
For the first statement, an index scan can always be used, even if the argument is
NULL
(because nothing has to be done in this case) . So PostgreSQL will use the generic plan after the fifth execution.You can recognize the generic plan from the
$1
in theEXPLAIN
output.Why does the second statement continue to use a custom plan?
With a generic plan, PostgreSQL can only use a sequential scan, because the index can only be used for some parameter values. For some of the first five executions the custom plan is significantly cheaper than the generic plan, so PostgreSQL continues using custom plans.
To answer your question:
If it is necessary to add an extra
WHERE
condition that makes sure that PostgreSQL knows it can use an index depends on the individual case.You can add the extra condition without a disadvantage, it might help the optimizer and cannot do harm.
But even with the extra condition you cannot be certain that the index will be used.