I have a PostgreSQL table with, among others, two columns named `col1` and `col2`, both of integer type (there are around 10M rows in the table). I want to perform SQL queries like:
```sql
SELECT * FROM table WHERE col1 >= val1 AND col2 <= val2;
```
(for certain `val1` and `val2` that I know at query time).
If I put btree indexes on `col1` and `col2`, PostgreSQL executes the query with an index scan on one of the two columns and then a filter on the other. This means that in most cases it has to sweep through around half of the table, even when the number of matching rows is very small. Adding a multicolumn index does not help, because PostgreSQL can use it effectively only when at least one of the two columns is tested for equality.
One important assumption I can make about the values, though, is that the two columns are monotonic with respect to each other: if `col1` in one row is greater than or equal to `col1` in another row, then the same relation holds between the two corresponding `col2` entries.
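The invariant can be checked directly; a sketch, assuming the table is named `tbl`. Ordering by `col1`, `col2` must never decrease, so this query lists violations and an empty result confirms the assumption:

```sql
SELECT col1, col2
FROM  (SELECT col1, col2,
              lag(col2) OVER (ORDER BY col1) AS prev_col2
       FROM   tbl) t
WHERE  col2 < prev_col2;
```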
This means that, in principle, the query execution could be sped up by performing an index scan on one of the two columns, filtering on the other, and stopping as soon as a non-matching value is found in the second column. In that case the query would read exactly the rows to be returned.
Is there any way to set up indexes, or declare some other invariant, in PostgreSQL so that the query planner is able to detect this?
(Of course the problem can easily be solved with two queries, the first one translating the inequality on `col2` into an inequality on `col1`; I am asking whether there is a way to avoid this workaround and let PostgreSQL manage the mess by itself.)
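The two-query workaround might look like this sketch (table name assumed): first translate the `col2` bound into a `col1` bound via the index, then run a plain `col1` range query:

```sql
-- Query 1: the largest col1 still satisfying col2 <= val2.
SELECT col1
FROM   tbl
WHERE  col2 <= val2
ORDER  BY col2 DESC
LIMIT  1;                      -- call the result max_col1

-- Query 2: a tight range scan on the col1 index.
SELECT *
FROM   tbl
WHERE  col1 >= val1
AND    col1 <= max_col1
AND    col2 <= val2;           -- kept as a safety net for ties
```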
Best Answer
In that case you can reformulate your query to look like:
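A sketch of the reformulation (the table name `tbl` is an assumption): the subquery uses the monotonicity to turn the `col1` lower bound into a `col2` lower bound, so the outer query becomes a tight range scan on the `col2` index:

```sql
SELECT *
FROM   tbl
WHERE  col1 >= val1
AND    col2 <= val2
AND    col2 >= (SELECT col2           -- col2 lower bound implied by val1
                FROM   tbl
                WHERE  col1 >= val1
                ORDER  BY col1
                LIMIT  1);
```

Keeping the original `col1 >= val1` predicate guards against ties in the mapping; the added `col2 >=` bound is what lets the planner read only the matching rows.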
because you can find the lower bound for `col2` from the lower bound for `col1`.

schema:
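A sketch of a self-contained test setup (all names and the data generator are assumptions; `col2` is made a simple monotonic function of `col1`):

```sql
CREATE TABLE tbl AS
SELECT g     AS col1,
       g * 2 AS col2
FROM   generate_series(1, 10000000) g;   -- ~10M rows, monotonic pair

CREATE INDEX tbl_col1_idx ON tbl (col1);
CREATE INDEX tbl_col2_idx ON tbl (col2);
ANALYZE tbl;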
method:
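A hedged sketch of the method against that setup (the specific bounds are made up for illustration):

```sql
EXPLAIN ANALYZE
SELECT *
FROM   tbl
WHERE  col1 >= 4999950
AND    col2 <= 10000100          -- with col2 = col1 * 2, this caps col1 at 5000050
AND    col2 >= (SELECT col2
                FROM   tbl
                WHERE  col1 >= 4999950
                ORDER  BY col1
                LIMIT  1);
```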
clean up:
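Assuming the table name used above:

```sql
DROP TABLE tbl;
```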