Answer
Since you refer to the website use-the-index-luke.com
, consider the chapter:
Use The Index, Luke › The Where Clause › Searching For Ranges › Greater, Less and BETWEEN
It has an example that matches your situation perfectly (two-column index, one is tested for equality, the other for range), explains (with more of those nice index graphics) why @ypercube's advice is accurate and sums it up:
Rule of thumb: index for equality first — then for ranges.
Also good for just one column?
What to do for queries on just one column seems to be clear. More details and benchmarks concerning that under these related question:
Less selective column first?
Apart from that, what if you have only equality conditions for both columns?
It doesn't matter. Put the column first that is more likely to receive conditions of its own, which actually matters.
Consider this demo, or reproduce it yourself. I create a simple table of two columns with 100k rows. One with very few, the other one with lots of distinct values:
CREATE TEMP TABLE t AS
SELECT (random() * 10000)::int AS lots
, (random() * 4)::int AS few
FROM generate_series (1, 100000);
DELETE FROM t WHERE random() > 0.9; -- create some dead tuples, more "real-life"
ANALYZE t;
SELECT count(distinct lots) -- 9999
, count(distinct few) -- 5
FROM t;
Query:
SELECT *
FROM t
WHERE lots = 2345
AND few = 2;
EXPLAIN ANALYZE
output (Best of 10 to exclude caching effects):
Seq Scan on t (cost=0.00..5840.84 rows=2 width=8)
(actual time=5.646..15.535 rows=2 loops=1)
Filter: ((lots = 2345) AND (few = 2))
Buffers: local hit=443
Total runtime: 15.557 ms
Add index, retest:
CREATE INDEX t_lf_idx ON t(lots, few);
Index Scan using t_lf_idx on t (cost=0.00..3.76 rows=2 width=8)
(actual time=0.008..0.011 rows=2 loops=1)
Index Cond: ((lots = 2345) AND (few = 2))
Buffers: local hit=4
Total runtime: 0.027 ms
Add other index, retest:
DROP INDEX t_lf_idx;
CREATE INDEX t_fl_idx ON t(few, lots);
Index Scan using t_fl_idx on t (cost=0.00..3.74 rows=2 width=8)
(actual time=0.007..0.011 rows=2 loops=1)
Index Cond: ((few = 2) AND (lots = 2345))
Buffers: local hit=4
Total runtime: 0.027 ms
The expression lower(name::text)
is not covered by your index, which also does not seem to be useful at all, at least for what we see in your question:
CREATE INDEX crm_customer_big_index
ON crm_customer (name ASC,
name text_pattern_ops,
(id::text) text_pattern_ops,
phone text_pattern_ops,
mail text_pattern_ops,
rating);
This index would cover your query:
CREATE INDEX crm_customer_foo_index ON crm_customer (
rating
, lower(name::text) text_pattern_ops
, address_id); -- for the join
To optimize, I suggest to adapt your query additionally:
SELECT *
FROM customer_list
WHERE lower(name::text) LIKE 'env%'
AND rating = 3
ORDER BY lower(name::text), name
LIMIT 20;
Also, if you are only interested in the first few characters of name character varying(100)
, make the index shorter and faster by just indexing those first few characters:
CREATE INDEX crm_customer_foo_index ON crm_customer (
rating
, lower(left(name::text, 7)) text_pattern_ops -- example with local optimum
, address_id); -- for the join
The query must match the index expression:
WHERE lower(left(name::text, 7)) LIKE 'env%'
...
ORDER BY lower(left(name::text, 7)), name
Not sure if the view is in the way somehow. But it should be transparent.
I wouldn't use varchar(n)
at all. Just text
:
About pattern matching and text_pattern_ops
:
Why did I put rating
first in the index?
There are query techniques to better capitalize on non-leading index columns. Consider the Postgres Wiki on "Loose indexscan".
Best Answer
You need an extra pair of parentheses around the indexed term if it is not a column or a function call.
In the words of the documentation:
I think it is a bad idea to use a composite type if you want to use parts of it in a
WHERE
or join condition. It violates the first normal form.