I have a couple of questions regarding working of indexes in PostgreSQL.
I have a Friends
table with the following index:
Friends ( user_id1 ,user_id2)
user_id1
and user_id2
are foreign keys to user
table
-
Are these equivalent? If not then why?
Index(user_id1,user_id2) and Index(user_id2,user_id1)
-
If I create Primary Key(user_id1,user_id2), does it automatically create indexes for it and
If the indexes in the first question are not equivalent, then which index is created on above primary key command?
Best Answer
This answer is about (default) B-tree indexes. See this later, related answer about GiST, GIN etc.:
Here are the results of querying a table on the second column of a multicolumn index.
The effects are easy to reproduce for anybody. Try it at home.
I tested with PostgreSQL 9.0.5 on Debian using a medium sized table of a real-life database with 23322 rows. It implements the n:m relationship between the tables
adr
(address) andatt
(attribute), but that's not relevant here. Simplified schema:The
UNIQUE
constraint effectively implements a unique index. I repeated the test with a plain index to be sure and got identical results as expected.The table is clustered on the
adratt_uni
index and before the test I ran:Sequential scans for queries on
(adr_id, att_id)
are as fast as they can possibly be. The multicolumn index can still be used for a query condition on the second index column alone.I ran the queries a couple of times to populate the cache and the picked the best of ten runs to get comparable results.
1. Query using both columns
Output of
EXPLAIN ANALYZE
:2. Query using first column
Output of
EXPLAIN ANALYZE
:3. Query using second column
Output of
EXPLAIN ANALYZE
:4. Disable indexscan & bitmapscan
Output of EXPLAIN ANALYZE:
Output of
EXPLAIN ANALYZE
:Conclusion
As expected, the multi-column index is used for a query on the second column alone.
As expected, it is less efficient, but the query is still 3x faster than without the index.
After disabling index scans, the query planner chooses a bitmap heap scan, which performs almost as fast. Only after disabling that, too, it falls back to a sequential scan.
See other answer for the original quote from the manual.
Updates since Postgres 9.0
Everything still basically true in Postgres 13. Most notable changes:
INCLUDE
keyword in Postgres 11All in favor of index performance. (Sequential scans got faster, too, though.)