Postgresql – Should I create an index for non key columns

indexpostgresql

I have a table in a PostGreSql database defined as following:

CREATE TABLE public."MATCH"(
    "ITEM_A_ID" bigint DEFAULT 0,
    "ITEM_B_ID" bigint DEFAULT 0,
    "OWNER_A_ID" bigint DEFAULT 0,
    "OWNER_B_ID" bigint DEFAULT 0,
    "OTHER_DATA" varchar(100) NOT NULL DEFAULT ''
    CONSTRAINT "MATCH_PK" PRIMARY KEY ("ITEM_A_ID","ITEM_B_ID")
);

It will contain a lot of rows. There will be a lot of queries like the following performed on this table:

SELECT * FROM "MATCH" WHERE "OWNER_A_ID" = owner_a_id;
SELECT * FROM "MATCH" WHERE "OWNER_B_ID" = owner_b_id;

I was thinking about creating indexes on OWNER_A_ID and OWNER_B_ID, since these columns are not keys. Is this a good idea, and if yes, how should I create these? Should I create one index with both columns? Should I create two indexes? Should I include other columns?

Best Answer

The selection of enough indexes is often difficult. In your case it should be useful to create two indexes.

You should only create one index with both columns if your query always include the first column as a condition:

SELECT * FROM "MATCH" WHERE "OWNER_A_ID" = owner_a_id [AND "OWNER_B_ID" = owner_b_id]

The whole B-Tree is built upon the order of columns in the index! You can't fully use a multi-column index on a, b on the following queries:

SELECT * FROM "MATCH" WHERE "OWNER_A_ID" = owner_a_id OR "OWNER_B_ID" = owner_b_id
SELECT * FROM "MATCH" WHERE "OWNER_A_ID" = "OWNER_B_ID"

If you're only using equality checks you might consider a hash index. But postgresql has some disadvantages you should check first.

On other dbms you should consider adding additional columns in the index as data. This would be useful if you query these specific columns and not * because the dbms wouldn't need to feed the data from the table after using the index.

An important factor: indices fragment over the time (unless you aren't performing any insert/update/delete on the table). Please check whether your dba has some optimization operations installed.

Pleae check the documentation for additional options like FILLFACTOR or partial indexes: http://www.postgresql.org/docs/9.3/static/sql-createindex.html