PostgreSQL – Why does Postgres use a custom index instead of the default one on the primary key?

execution-plan, index, postgresql

I have a table called alpha with many columns; the primary key is just alpha_id. I also have an index (super_index) on that table that covers alpha_id and status, which is just another column.

I run a query against it,

EXPLAIN SELECT alpha_id,
       name,
       descr,
       data,
       to_char(past_date, 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"'),
       status,
       failure
FROM alpha
WHERE alpha_id = 'example_value';

Postgres automatically created an index for the primary key,

pk_alpha

and my only condition is on that key, yet this index is never used; super_index is always chosen instead. Here is the output of the EXPLAIN command:

Index Scan using super_index on alpha  (cost=0.14..8.16 rows=1 width=798)      
Index Cond: ((alpha_id)::text = 'example_value'::text)

I have read that all the decisions are based on statistics, but I do not understand the logic behind this one.

Additional info

The table and super_index are actually created using Liquibase, but this is what I get using pgAdmin,

CREATE TABLE public.alpha
(
    alpha_id character varying(48) COLLATE pg_catalog."default" NOT NULL,
    name character varying(255) COLLATE pg_catalog."default",
    descr text COLLATE pg_catalog."default",
    data text COLLATE pg_catalog."default",
    past_date timestamp without time zone NOT NULL,
    status alpha_status NOT NULL DEFAULT 'Suspended'::alpha_status,
    percentage_complete double precision NOT NULL DEFAULT 0.00,
    failure text COLLATE pg_catalog."default",
    CONSTRAINT pk_alpha PRIMARY KEY (alpha_id)
);

CREATE INDEX super_index
    ON public.alpha USING btree
    (alpha_id COLLATE pg_catalog."default", status)
    TABLESPACE pg_default;

pk_alpha is the index that Postgres creates by default for the primary key, but it is not listed under Indexes in pgAdmin, so I do not know how to retrieve its definition.
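A minimal sketch of one way to look the definition up, assuming the table lives in the public schema as in the DDL above: the pg_indexes system view lists every index on a table together with the statement that would recreate it, including the index that backs a primary key constraint.

```sql
-- List all indexes on public.alpha, including the one behind the
-- primary key constraint, together with their definitions.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
  AND tablename = 'alpha';
```

In psql, \d alpha shows the same indexes at the bottom of the table description.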

Best Answer

Because of padding for memory alignment, adding a small column to an index often takes up no extra room. The index might even be smaller if it was built more recently and is therefore more densely packed. Even if it is not smaller, the cost estimate for looking up a single row depends only very weakly on index size, so the estimated costs of the two indexes are likely to tie. Ties between indexes are broken arbitrarily, and in practice the most recently created index usually seems to be the one chosen.
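One way to check this reasoning is sketched below. It assumes the index names from the question (pk_alpha and super_index) in the public schema, and that you are working on a test system. It compares the on-disk sizes of the two indexes, then hides super_index inside a transaction to see the cost the planner estimates when only pk_alpha is available.

```sql
-- Compare the on-disk size of the two candidate indexes.
SELECT pg_size_pretty(pg_relation_size('public.pk_alpha'))    AS pk_alpha_size,
       pg_size_pretty(pg_relation_size('public.super_index')) AS super_index_size;

-- Temporarily drop super_index to see the plan and cost with pk_alpha.
-- DROP INDEX holds an exclusive lock on the table until the transaction
-- ends, so avoid running this on a busy production system.
BEGIN;
DROP INDEX public.super_index;
EXPLAIN SELECT alpha_id, status
FROM public.alpha
WHERE alpha_id = 'example_value';
ROLLBACK;  -- the index is restored; nothing changes permanently
```

If the cost reported here matches the cost shown for super_index in the original plan, the planner was indeed choosing between two equally priced options.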

Related Solutions

MySQL – Are two indexes needed

An index can seek by a subset of characters, as long as you're searching from the left. E.g., "Inter%" can seek, "%net" will not.

However, the first character is not necessarily the character under which the article would be sorted. "The Internet" should go under "I", not "T". You probably need two fields, DisplayTitle and SortTitle; a single-character index on the latter may be worthwhile, but most likely a full-length index will be just fine.

Indexes are typically B-trees, and a seek will jump to the right location about equally quickly whether you have 10 or 100 entries per page. Scans are another matter, but I'd start with the simplest solution and add an extra index only if performance proves inadequate in practice.
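The prefix-seek behaviour and the DisplayTitle/SortTitle split described above can be sketched as follows. The articles table and the index name idx_sort_title are hypothetical, chosen only for illustration. Comparing the EXPLAIN output of the two queries on realistic data should show which pattern can use the index for a range seek.

```sql
-- Hypothetical table: DisplayTitle is what users see, SortTitle is the
-- value used for ordering and prefix searches (e.g. 'Internet, The').
CREATE TABLE articles (
    id           INT AUTO_INCREMENT PRIMARY KEY,
    DisplayTitle VARCHAR(255) NOT NULL,
    SortTitle    VARCHAR(255) NOT NULL,
    INDEX idx_sort_title (SortTitle)
);

-- Left-anchored pattern: the index can be used to seek to the prefix.
EXPLAIN SELECT id, DisplayTitle FROM articles WHERE SortTitle LIKE 'Inter%';

-- Leading wildcard: no prefix to seek on, so a seek is not possible.
EXPLAIN SELECT id, DisplayTitle FROM articles WHERE SortTitle LIKE '%net';
```

The actual plans depend on the data and on the optimizer's statistics, so treat the comments as the expected tendency rather than a guarantee.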

MySQL – InnoDB – Use combined index with primary key on GROUP BY

Because you select columns only from table_1, and because table_2 appears only on the right side of the LEFT JOIN with none of its columns used elsewhere (in the SELECT, WHERE, GROUP BY or ORDER BY clauses), you can remove the LEFT JOIN table_2 t2 ON t1.id = t2.refid part entirely.

Removing the join could have produced fewer rows, since the join may repeat a table_1 row once for each match in table_2. But because you group by the primary key of table_1 after the join, those repeats are collapsed anyway, so the result is the same.

SELECT t1.a 
FROM table_1 t1 
WHERE t1.b = 99
GROUP BY t1.id
ORDER BY t1.c ;

Now, because you group by the Primary key of table_1, which is redundant when you have only one table, you can also remove that part: GROUP BY t1.id

Finally, the query is equivalent to:

SELECT t1.a 
FROM table_1 t1 
WHERE t1.b = 99
ORDER BY t1.c ;

which should use the index on (b, c), which yes, includes the primary key. You may consider it to be (b, c, id) if you want.

But it is not 100% certain that this index will be used. Depending on the selectivity of column b (what percentage of the whole table has b=99?), the index may or may not be used. If a large percentage of rows have b=99, the optimizer may choose to scan the whole table and do a filesort rather than use the index, select the (99,c) combinations that exist and then hit the table to find the a values.
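To get a rough idea of that selectivity, a query along these lines can help. It is only a sketch using the table and column names from the question, and it relies on MySQL evaluating a comparison to 1 or 0.

```sql
-- Fraction of table_1 rows with b = 99. In MySQL the comparison
-- evaluates to 1 (true) or 0 (false), so SUM counts the matching rows.
SELECT COUNT(*)               AS total_rows,
       SUM(b = 99)            AS matching_rows,
       SUM(b = 99) / COUNT(*) AS fraction_matching
FROM table_1;
```

A small fraction makes the (b, c) index more attractive; a large one makes a full scan plus filesort more likely. The exact threshold is up to the optimizer.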

If you have an index on (b, c, a) or on (b, c, id, a), the query will be able to find all the info that is needed in this index and in the correct order, so it will use it.
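A sketch of the covering-index variant, with a hypothetical index name (idx_b_c_a) and the table and columns from the question:

```sql
-- Covering index: the filter column first, then the sort column, then
-- the selected column, so the query can be answered from the index alone.
CREATE INDEX idx_b_c_a ON table_1 (b, c, a);

EXPLAIN SELECT t1.a
FROM table_1 t1
WHERE t1.b = 99
ORDER BY t1.c;
```

When the index covers the query, MySQL's EXPLAIN typically reports "Using index" in the Extra column, and no filesort should be needed because the rows for b = 99 are already ordered by c inside the index.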
