Postgresql – Is it a good idea to index timestamp in postgresql

postgresql

Here is my table definition

-- Table: public.my_table

-- DROP TABLE public.my_table;

CREATE TABLE public.my_table
(
    my_row_id text COLLATE pg_catalog."default" NOT NULL,
    my_item_id text COLLATE pg_catalog."default" NOT NULL,
    my_timestamp timestamp with time zone NOT NULL DEFAULT now(),
    --  ...... some other data
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.my_table
    OWNER to my_db;

The cardinality of the table is about 10-20 million rows

Here is the query I want to speed up

select my_row_id, my_item_id
from my_table
where my_timestamp > timestamp '2018-09-13 00:00:00'
order by my_timestamp desc

I am tempted to just index the timestamp, but I worry that since the precision of timestamp is millisecond, this means the the cardinality of the index itself will be almost as large as the cardinality of the table (eg. the number of rows will be equal to the number of index entry)

Should I have custom index that index with the precision of a day?

Like store an extra column text 2018-08-17 from timestamp with timezone 2018-08-17 07:06:43.508778

And index the text instead?

Best Answer

For the query you give, the optimal general index would be on (my_timestamp, my_row_id, my_item_id), with the order of the last two columns possibly switched. This will allow an index-only scan to be used which possibly doesn't visit the table at all, since all columns you reference would be available in the index. How much of a performance improvement this would give you depends on how much of your table meets the timestamp WHERE condition, and how well-vacuumed you keep the table. You might do even better with a partial index if the timestamp hardcoded into your query never changes, or changes in a very rote manner.

Like store an extra column text 2018-08-17 from timestamp with timezone 2018-08-17 07:06:43.508778

There is little reason to think that this would be helpful in general, and certainly wouldn't help with the query you show (you would have to rewrite the query before it could use such an index, and then what happens if the timestamp is exactly '2018-09-13 00:00:00' ?)