Postgresql – Pre Caching Index on a large table in PostgrSQL

cacheindexpostgresqlpostgresql-12

I have a table with about 10mln rows in it with a primary key and an index defined on it:

create table test.test_table(
    date_info date not null,
    string_data varchar(64) not null,
    data bigint
    primary key(date_info, string_data));
    create index test_table_idx 
    on test.test_table(string_data);

I have a query that makes the use of the test_table_idx:

select distinct date_info from test.test_table where string_data = 'some_val';

The issue is that first time around it could take up to 20 seconds to run the query and < 2 seconds on any subsequent runs.

Is there a way to pull load the entire index into memory rather then have DB load information on first access?

Best Answer

You could use the additional module pg_prewarm. Has to be installed once per database. See:

It can "prewarm" tables as well as indexes. To do it for your index:

SELECT pg_prewarm('test.test_table_idx');

Unless you get index-only scans (which you do not with the index at hand), you might want to prewarm the table as well:

SELECT pg_prewarm('test.test_table');

There are more parameters to narrow down what and how to prewarm. Follow the links.

This is costly and the system might be better off using the cache for something else. If you know the exact query ahead of time and it's a SELECT without side effects, you might just run the query to "prewarm" relevant data pages of index and table.

Aside, you might be better off rearranging PK and index like this:

    ...
    primary key(string_data, date_info);
    create index test_table_idx on test.test_table(date_info);

Now, the PK index can give you index-only scans for the query at hand. Might make a substantial difference. See: