Postgresql – Postgres auto analyze performance

autovacuumpostgresqlpostgresql-9.4

On postgresql 9.4 I have a table that never has any updates or deletes, only inserts and selects. It therefore never gets auto vacuumed, but it does still get auto analyzed and when it does it can take over 100 seconds. For that 100 seconds and for a significant amount of time afterwards DB performance is terrible, with my app reporting db response times in the 40-150 second range instead of the normal 2ms range.

I've always heard that you shouldn't disable auto vacuum but when I set autovaccum_enabled=false for this table the db performs much more reliably. Is there a reason to have it enabled on this table? Will the query planner eventually suffer from not having auto analyze running on a table that never gets updates or deletes? Is there a better way to solve this?

EDIT: Potentially modifying autovacuum_analyze_threshold and autovacuum_analyze_scale_factor for this table to make analyze happen more frequently would be better?

Best Answer

I think your best option is to use your favorite OS scheduling tool to run ANALYZE <table> explicitly and pre-emptively at the time of your choosing, say at 3 a.m.

PostgreSQL's autovacuum and autoanalyze are activity-counter driven. This has the unfortunate effect that the time when the activity counters cross the thresholds is mostly likely to be exactly when the database is most active, which is when you least want those maintenance tasks to run. By running the task preemptively during a off-peak time you will reset the activity counters, so that the auto-versions rarely will find work to do. But in case of something going wrong, they are still there to save your bacon.

Will the query planner eventually suffer from not having auto analyze running on a table that never gets updates or deletes?

This can't be answered with the information you provide. If that table gets nothing but single-row lookups via a unique index, for example, out-of-date statistics are unlikely to be a problem. If it gets complicated analytics queries, it is likely to be much more of a problem.

Related Solutions

Postgresql – Optimizing queries on a range of timestamps (one column)

Two things that are very odd here:

The query select 300k rows from a table with 1M+ rows. For 30 % (or anything over 5 % - depends on row size and other factors) it doesn't typically pay to use an index at all. We should see a sequential scan.

The exception would be index-only scans, which I don't see here. The multicolumn index @Craig suggested would be the best option if you get index-only scans out of it. With lots of updates like you mentioned, this may not work out, in which case you are better off without the additional columns - and just the index you already have. You might be able to make it work for you with more aggressive autovacuum settings for the table. You can adjust parameters for individual tables.
While Postgres is going to use the index I would certainly expect to see a bitmap index scan for that many rows, not a plain index scan, which is typically the better choice for a low percentage of rows. As soon as Postgres expects multiple hits per data page (judging from its statistics on the table) it will typically switch to a bitmap index scan.
- Keep PostgreSQL from sometimes choosing a bad query plan

Judging from that I would suspect that your cost settings are inadequate (and possibly the table statistics, too). You may have set random_page_cost and / or cpu_index_tuple_cost too low, relative to seq_page_cost. Follow the links and read the manual.

Would also fit with the observation that cold cache is large factor, as we worked out in comments. Either you are accessing (parts of) tables that nobody has touched in a long time or you are running on a test system where the cache isn't populated (yet)?
Else you just don't have enough RAM available to cache most of the relevant data in your DB. Consequently, random access is much more expensive than sequential access when data resides in cache. Depending on the actual situation you may have to adjust to get better query plans.

One other factor has to be mentioned for slow response on the first read only: Hint bits. Read details in the Postgres Wiki and this related question:

Why does a SELECT statement dirty cache buffers in Postgres?

Or the table is extremely bloated, in which case an index scan would make sense and I would refer back to CLUSTER / pg_repack in my previous answer that you quoted. (Or just VACUUM FULL) And investigate your VACUUM settings. Those are important with many inserts and updates every day.

Depending on UPDATE patterns also consider a FILLFACTOR below 100. If you mostly update only newly added rows, set the lower FILLFACTER after compacting your table, so that only new pages keep some wiggle room for updates.

Schema

campaign_id is 99%+ NULL and dt_updated is 0% NULL.

Adjust the sequence of columns slightly, to save 8 bytes per row (in the 99% of cases where campaign_id is NULL):

CREATE TABLE traffic (
    uuid_self       uuid not null REFERENCES ... ,
    uuid_partner    uuid not null REFERENCES ... ,
    id              serial PRIMARY KEY,
    impressions     integer NOT NULL DEFAULT 1,
    clicks          integer NOT NULL DEFAULT 0,
    campaign_id     integer,
    dt_created      DATE DEFAULT CURRENT_DATE NOT NULL,
    dt_updated      TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
);

Detailed explanation and links to more:

Configuring PostgreSQL for read performance

To measure:

Measure the size of a PostgreSQL table row

PostgreSQL – Slow Queries and Autovacuum Frequency

vacuum: Clears out deleted records from disk

Not exactly. VACUUM marks rows that aren't visible to any active transaction any more as dead (and ready for re-use). It does not shrink the physical file size that represents the table, except for completely dead / empty pages at the physical end. The manual explains everything and probably better than I could recap here.

analyse: Updates the query planner

It's officially ANALYZE, with Z, but ANALYSE is accepted as alternative spelling, too. And it updates statistics used by the query planner. Again, the manual already provides the best explanation.

In postgres.conf ... my presumption is that even though it's commented out, it should still be on?

That's correct. Again, consider details in the manual.

With that many write operations (a few thousand records a minute) your system is in bad need of regular VACUUM / ANALYZE runs. You have read the manual by now, so you understand the consequences. If your table ...

hadn't been autovacuumed for more than a week.

... then that's bad for multiple reasons. Also consider @Daniel's answer how this may have transpired with a huge table like yours. Or maybe the high load constantly locks the table and never lets VACUUM do its work. Again, it's all documented in the manual. Here is a related case with good answers how to tune settings:

Aggressive Autovacuum on PostgreSQL

Remember that you can have per-table settings (STORAGE parameters) to fine tune for special needs of a special table and leave the rest of the system alone.

If you are mostly updating recently inserted rows, a FILLFACTOR below 100 may be very helpful. You can compact the table (once) with CLUSTER or pg_repack an then set the FILLFACTOR below 100. And for huge tables it may also help to set a higher STATISTICS target for key columns with irregular data distribution.

Also, if old rows are updated seldom, partitioning might be a good solution, to treat old sections differently. It really depends on the complete picture ...

Also, don't forget indices, those can get bloated, too. Only keep indices that you actually need.

To see a count of dead and live tuples:

What is the meaning of n_live_tup and n_dead_tup in pg_stat_user_tables

Best Answer

Related Solutions

Postgresql – Optimizing queries on a range of timestamps (one column)

Schema

PostgreSQL – Slow Queries and Autovacuum Frequency

Related Question