Eelke is almost certainly correct that your locking is blocking autovacuum. Autovacuum deliberately gives way to user activity; if those tables are locked, autovacuum cannot vacuum them.
For posterity, though, I wanted to give an example set of settings for hyper-aggressive autovacuum, since the settings you gave don't quite do it. Note, however, that making autovacuum more aggressive is unlikely to solve your problem. Also note that the default autovacuum settings are based on over 200 test runs with DBT2 seeking an optimal combination of settings, so the defaults should be assumed good unless you have a solid reason to think otherwise, or unless your database is significantly outside the mainstream for OLTP databases (e.g. a tiny database that gets 10K updates per second, or a 3TB data warehouse).
First, turn on logging so you can check up on whether autovacuum is doing what you think it is:
log_autovacuum_min_duration = 0
Then let's make more autovac workers and have them check tables more often:
autovacuum_max_workers = 6
autovacuum_naptime = 15s
Let's lower the thresholds so auto-vacuum and auto-analyze trigger sooner:
autovacuum_vacuum_threshold = 25
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_threshold = 10
autovacuum_analyze_scale_factor = 0.05
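For reference, autovacuum triggers a vacuum once dead rows exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples (and analogously for analyze). With the values above, a 100,000-row table gets vacuumed after roughly 25 + 0.1 * 100,000 = 10,025 dead rows, and analyzed after about 10 + 0.05 * 100,000 = 5,010 changed rows.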
Then let's make autovacuum less interruptible, so it completes faster, but at the cost of a greater impact on concurrent user activity:
autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_cost_limit = 1000
There's your full program for generically aggressive autovacuum, which might be appropriate for a small database with a very high rate of updates, but might have too great an impact on concurrent user activity.
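If it helps, a minimal sketch of one way to apply settings like these without editing postgresql.conf by hand (ALTER SYSTEM requires PostgreSQL 9.4 or later; on older versions, change postgresql.conf and reload instead):

-- Apply a couple of the settings above cluster-wide.
ALTER SYSTEM SET autovacuum_naptime = '15s';
ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.1;
-- Note: autovacuum_max_workers only takes effect after a full server restart.
-- The other settings are picked up on a configuration reload:
SELECT pg_reload_conf();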
Also, note that autovacuum parameters can be adjusted per table, which is almost always the better answer when autovacuum's behavior needs adjusting.
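For example, a minimal sketch of per-table tuning (the table name is hypothetical; ALTER TABLE ... SET for autovacuum parameters exists since PostgreSQL 8.4, and before that the pg_autovacuum catalog served this purpose):

-- Make autovacuum much more eager on one hot table only (hypothetical table).
ALTER TABLE hot_queue_table SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_vacuum_threshold = 25
);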
Again, though, it's unlikely to address your real problem.
On my PostgreSQL (8.3) I use this trick:
- I get the table's disk size using pg_total_relation_size() - this includes indexes and TOAST size, which is what VACUUM processes. This gives me an idea of how many bytes the VACUUM has to read.
- I run VACUUM on the table.
- I find the pid of the VACUUM process (in pg_catalog.pg_stat_activity); see the query sketch after this list.
- In a Linux shell I run while true; do cat /proc/123/io | grep read_bytes; sleep 60; done (where 123 is the pid) - this shows me the bytes the process has read from disk so far.
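For concreteness, minimal SQL sketches of steps 1 and 3 (the table name is made up; on 8.3 the pg_stat_activity columns are procpid and current_query, renamed to pid and query in 9.2):

-- Step 1: on-disk size of the table, including indexes and TOAST.
SELECT pg_total_relation_size('my_big_table');

-- Step 3: find the backend running the VACUUM (9.2+ column names).
SELECT pid, query
FROM pg_catalog.pg_stat_activity
WHERE query LIKE 'VACUUM%';

On PostgreSQL 9.6 and later, the pg_stat_progress_vacuum view reports VACUUM progress directly, which makes the /proc trick unnecessary.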
That loop gives me a rough idea of how many bytes the VACUUM processes (reads) every minute. I presume that the VACUUM must read through the whole table (including indexes and TOAST), whose disk size I know from step 1.
I presume that the table is large enough that the majority of its pages must be read from disk (they are not present in Postgres shared memory), so the read_bytes field is good enough to be used as a progress counter.
Every time I did this, the total bytes read by the process was within 5% of the total relation size, so I guess this approach may be good enough for you.
Best Answer
Yes, it resets the value. But it is from the stats collector, not vacuum.