Postgresql – Postgres: Why does pg_stat_get_dead_tuples return 0 when there is still wasted space

disk-spacepostgresql

Using Postgres 11.1, I have been trying to determine/estimate the amount of 'wasted' space in a table that would be recovered by a VACUUM FULL. My plan was to use pg_stat_get_live_tuples (L), pg_stat_get_dead_tuples (D), and pg_total_relation_size (S), then estimate the wasted space as (D / (L+D)) * S.

However, this does not appear to work. While investigating why not, I did the following setup:

CREATE TABLE sam_silly(txt TEXT);
INSERT INTO sam_silly VALUES('one');
INSERT INTO sam_silly VALUES('two');
INSERT INTO sam_silly VALUES('three');
INSERT INTO sam_silly VALUES('four');
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;
INSERT INTO sam_silly SELECT txt || '+' FROM sam_silly;

Then I did an initial count of the space according to these functions. The results were as expected (there are a million i.e. 2^20 rows):

select pg_total_relation_size(c.oid) AS size, pg_stat_get_live_tuples(c.oid) AS live, pg_stat_get_dead_tuples(c.oid) AS dead
FROM pg_class c where relname='sam_silly';
  • 47308800, 1048576, 0

I then deleted half the table:

delete from sam_silly where txt like 'one%' or txt like 'three%';

Immediately repeating the query above gave the results I would expect:

  • 47308800, 524288, 524288

Half the data is alive, half dead.

Then I ran 'analyze' and this is where things got weird:

  • 47316992, 524288, 0

Space went up slightly but now there are no dead rows!

I tried 'vacuum' too:

  • 47316992, 524264, 0

Not much change.

Finally, after VACUUM FULL it behaves as I would expect again:

  • 23519232, 524264, 0

Now there are no dead tuples but also the space has been recovered.

The real database has autovacuum on, so it will probably vacuum the tables I'm interested in. But it looks like vacuuming causes it to set the number of dead tuples to 0 even though there is still a lot of 'wasted' space.

Is there a way to find out/estimate how much space is 'wasted' in this situation, when pg_stat_get_dead_tuples is returning zero even though it has not reclaimed the space used by the previously-reported dead tuples?

Exact version in case relevant: 'PostgreSQL 11.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit'

Best Answer

ANALYZE does not remove dead tuples. It looks like autovac kicked in and removed the dead tuples around the same time as you were doing the ANALYZE, and that is why the dead tuple count fell to zero. If you are careful to prevent that from happening, you will find that an ANALYZE does not reset the dead tuple count.

The space freed up by the autovacuum is available for internal reuse, and can be seen using the pg_freespacemap extension.

select avail,count(*) from pg_freespace('sam_silly') group by 1 order by 1;
 avail | count 
-------+-------
 3,648 |    22
 3,680 | 1,232
 3,712 | 1,153
 3,744 | 1,749
 3,776 | 1,233
 3,808 |   360
 3,840 |    20
 4,384 |     1
(8 rows)

Every page of this table is about half full and available for internal re-use. This is only bloat if you do not intend to re-use that space in the near future for new tuples (including updates of old tuples, which create new versions of the old tuples).

The purpose of the dead tuple count is to let the autovac launcher know when it might be advisable to launch an autovac. Pursuant to that goal, it records the number of tuples which are dead but not yet vacuumed away. And when they are vacuumed away, the vacuum resets that count downward. It is not there to show you bloat, which is why it does a poor job when forced into that purpose.

If you want to query for bloat, you can check out the suggestions at https://wiki.postgresql.org/wiki/Show_database_bloat