Postgresql – postgres table sizing inconsistency

performancepostgresqlpostgresql-performance

I've been doing some experimentation/testing with VACUUM ANALYSE and table sizing.

I've been using the queries here to get a figure for table size, and I've been using this to get approximate rowcount….

SELECT reltuples AS approximate_row_count FROM pg_class WHERE relname = 'TABLE_A'

I have one table with 22 thousand rows sized at 700 Mbytes or approximately 32k per row. The table consists of 4 timestamps, 4 integers, 2 booleans and one 6 character field as a unique primary key.

I have a 2nd table with a very similar structure, albeit only 18 thousand rows, but that occupies 6Mbytes or 333 bytes per row.

Why the enormous discrepancy? 32Kbytes per record for the 1st one, and 333bytes per row for the 2nd one!

The tests were run immediately after doing a VACUUM ANALYSE on each table and no other processes were accessing the database at the time. The only other factor that may be important is that the first table is continuously being updated, whereas the 2nd is largely static, however I would expect that after shutting down all other sessions and running a VACUUM ANALYSE, the way the tables are used wouldn't make any difference.

Best Answer

I would expect that after shutting down all other sessions and running a VACUUM ANALYSE, the way the tables are used wouldn't make any difference.

It's the VACUUM ANALYZE that does not make a difference on the size of tables. Because it does not immediately recover the bloat resulting from old versions of rows, it merely marks these rows are reusable.

From the documentation (VACUUM):

Plain VACUUM (without FULL) simply reclaims space and makes it available for re-use

Only VACUUM FULL would recover that space, by rewriting the table from scratch with only the live rows.