Postgresql – Postgres 10.3 : table grows in size after delete-insert

database-sizedeleteindex-bloatinsertpostgresql-10

I'm a rather new to Postgres and the issue I see seems a bit strange. Maybe you as experts know something about it.

I have a table (around 3 million rows and 200 columns). In order to populate fresh data this table gets dropped and fully reloaded on a regular basis. I didn't like this approach with full reload so I changed the process a little bit: now the records that require recalculation are deleted and loaded from scratch – other data stays intact. This approach is significantly faster than the full reload, however after deletion and insertion the table bloats horribly (3-5 times depending on the number of records deleted-inserted).

I do understand that the data is not physically removed when DELETE statement is executed. I know that dead tupples remain. My expectation was that if I dropped all the indexes before deleting data, partially deleted data, did VACUUM ANALYZE, reloaded the data and built indexes once again (exactly in this sequence) neither the table itself nor indexes would bloat. But apparently, I was wrong.

Is there anything substantial that I am missing here?

UPDATE: Funny thing is that "SELECT relname, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'table_name' " shows that there are no dead tuples. Now I'm puzzled even more.

UPDATE 2: I figured why the reason behind the bloating. There's another step after the "delta" reload that updates a couple of columns for the whole table (because they can't be loaded in delta mode). This update makes the table grow in size.

Best Answer

VACUUM will only shrink the data file if the free space happens to be at the end. What you probably want is a VACUUM FULL. However be aware of a couple of caveats:

  1. It will lock the table while it runs (which may be a while)
  2. It will require equivalent temporary space to the table

But when it is finished, the final space used will be just enough for the actual data you have.