Disk Space Not Released After Cleaning Up Rows in PostgreSQL 9.3

postgresqlpostgresql-9.3

I understand space isn't abruptly released (unless one does TRUNCATE). But this one looks abnormal to me.

What could be the reason for a postgresql (9.3.10) table not releasing space after having 70% of its records removed?

I have a table called user_sessions_session. As the name implies, it stores sessions data for each user of a web app.

It's original size was:

              Table               |  Size   | External Size 
----------------------------------+---------+---------------
 user_sessions_session            | 15 GB   | 13 GB

I thereafter deleted all user sessions older than 3 months ago. That was the majority of the rows in the table. This was 3 days ago. I just checked the table size again, here's what I see:

              Table               |  Size   | External Size 
----------------------------------+---------+---------------
 user_sessions_session            | 15 GB   | 13 GB

Moreover, select * from pg_stat_activity where query like 'autovacuum:%'; reveals no vacuuming is going on at this point.

Btw I've had this same issue with this same sessions table before – it's not a one off.


In case it matters, here's the SQL I used to get table sizes (user_sessions_session showed up as number 1 in the list):

SELECT                                                         
   relname as "Table",
   pg_size_pretty(pg_total_relation_size(relid)) As "Size",
   pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as "External Size"
   FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;

Best Answer

From the docs,

In PostgreSQL, an UPDATE or DELETE of a row does not immediately remove the old version of the row. This approach is necessary to gain the benefits of multiversion concurrency control (MVCC, see Chapter 13): the row version must not be deleted while it is still potentially visible to other transactions. But eventually, an outdated or deleted row version is no longer of interest to any transaction. The space it occupies must then be reclaimed for reuse by new rows, to avoid unbounded growth of disk space requirements. This is done by running VACUUM.

At all. Never. There are no exceptions. When ever you delete a row, you essentially mark it as inactive for your snapshot. Other snapshots started by prior transactions can still see it. This means it's still on disk.

Rows aren't stored in their own file on disk. In order to delete that row, even when you want, you have to essentially rewrite the files stored on disk without that row. DELETE does not do this and under normal circumstances that's perfectly fine because dead rows that aren't visible to a transaction can be reclaimed internally by a simply VACUUM. After being marked for reuse new data can be stored in the heap.

So essentially, only operations that internally cause the table's heap to be rewritten reclaim space. Those operations are:

  • CLUSTER
  • VACUUM FULL
  • TRUNCATE
  • ALTER TABLE (forms that rewrite the table)

All of these operations require an ACCESS EXCLUSIVE lock. And, TRUNCATE is such a sledge hammer it even violates MVCC.

@EvanCarroll: and so technically, I can actually go out of diskspace even after having removed a ton of rows here? – Hassan Baig 1 min ago

Yea, really and technically and everything else, but let's make one special caveat to the above. If you DELETE a bunch of rows and commit they get marked as inactive. When you VACUUM the table's heap space gets reclaimed. From that point, to run out of disk space you to either

  • Exhaust that reclaimed space on that table's heap because PostgreSQL will just store stuff in that space.
  • Or, exhaust a different table's heap.

But yes, VERY bad things can happen when you exhaust your disk space. And, they'll happen with every MVCC database vendor. Simply, don't run of disk space.