PostgreSQL – Duration of Reclaiming Disk Space on Very Large Tables

postgresql

I'm using PostgreSQL 9.4.20 and have 1 table which at 1 point contained 2.5 billion rows. We've started migrating data and reduced the table size down to about 280 million rows.

If possible we'd like to reclaim the used disk space.I read up on VACUUM FULL and know that it will result in a full table lock during the operation. My question here is: given that a VACUUM of the table will take about 12000 – 13000 seconds, is there any way to estimate how long a VACUUM FULL will take give the current and previous size of the table?

Included is one of the VACUUM VERBOSE outputs during the deletion process:

INFO:  vacuuming "public.redacted"
INFO:  scanned index "redacted_prop_id_3df0d554fdfe92ee_idx" to remove 85063130 row versions
DETAIL:  CPU 105.78s/340.93u sec elapsed 3633.81 sec.
INFO:  scanned index "redacted_pkey" to remove 85063130 row versions
DETAIL:  CPU 35.37s/269.95u sec elapsed 1147.31 sec.
INFO:  "redacted": removed 85063130 row versions in 811189 pages
DETAIL:  CPU 12.00s/17.68u sec elapsed 342.43 sec.
INFO:  scanned index "redacted_prop_id_3df0d554fdfe92ee_idx" to remove 39116682 row versions
DETAIL:  CPU 101.30s/346.25u sec elapsed 3576.71 sec.
INFO:  scanned index "redacted_pkey" to remove 39116682 row versions
DETAIL:  CPU 40.21s/237.76u sec elapsed 1219.00 sec.
INFO:  "redacted": removed 39116682 row versions in 879306 pages
DETAIL:  CPU 22.85s/40.58u sec elapsed 817.57 sec.
INFO:  index "redacted_prop_id_3df0d554fdfe92ee_idx" now contains 1597286875 row versions in 1
DETAIL:  121144639 index row versions were removed.
4169907 index pages have been deleted, 3055311 are currently reusable.
CPU 0.03s/0.00u sec elapsed 0.30 sec.
INFO:  index "redacted_pkey" now contains 1597377401 row versions in 5999520 pages
DETAIL:  124175137 index row versions were removed.
1209655 index pages have been deleted, 894966 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.17 sec.
INFO:  "redacted": found 107389052 removable, 820955017 nonremovable row versions in 7045430 oes
DETAIL:  0 dead row versions cannot be removed yet.
There were 13483069 unused item pointers.
0 pages are entirely empty.
CPU 382.60s/1408.62u sec elapsed 12543.69 sec.
INFO:  analyzing "public.redacted"
INFO:  "redacted": scanned 30000 of 15055610 pages, containing 3170455 live rows and 0 dead rosample, 1591104467 estimated total rows

Best Answer

I chose a different approach to reclaiming the diskspace. A lot of data was used for the pkey and prop_id_3df0d554fdfe92ee_idx indexes. It was possible to halt all writes to the table, during which time I issued a REINDEX command for the individual indexes.

After rebuilding the indexes enough diskspace was reclaimed to allow a secondary table to be created based on the contents of the original table. Afterwards the original table was deleted and the second table renamed.