I did a pg_dump
on a JIRA database that I was hosting in a PostgreSQL 8.3 server. Size of the database after vacuum full
was 217132652
(approximately 207 MB).
Then I restored that JIRA database on a PostgreSQL 9.4 server with the following command:
$ psql -X -v ON_ERROR_STOP=1 -d jira2 -U jira -h localhost < jiradb2017_03_12.sql
I assume the restore would exit on any error since I used ON_ERROR_STOP=1
, but the SQL script finished correctly (despite some warnings not related to data restore).
I ended up with a database with a size of 158019348
(approximately 151 MB).
So, what's the story here? Can I just assume database was restored successfully and PostgreSQL optimized its storage (somewhere between 8.3 and 9.4 versions) engine and is using space more efficiently?
Best Answer
When you restore a database you have all the information on it packed, with no empty space between rows (or in indices), unless some specific settings are in place (basically:
FILLFACTOR
for tables andFILLFACTOR
for indices).On the other hand, when your database has been in use for some time, and you've had your share of inserts, updates and deletes, free unused space will appear. This is because of the way PostgreSQL and Multiversion Concurrency Control, a.k.a. MVCC work. MVCC allows for less lockings, which basically means you save time. But you pay a price in terms of space:
UPDATE
is equivalent to anINSERT
together with aDELETE
, with the overhead (at least in terms of space used) associated to both.INSERT
ing,UPDATE
ing orDELETE
ing, you have simultaneously several copies of every row involved.Autovacuum takes care of this space being made reusable by default, or you could have some specific procedure for routine vacuuming.
This fact already can explain the size change.
Optimizations between versions probably also took place; and can explain further improvements. Optimizations could also have been made for speed and not for size, and the actual size could actually grow from one version to the next. I really don't know the specifics to be able to tell; although the comment from @Erwin states that both changes making your tables shrink and changes making your tables bloat (grow) have taken place since version 8.3.
To distinguish between the two effects, if you're curious, you could just, as @Jack Douglas suggests, restore your database on 8.3. It will most probably shrink in size. If it shrinks to less than 151 MB (a size smaller that what you get with 9.4 version), then the removal of unused space made your DB shrink, and the version change actually made your DB grow.
For a better understanding of MVCC, look at Bruce Momjian's presentation.