PostgreSQL – Backup and Restore Size Differences

postgresql

I have a WordPress database in PostgreSQL; phpPgAdmin shows this database with a size of 273 MB.

I ran sudo -u postgres pg_dump -Fp -f wordpress.sql wordpress to back up the database; the result is a file of just 18 MB.

I renamed the original database to wordpress2, and then:

I created a new database called wordpress and restored the dump with sudo -u postgres psql wordpress < wordpress.sql. The result is a database with a size of 19 MB. Is something missing?
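One way to check that nothing is actually missing is to compare per-table row counts between the two databases. A sketch, to be run in both wordpress and wordpress2 (run ANALYZE first so the statistics are fresh; matching counts suggest the restore is complete):

```sql
-- Approximate live row counts per table, from the statistics collector.
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
```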

Below is a screenshot of phpPgAdmin, with wordpress as the restored backup and wordpress2 as the original database.

[phpPgAdmin screenshot]

AutoVacuum

postgres=# select setting from pg_settings where name = 'autovacuum';
 setting
---------
 on
(1 row)

Indexes Space

postgres=# select pg_size_pretty(sum(pg_relation_size(indexrelid))) from pg_index;
 pg_size_pretty
----------------
 2600 kB
(1 row)
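Since the indexes account for only ~2.6 MB, the remaining space must be in the tables themselves. A query like the following (a sketch; run it inside the bloated database) shows where the 273 MB actually lives, counting heap, indexes, and TOAST together:

```sql
-- Ten largest tables by total on-disk size (heap + indexes + TOAST).
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;
```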

Database Sizes
Original

postgres=# select pg_size_pretty(pg_database_size('wordpress2'));
 pg_size_pretty
----------------
 273 MB
(1 row)

Restored

postgres=# select pg_size_pretty(pg_database_size('wordpress'));
 pg_size_pretty
----------------
 20 MB
(1 row)

I copied the original database:

CREATE DATABASE newdb WITH TEMPLATE wordpress2 OWNER postgres;

and performed a VACUUM FULL on this "newdb":

vacuumdb -f -d newdb -z -v

The result: "newdb" decreased from 273 MB to 17 MB.

postgres=# select pg_size_pretty(pg_database_size('newdb'));
 pg_size_pretty
----------------
 17 MB
(1 row)

Best Answer

I will answer my own question.

From: https://stackoverflow.com/questions/13505785/postgresql-database-size-is-less-after-backup-load-on-heroku

The reason for this is its MVCC system. Every time you UPDATE a record, the database creates another "version" of that record instead of rewriting the previous one. These "outdated" records are deleted by the VACUUM process once nothing needs them anymore.

I searched the PostgreSQL docs and found: Unlike traditional database systems which use locks for concurrency control, PostgreSQL maintains data consistency by using a multiversion model (Multiversion Concurrency Control, MVCC). This means that while querying a database each transaction sees a snapshot of data (a database version) as it was some time ago, regardless of the current state of the underlying data. This protects the transaction from viewing inconsistent data that could be caused by (other) concurrent transaction updates on the same data rows, providing transaction isolation for each database session.

The main advantage to using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading.
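The bloat that MVCC produces is easy to demonstrate. A minimal sketch (the table name mvcc_demo is hypothetical); after one full-table UPDATE, every row has a dead predecessor version that only VACUUM can reclaim:

```sql
-- Create a small table and update every row once.
CREATE TABLE mvcc_demo (id int, payload text);
INSERT INTO mvcc_demo SELECT g, 'x' FROM generate_series(1, 1000) g;
UPDATE mvcc_demo SET payload = 'y';  -- leaves 1000 dead row versions behind

-- Once the statistics collector catches up, n_dead_tup should be ~1000:
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'mvcc_demo';
```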

From: https://stackoverflow.com/questions/924993/postgresql-database-size-increasing If the dead tuples have stacked up beyond what can be accounted for in max_fsm_pages, a regular VACUUM will not be able to free everything. The end result is that the database will grow larger and larger over time as dead space continues to accumulate. Running a VACUUM FULL should fix this problem. Unfortunately it can take a very long time on a large database.

If you're running into this problem frequently, you either need to vacuum more often (autovacuum can help here) or increase the max_fsm_pages setting. When running VACUUM VERBOSE it will tell you how many pages were freed and give you a warning if max_fsm_pages was exceeded, this can help you determine what this value should be. See the manual for more information.
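To get the page-level detail the answer above refers to, VACUUM can be run with the VERBOSE option on a single table (wp_posts here is just an example name; the exact wording of the output, including any max_fsm_pages warning on pre-8.4 servers, varies by version):

```sql
-- Reports pages scanned, dead tuples removed, and pages that could not
-- be reclaimed, per table and per index.
VACUUM VERBOSE wp_posts;
```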

From: postgres backup / restore: restored database much smaller?

Maybe it's just index bloat. VACUUM FULL does not help with index bloat, on the contrary, as said in the doc for 8.4:

The FULL option is not recommended for routine use, but might be useful in special cases. An example is when you have deleted or updated most of the rows in a table and would like the table to physically shrink to occupy less disk space and allow faster table scans. VACUUM FULL will usually shrink the table more than a plain VACUUM would. The FULL option does not shrink indexes; a periodic REINDEX is still recommended. In fact, it is often faster to drop all indexes, VACUUM FULL, and recreate the indexes.

(in more recent versions, this advice has disappeared because VACUUM FULL has been reimplemented differently).

See Routine Reindexing and the REINDEX command.

The easiest way to reindex is to connect to the database with the db user that owns it and issue:

REINDEX DATABASE test_db;

Ideally, it should be done immediately after VACUUM FULL and the database should shrink to its lowest possible size at this point in time.
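Putting the pieces together, one possible shrink sequence looks like this (a sketch, using the original database from the question; VACUUM FULL takes exclusive locks on each table, so run it in a maintenance window, connected as the database owner):

```sql
-- Rewrite every table compactly, then rebuild the indexes,
-- then check the resulting database size.
VACUUM FULL VERBOSE;
REINDEX DATABASE wordpress2;
SELECT pg_size_pretty(pg_database_size('wordpress2'));
```

On 9.0 and later, VACUUM FULL rewrites the table (and its indexes) from scratch, so the separate REINDEX step matters mostly for older releases such as the 8.4 docs quoted above.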