You asked:
How will PostgreSQL handle recovery when the data directory (PGDATA) contains some files that are inconsistent with each other?
pg_start_backup()
ensures the data files are at least as new as the checkpoint. On recovery, the logs are applied.
If a data page is older than the log, the log will update it.
If a data page is newer, the log carries the same content, so there is no harm in writing it again.
The data is never newer than the log, because the log is written ahead of the data (hence write-ahead log, WAL).
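To make the argument concrete, here is a toy model of LSN-guarded replay (my own sketch, not PostgreSQL internals): each page remembers the LSN of the last record applied to it, and replay only applies records the page hasn't seen yet, so pages copied before and after the changes both converge to the same state.

```python
# Toy model of LSN-guarded WAL replay (illustration only, not PostgreSQL code).
# Each page remembers the LSN of the last WAL record applied to it; replay
# skips records the page has already seen, so old and new pages both converge.

class Page:
    def __init__(self, lsn=0, data=None):
        self.lsn = lsn          # LSN of the last record applied to this page
        self.data = data or {}

def replay(page, wal):
    """Apply every WAL record whose LSN is newer than the page's LSN."""
    for lsn, key, value in wal:
        if lsn > page.lsn:      # older pages catch up with the log...
            page.data[key] = value
            page.lsn = lsn
        # ...newer pages skip records they already contain; no harm either way
    return page

wal = [(1, "a", "x"), (2, "b", "y"), (3, "a", "z")]

old_page = Page(lsn=0)                             # copied before the changes
new_page = Page(lsn=3, data={"a": "z", "b": "y"})  # copied after the changes

replay(old_page, wal)
replay(new_page, wal)
print("both pages converge:", old_page.data == new_page.data)
```

Real replay is more involved (full-page images, torn-page protection), but the invariant is the same: a page is never ahead of the log.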
You asked:
... xfs-freeze
...
xfs_freeze
is like pg_start_backup()
in that it doesn't take a snapshot by itself. You need a volume manager to do that.
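For illustration, a typical sequence combining pg_start_backup() with a filesystem freeze and an LVM snapshot looks roughly like this. This is a dry-run sketch that only prints the commands it would run; the paths and volume names are hypothetical.

```python
# Dry-run sketch of a snapshot-based base backup: prints the commands only.
# Paths and LVM volume names below are made-up examples, not a real setup.
steps = [
    'psql -c "SELECT pg_start_backup(\'snap\', true)"',
    "xfs_freeze -f /var/lib/pgsql",    # quiesce the filesystem
    "lvcreate --snapshot --size 5G --name pgsnap /dev/vg0/pgdata",  # the snapshot
    "xfs_freeze -u /var/lib/pgsql",    # thaw as soon as possible
    'psql -c "SELECT pg_stop_backup()"',
]
for cmd in steps:
    print("+", cmd)
```

Note that the snapshot itself comes from the volume manager (lvcreate); xfs_freeze only holds the filesystem still for that instant.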
You asked:
... why are CREATE TABLESPACE & CREATE DATABASE statements unsupported if the WAL can replay everything?
They are supported; there are just a few little gotchas. See http://www.postgresql.org/docs/8.1/static/backup-online.html :
23.3.5. Caveats
CREATE TABLESPACE commands are WAL-logged with the literal absolute
path, and will therefore be replayed as tablespace creations with the
same absolute path. This might be undesirable if the log is being
replayed on a different machine. It can be dangerous even if the log
is being replayed on the same machine, but into a new data directory:
the replay will still overwrite the contents of the original
tablespace. To avoid potential gotchas of this sort, the best practice
is to take a new base backup after creating or dropping tablespaces.
If I recall correctly, PostgreSQL now zeroes the end of the WAL segment after an archive timeout, before archiving the file. (Update: this is true as of 9.4; on prior versions you still need to run the pg_clearxlogtail utility to zero the end of each WAL segment as part of your archive script, before you gzip your archive files. In 9.4 and above you just gzip the file, with no need for pg_clearxlogtail.)
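The reason zeroing the tail matters is simple: the unused portion of a recycled segment is full of stale junk that compresses poorly, while a zeroed tail compresses to almost nothing. A quick illustration with zlib (sizes here are scaled down and arbitrary, not real WAL segment sizes):

```python
# Demonstrate why a zero-filled segment tail compresses far better than a
# recycled tail full of old data. Sizes are scaled down for the demo.
import os
import zlib

SEG = 256 * 1024    # scaled-down stand-in for a 16 MB WAL segment
USED = 16 * 1024    # pretend only the first 16 kB hold real records

records  = os.urandom(USED)                    # stand-in for real WAL records
zeroed   = records + b"\x00" * (SEG - USED)    # tail zeroed (pg_clearxlogtail / 9.4+)
recycled = records + os.urandom(SEG - USED)    # tail still holds old junk

z = len(zlib.compress(zeroed))
r = len(zlib.compress(recycled))
print("zeroed tail compressed:  ", z, "bytes")
print("recycled tail compressed:", r, "bytes")
```

The zeroed version compresses to little more than the size of the real records; the recycled version barely compresses at all.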
The change was made in commit 9a20a9b. It first appears in tag REL9_4_BETA1, so it will ship in 9.4:
commit 9a20a9b
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon Jul 8 11:23:56 2013 +0300
Improve scalability of WAL insertions.
This patch replaces WALInsertLock with a number of WAL insertion slots,
allowing multiple backends to insert WAL records to the WAL buffers
concurrently. This is particularly useful for parallel loading large amounts
of data on a system with many CPUs.
This has one user-visible change: switching to a new WAL segment with
pg_switch_xlog() now fills the remaining unused portion of the segment with
zeros. This potentially adds some overhead, but it has been a very common
practice by DBA's to clear the "tail" of the segment with an external
pg_clearxlogtail utility anyway, to make the WAL files compress better.
With this patch, it's no longer necessary to do that.
BTW, you should really be using an archive timeout, rather than manually switching xlogs. Consider having a streaming replica for additional protection.
Update: As Jeff noted, pg_clearxlogtail hasn't been updated for 9.3. The incompatibility is very simple, though; the definition of XLogRecPtr changed from
typedef struct XLogRecPtr
{
    uint32  xlogid;   /* log file #, 0 based */
    uint32  xrecoff;  /* byte offset of location in log file */
} XLogRecPtr;
to
typedef uint64 XLogRecPtr;
when support for platforms with compilers that don't support 64-bit long integers was removed from PostgreSQL in 9.3. So all we need to do is add a macro or function that tests equality in a way that works for both.
The following patch adds support for 9.3, but is completely untested so use entirely at your own risk. If it eats your data then sets fire to your house, well, that's how things are. Patch here, as an attachment to the linked pgfoundry issue.
Best Answer
Short version: no. PostgreSQL doesn't support generating no WAL whatsoever, because changes to system catalog tables in the pg_catalog schema are always logged. There's also transaction ID logging (pg_clog), the multixact tracking data (pg_multixact), etc., in addition to the xlog, but they tend to be very small.
As Daniel says, you can run with minimal WAL generation by:
- setting wal_level = minimal
- using UNLOGGED or TEMPORARY tables
In this case, you will also want to set fsync = off, full_page_writes = off and synchronous_commit = off, since you have no requirement for durability and crash recovery.
Of course, if anything goes wrong (like power loss, an unplanned restart, a DB server crash, pg_ctl -m immediate stop, etc.) your data will be totally unrecoverable, but that seems to be what you want.
The rate of WAL creation and rotation with this configuration will be negligible for most applications.
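Pulled together, the low-durability settings above would look like this in postgresql.conf. This is a sketch, not a recommended production configuration; every line here trades crash safety for speed:

```ini
# Minimal-WAL, zero-durability sketch: data is unrecoverable after any crash.
wal_level = minimal
fsync = off
full_page_writes = off
synchronous_commit = off
```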
If you don't mind having to re-initdb after any restart (even a clean one) you could put pg_xlog on a tmpfs, but I doubt it's worth the hassle when Pg will generate only a tiny bit of WAL.
You should also check to make sure you don't have WAL archiving enabled (archive_mode = on) and that wal_keep_segments isn't set. Both of those are only useful if you're doing WAL-based backup or replication; if you aren't, WAL should not accumulate, it should be rapidly recycled. Maybe your checkpoints are set to be much too infrequent, so much so that you run out of disk? Check to see if the checkpoint_segments parameter is really, really high.
I think you're trying to solve the wrong problem here. The issue isn't stopping PostgreSQL from producing WAL; it's figuring out why it piled up in your batch jobs.
See also: Optimizing PostgreSQL for fast testing.