I'm in a bit of a pickle here. I managed to copy the whole PostgreSQL installation directory to a backup drive, but how would I retrieve the data from it on Linux, assuming the same architecture and version? Is this possible? I wasn't able to take a regular backup.
PostgreSQL Backup – How to Backup from Windows Hard Drive to Linux
backup, postgresql
Related Solutions
You asked:
how will PostgreSQL handle recovery when the pg_data contents include some files that are inconsistent?
pg_start_backup() ensures that every data file is at least as new as the checkpoint it records. On recovery, the logs are applied: if a data page is older than the log record, the log updates it; if the page is already as new, the log carries the same content, so there is no harm in writing it again. The data are never newer than the log, because the log is written ahead of the data (that is what write-ahead log, WAL, means).
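To make this concrete, a minimal file-level base backup built around these calls might look like the following sketch (the label, paths and user are placeholders; the copy runs while the database stays online):
% psql -U postgres -c "SELECT pg_start_backup('nightly_base');"
% rsync -a /var/lib/postgresql/9.1/main/ /mnt/backup/base/
% psql -U postgres -c "SELECT pg_stop_backup();"
Any data files caught mid-write by the copy are exactly the inconsistencies that WAL replay repairs on recovery.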
You asked:
... xfs_freeze ...
xfs_freeze is like pg_start_backup(): it doesn't take a snapshot by itself. You need a volume manager to do that.
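As an illustration, a hypothetical snapshot sequence on XFS over LVM (the mount point, volume group and volume names are assumptions):
% xfs_freeze -f /var/lib/postgresql
% lvcreate --snapshot --size 10G --name pg_snap /dev/vg0/pg_data
% xfs_freeze -u /var/lib/postgresql
The freeze only blocks writes while the volume manager takes the snapshot; the snapshot itself comes from lvcreate.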
You asked:
... why are CREATE TABLESPACE and CREATE DATABASE statements unsupported if the WAL can replay everything?
They are supported; there are just a few little gotchas. See http://www.postgresql.org/docs/8.1/static/backup-online.html :
23.3.5. Caveats
CREATE TABLESPACE commands are WAL-logged with the literal absolute path, and will therefore be replayed as tablespace creations with the same absolute path. This might be undesirable if the log is being replayed on a different machine. It can be dangerous even if the log is being replayed on the same machine, but into a new data directory: the replay will still overwrite the contents of the original tablespace. To avoid potential gotchas of this sort, the best practice is to take a new base backup after creating or dropping tablespaces.
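As a hypothetical illustration of that caveat:
mydb=# CREATE TABLESPACE fastspace LOCATION '/mnt/ssd/pgdata';
CREATE TABLESPACE
Replaying the WAL for this statement on another machine, or into a new data directory on the same machine, will write into the literal path /mnt/ssd/pgdata, whatever currently lives there.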
This answer has two sections: first, what is acceptable to see in the logs after the restore, and second, a few examples of what is not. The first section should be fairly deterministic, while the second is basically a random assortment of whatever happened to us that indicated we had a problem.
Acceptable log output
at the start:
2015-07-23 06:51:24 UTC LOG: database system was interrupted; last known up at 2015-07-23 02:10:42 UTC
It's important to see that the restoring PostgreSQL knows when it was last up; as I understand it, that means it is starting from a known checkpoint.
xlog min recovery request ... is past current point
Right at the beginning, a few of these can happen:
2015-07-23 06:51:30 UTC WARNING: xlog min recovery request 1027/B0A28D98 is past current point 1027/2BE36DA8
2015-07-23 06:51:30 UTC CONTEXT: writing block 0 of relation base/117264/9551898_vm
xlog redo insert: rel 1663/117264/8310261; tid 68622/40
But according to http://www.postgresql.org/message-id/CAB7nPqTd43hqpuC+M8fo+xkqHv1WtFe_16NUttu1pHcBtZhZmw@mail.gmail.com this warning is harmless.
FATAL: the database system is starting up
Any number of these can happen:
2015-07-23 06:51:24 UTC FATAL: the database system is starting up
These should actually be harmless; in our case they were the result of automated SELECT 1 ping-like queries that scripts run to check whether PostgreSQL is ready yet.
unexpected pageaddr ... in log file ..., segment ..., offset ...
At the end, there's this:
2015-07-23 06:52:21 UTC LOG: restored log file "0000000100001027000000B2" from archive
2015-07-23 06:52:21 UTC LOG: consistent recovery state reached at 1027/B2F8F2F8
sh: 1: cannot open ../../../wal_backup/0000000100001027000000B3: No such file
2015-07-23 06:52:21 UTC LOG: unexpected pageaddr 1027/AA000000 in log file 4135, segment 179, offset 0
2015-07-23 06:52:21 UTC LOG: redo done at 1027/B2F8F2F8
2015-07-23 06:52:21 UTC LOG: last completed transaction was at log time 2015-07-23 02:17:33.842307+00
But according to http://www.postgresql.org/message-id/CAGrpgQ-BbXUNErrAtToYhRyUef9_GdUQz1T3CXbpTMLTnuKANQ@mail.gmail.com that's also harmless
Note that more WAL restorations may appear after that point:
2015-07-23 06:52:21 UTC LOG: restored log file "0000000100001027000000B2" from archive
That would merely mean that you supplied more WAL files via recovery.conf than strictly necessary.
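For reference, the sh: 1: cannot open ... lines in these excerpts are emitted by the restore_command; ours came from a recovery.conf along these lines (only the wal_backup path is taken from the logs above, the rest of the sketch is an assumption):
# recovery.conf, placed in the data directory before starting the server
restore_command = 'cp ../../../wal_backup/%f %p'
Here %f expands to the requested WAL file name and %p to the path where PostgreSQL wants it placed.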
00000002.history: No such file
At the very end of the WAL unroll process there's this:
sh: 1: cannot open ../../../wal_backup/00000002.history: No such file
2015-07-23 06:52:21 UTC LOG: selected new timeline ID: 2
sh: 1: cannot open ../../../wal_backup/00000001.history: No such file
2015-07-23 06:52:21 UTC LOG: archive recovery complete
This is apparently (and hopefully) irrelevant: a missing history file for the new timeline is expected here, because this is the point where the restored database (clone) starts a new life on its own timeline.
Unacceptable log output
at the start:
2015-07-20 12:38:31 UTC LOG: database system was interrupted while in recovery at log time 2015-07-20 01:41:22 UTC
This is critical: it means that the backup copy was not started at the right time, that is, right after a pg_start_backup(...) checkpoint. Instead, the database was working normally and the copy caught it at some arbitrary point, which makes this restore more akin to restoring a crashed database.
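One quick sanity check on a copied data directory is whether it contains a backup_label file; pg_start_backup() writes one at the start of the backup window, so its absence suggests the copy was taken from a normally running cluster (the path matches the later examples; adjust to your layout):
% sudo ls -lah /var/lib/postgresql/9.1/main/backup_label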
missing chunk in pg_toast...
This indicates that the restore wasn't right. As a quick fix, we tried the recipe from http://postgresql.nabble.com/select-table-indicate-missing-chunk-number-0-for-toast-value-96635-in-pg-toast-2619-td5682176.html
mydb=# vacuum analyze mytable; -- trigger the error to see the problem toast
ERROR: missing chunk number 0 for toast value 13044178 in pg_toast_2619
mydb=# reindex table pg_toast.pg_toast_2619;
REINDEX
This would sometimes get the table back into a working state, and sometimes it wouldn't. After poking at it some more, we found that in our case it was just pg_statistic, whose contents are disposable (ANALYZE rebuilds them):
mydb=# reindex table pg_statistic;
ERROR: could not create unique index "pg_statistic_relid_att_inh_index"
DETAIL: Key (starelid, staattnum, stainherit)=(884792, 34, f) is duplicated.
mydb=# delete from pg_statistic;
DELETE 188540
mydb=# reindex table pg_statistic;
REINDEX
mydb=# vacuum analyze mytable;
VACUUM
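As a side note, the number in a pg_toast_NNNN name is the OID of the owning table, so a regclass cast shows which table a broken TOAST table belongs to; 2619 is indeed pg_statistic:
mydb=# SELECT 2619::regclass;
   regclass
--------------
 pg_statistic
(1 row)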
right sibling's left-link doesn't match
CREATE TABLE "myschema"."mytable" ( ... )
ERROR: right sibling's left-link doesn't match: block 27 links to 21379 instead of expected 21393 in index "pg_depend_reference_index"
We tried to quickly bypass this by doing the following (note that zero_damaged_pages makes PostgreSQL zero out damaged pages, permanently discarding whatever was on them, so this is itself a form of data loss):
mydb=# set zero_damaged_pages=on;
SET
mydb=# reindex table pg_depend;
REINDEX
mydb=# set zero_damaged_pages=off;
SET
could not read block in file ...
2015-05-12 13:32:53 UTC ERROR: could not read block 76408 in file "pg_tblspc/4606764/PG_9.1_201105231/117264/4614269": read only 0 of 8192 bytes
This was obviously a bummer, and we couldn't quickly hack our way around it:
mydb=# select cl.relfilenode, nsp.nspname as schema_name, cl.relname, cl.relkind from pg_class cl join pg_namespace nsp on cl.relnamespace = nsp.oid where relfilenode = 4614269;
relfilenode | schema_name | relname | relkind
-------------+-------------+---------+---------
4614269 | myschema | mytable | r
(1 row)
mydb=# select pg_relation_filepath('myschema.mytable');
pg_relation_filepath
---------------------------------------------------
pg_tblspc/4606764/PG_9.1_201105231/117264/4614269
(1 row)
% sudo ls -lah /var/lib/postgresql/9.1/main/pg_tblspc/4606764/PG_9.1_201105231/117264/4614269
-rw------- 1 postgres postgres 597M May 11 19:22 /var/lib/postgresql/9.1/main/pg_tblspc/4606764/PG_9.1_201105231/117264/4614269
Note that block 76408 at 8 kB per block lands right at the reported 597 MB file size, so the read failed exactly at end-of-file; the file was evidently truncated. That was a good indicator that too much data was getting "lost".
duplicate key value violates unique constraint "pg_type_typname_nsp_index"
This was another indicator that the restore was broken:
CREATE TABLE "myschema"."mytable" ( ... )
ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index" DETAIL: Key (typname, typnamespace)=(mytable_mycolumn_seq, 3780903) already exists.
The quick hack for this was to move the sequence position:
SELECT setval('mytable_id_seq', (SELECT MAX(id) FROM mytable));
Best Answer
There's no way to dump your Windows PostgreSQL database directory using Linux directly.[1]
You need to start up a virtual machine running Windows, then copy the database directory to that virtual machine. You can then pg_dump -Fc it, and pg_restore the dump into your new blank PostgreSQL install on the Linux machine.
You must make sure that if your PostgreSQL data directory was from 64-bit PostgreSQL, you install a 64-bit Windows version and install 64-bit PostgreSQL on it. If your PostgreSQL data dir is 32-bit, it doesn't matter whether you install 32-bit or 64-bit Windows, but you must install 32-bit PostgreSQL.
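A minimal sketch of that round trip (user, database and file names are placeholders):
# on the Windows VM, once PostgreSQL is running on the copied data directory
pg_dump -Fc -U postgres -f mydb.dump mydb
# on the Linux machine, restoring into a freshly created, empty database
pg_restore -U postgres -d mydb mydb.dump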
Licensing restrictions mean that you can't just download a ready-made virtual machine image for Vagrant or whatever to run Windows. You will need a Windows install DVD or ISO image. It's possible to download these legally from Microsoft's digital distribution system if you don't already have a DVD or ISO. You can install Windows without a license key, but it'll disable itself after a few days, and it's not legal to use it without possessing a license.
For that reason, unless the database is really huge, one of the best options is to sign up for Amazon EC2 (if you haven't already) and launch a Windows Server virtual machine there. You can install PostgreSQL, copy your database to that virtual machine, run pg_dump, and copy the dump back to your computer. Then just shut down the VM. You can use EC2 for free if you use a micro instance.
[1] It's possible you might be able to run PostgreSQL under WINE, but I suspect that'd be even more complicated, if it works at all, and probably not particularly trustworthy. So just use Windows.