You asked:
how postgreSQL will handle the recovery with a pg_data content containing some files which are inconsistent.
pg_start_backup()
ensure the data file is at least as new as the checkpoint. On recovery, the logs are applied.
If the data is old, the log will update it..
If the data is new, the log will have same content. There is no hurt writing it again.
The data are never newer then the log, because the logs are write ahead (WAL).
You asked:
... xfs-freeze
...
xfs-freeze
is alike to pg_start_backup()
, it don't take a snapshot. You need a volume manager to do that.
You asked:
... why do create tablespace & create database statements are unsupported if the WAL can replay everything?
It is supported, just some little gotcha. See http://www.postgresql.org/docs/8.1/static/backup-online.html :
23.3.5. Caveats
CREATE TABLESPACE commands are WAL-logged with the literal absolute
path, and will therefore be replayed as tablespace creations with the
same absolute path. This might be undesirable if the log is being
replayed on a different machine. It can be dangerous even if the log
is being replayed on the same machine, but into a new data directory:
the replay will still overwrite the contents of the original
tablespace. To avoid potential gotchas of this sort, the best practice
is to take a new base backup after creating or dropping tablespaces.
The difference is that what you call "standard commands" have implicit transactions (as in "not explicit" and not real implicit transactions which mean something different), so every time you issue an INSERT
command without an explicit transaction, it will open a transaction, insert the data and automatically commit. This is called an autocommit transaction.
This is also why you can't rollback this INSERT
: it's already committed. So the rule is the same as explicit transactions: you can't rollback once they've been committed.
You can see what I mean directly from inside SQL Server.
Microsoft ships SQL Server with a DMF called sys.fn_dblog
that can be used to look inside the transaction log of a given database.
For this simple experiment I'm going to use the AdventureWorks database:
USE AdventureWorks2008;
GO
SELECT TOP 10 *
FROM dbo.Person;
GO
INSERT INTO dbo.Person (FirstName, MiddleName, LastName, Gender, Date)
VALUES ('Never', 'Stop', 'Learning', 'M', GETDATE());
COMMIT;
BEGIN TRAN;
INSERT INTO dbo.Person (FirstName, MiddleName, LastName, Gender, Date)
VALUES ('Never', 'Stop', 'Learning', 'M', GETDATE());
COMMIT;
GO
SELECT *
FROM sys.fn_dblog(NULL, NULL);
GO
Here I'm doing two inserts: one with and one without an explicit transaction.
In the log file you can see that there's absolutely no difference between the two:
The red one is the INSERT
within an autocommit transaction and the blue one is the INSERT
with an explicit transaction.
As for the 3rd party tools you mention, yes they analyse the database log and generate normal T-SQL code to "undo" or "redo" the operations. By normal I mean they don't do anything special other than generate a script that will have the effect of doing exactly the opposite of what is in the log file.
Best Answer
The uncommitted transactions are not escaped. They do get replayed. However, any process running across the new data from these transactions know to ignore them as they are not committed. And any data marked for removal by an uncommitted transaction is not ignored, because the removal has not yet been committed. This is the same way that in-progress transactions are ignored during normal operation, i.e., when there has been no crash.
Once the transaction is known to be aborted, then anyone who stumbles upon data from that transaction can clean it up. Once recovery has finished, any transaction which has not had its commit record replayed by the end of recovery is known to be aborted.