PostgreSQL – How to Perform Rollback Recovery

postgresql, recovery, write-ahead-logging

I'm familiar with the PostgreSQL WAL file and its purpose. Every transaction, along with its commit record, is logged in this file. During recovery, PostgreSQL replays the WAL from the last checkpoint to the end to bring the database to a consistent state. But there is a good chance that records from uncommitted transactions have made their way into the WAL files. How does PostgreSQL keep these uncommitted transaction records from being replayed? How is rollback recovery (UNDOing the uncommitted transactions) implemented in PostgreSQL?

Best Answer

The records of uncommitted transactions are not skipped. They do get replayed. However, any process that runs across new data from these transactions knows to ignore it, because it was never committed. Likewise, any data marked for removal by an uncommitted transaction is not ignored, because the removal was never committed. This is the same way that in-progress transactions are ignored during normal operation, i.e., when there has been no crash.

Once the transaction is known to be aborted, then anyone who stumbles upon data from that transaction can clean it up. Once recovery has finished, any transaction which has not had its commit record replayed by the end of recovery is known to be aborted.
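The idea can be sketched with a toy model. This is not PostgreSQL's actual on-disk format: the record tuples, the `Row` class, and the `visible` helper are all hypothetical names made up for illustration. Replay applies every record, committed or not, and visibility is decided afterwards from the set of transactions whose commit record was seen.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Row:
    value: str
    xmin: int                    # id of the transaction that inserted the row
    xmax: Optional[int] = None   # id of the transaction that deleted it, if any

def replay(wal):
    """Redo every record; return the rows and the set of committed transaction ids."""
    rows, committed = {}, set()
    for rec in wal:
        kind, xid = rec[0], rec[1]
        if kind == "insert":
            rows[rec[2]] = Row(rec[3], xmin=xid)
        elif kind == "delete":
            rows[rec[2]].xmax = xid
        elif kind == "commit":
            committed.add(xid)
    return rows, committed

def visible(row, committed):
    """A row counts only if its insert committed and no committed delete removed it."""
    inserted = row.xmin in committed
    deleted = row.xmax is not None and row.xmax in committed
    return inserted and not deleted

wal = [
    ("insert", 1, "a", "kept"),
    ("commit", 1),
    ("insert", 2, "b", "orphan"),  # transaction 2 never commits
    ("delete", 2, "a"),            # so its delete of "a" never takes effect either
]
rows, committed = replay(wal)
visible_keys = sorted(k for k, r in rows.items() if visible(r, committed))
print(visible_keys)  # ['a']
```

Both rows exist after replay, but only "a" is visible: the insert of "b" and the delete of "a" belong to a transaction with no commit record, so readers treat them as aborted.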

Related Solutions

Postgresql – How does PostgreSQL handle Checkpoints in the middle of a WAL-enabled backup

You asked:

How will PostgreSQL handle recovery when the pg_data content contains some files which are inconsistent?

pg_start_backup() ensures that the data files are at least as new as the checkpoint. On recovery, the logs are applied.

If the data is old, the log will update it.

If the data is new, the log will have the same content. There is no harm in writing it again.

The data is never newer than the log, because the log is written ahead of the data (hence write-ahead log, WAL).
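A minimal sketch of why replay tolerates an inconsistent copy, under a simplifying assumption: every log record here carries the complete contents of its page, so applying it again is harmless. The `replay` function and the record layout `(lsn, page_id, contents)` are hypothetical and much simpler than real WAL records.

```python
def replay(pages, wal):
    """Apply every record in log order; each record overwrites its whole page."""
    result = dict(pages)
    for _lsn, page_id, contents in wal:
        result[page_id] = contents
    return result

wal = [(1, "p1", "v1"), (2, "p2", "v1"), (3, "p1", "v2")]

old_copy = {"p1": "v0"}                # copied before any of the changes hit disk
new_copy = {"p1": "v2", "p2": "v1"}    # copied after all of them
mixed_copy = {"p1": "v1", "p2": "v1"}  # p1 copied part-way through

recovered = [replay(copy, wal) for copy in (old_copy, new_copy, mixed_copy)]
print(recovered[0] == recovered[1] == recovered[2])  # True
```

Whatever state each file was in when the backup copied it, replaying the same log from the checkpoint leads to the same final contents.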

You asked:

... xfs-freeze ...

xfs-freeze is similar to pg_start_backup() in that it does not take a snapshot. You need a volume manager to do that.

You asked:

... why are CREATE TABLESPACE and CREATE DATABASE statements unsupported if the WAL can replay everything?

They are supported, with a few gotchas. See http://www.postgresql.org/docs/8.1/static/backup-online.html :

23.3.5. Caveats

CREATE TABLESPACE commands are WAL-logged with the literal absolute path, and will therefore be replayed as tablespace creations with the same absolute path. This might be undesirable if the log is being replayed on a different machine. It can be dangerous even if the log is being replayed on the same machine, but into a new data directory: the replay will still overwrite the contents of the original tablespace. To avoid potential gotchas of this sort, the best practice is to take a new base backup after creating or dropping tablespaces.

SQL Server Transaction Log File – Detailed Contents Explained

The difference is that what you call "standard commands" run in implicit transactions (implicit in the sense of "not explicit", not SQL Server's actual implicit transactions mode, which means something different). Every time you issue an INSERT command without an explicit transaction, SQL Server opens a transaction, inserts the data, and commits automatically. This is called an autocommit transaction.

This is also why you can't roll back this INSERT: it's already committed. So the rule is the same as for explicit transactions: you can't roll them back once they've been committed.
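The same rule can be tried outside SQL Server. The sketch below is only an analogy: it uses SQLite through Python's standard `sqlite3` module, opened in autocommit mode, and the `person` table is made up for illustration. Error messages and numbers differ from SQL Server, but the behavior it shows is the same: an autocommitted INSERT leaves nothing to roll back, while an INSERT inside an explicit transaction can still be undone.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode:
# each statement commits on its own unless we issue BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE person (first_name TEXT)")

# Autocommit INSERT: committed as soon as the statement finishes.
conn.execute("INSERT INTO person VALUES ('autocommit')")
try:
    conn.execute("ROLLBACK")  # no transaction is open any more
    rollback_failed = False
except sqlite3.OperationalError:
    rollback_failed = True

# Explicit transaction: the INSERT stays uncommitted until COMMIT or ROLLBACK.
conn.execute("BEGIN")
conn.execute("INSERT INTO person VALUES ('explicit')")
conn.execute("ROLLBACK")

names = [row[0] for row in conn.execute("SELECT first_name FROM person")]
print(rollback_failed, names)  # True ['autocommit']
```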

You can see what I mean directly from inside SQL Server.

Microsoft ships SQL Server with a DMF called sys.fn_dblog that can be used to look inside the transaction log of a given database.

For this simple experiment I'm going to use the AdventureWorks database:

USE AdventureWorks2008;
GO

SELECT TOP 10 *
FROM dbo.Person;
GO

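-- Autocommit: no BEGIN TRAN, so SQL Server commits this statement on its own.
-- (A COMMIT here would fail, because there is no open transaction to commit.)
INSERT INTO dbo.Person (FirstName, MiddleName, LastName, Gender, Date)
VALUES ('Never', 'Stop', 'Learning', 'M', GETDATE());
GO

-- Explicit transaction: the same INSERT wrapped in BEGIN TRAN ... COMMIT.
BEGIN TRAN;
INSERT INTO dbo.Person (FirstName, MiddleName, LastName, Gender, Date)
VALUES ('Never', 'Stop', 'Learning', 'M', GETDATE());
COMMIT;
GO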

SELECT *
FROM sys.fn_dblog(NULL, NULL);
GO

Here I'm doing two inserts: one with and one without an explicit transaction.

In the log file you can see that there's absolutely no difference between the two:

Autocommit vs Explicit Transactions

The red one is the INSERT within an autocommit transaction and the blue one is the INSERT with an explicit transaction.

As for the 3rd party tools you mention, yes they analyse the database log and generate normal T-SQL code to "undo" or "redo" the operations. By normal I mean they don't do anything special other than generate a script that will have the effect of doing exactly the opposite of what is in the log file.
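To make "doing exactly the opposite" concrete, here is a rough sketch of the idea. It assumes the log has already been decoded into simple dictionaries, and that format, along with the `undo_statement` and `undo_script` helpers, is hypothetical. Real tools have to parse the binary log records first, and they handle many more operation types.

```python
def quote(value):
    """Render a value as a SQL string literal, doubling embedded quotes."""
    return "'" + str(value).replace("'", "''") + "'"

def undo_statement(op):
    """Return one SQL statement that reverses a single decoded log operation."""
    if op["kind"] == "insert":
        # The opposite of an INSERT is a DELETE of the row it created.
        where = " AND ".join(f"{col} = {quote(val)}" for col, val in op["key"].items())
        return f"DELETE FROM {op['table']} WHERE {where};"
    if op["kind"] == "delete":
        # The opposite of a DELETE is an INSERT of the row image kept in the log.
        cols = ", ".join(op["row"])
        vals = ", ".join(quote(v) for v in op["row"].values())
        return f"INSERT INTO {op['table']} ({cols}) VALUES ({vals});"
    raise ValueError(f"unsupported operation: {op['kind']}")

def undo_script(ops):
    """Undo in reverse order, so the most recent change is reversed first."""
    return [undo_statement(op) for op in reversed(ops)]

ops = [
    {"kind": "insert", "table": "dbo.Person",
     "key": {"FirstName": "Never", "LastName": "Learning"}},
    {"kind": "delete", "table": "dbo.Person",
     "row": {"FirstName": "Old", "LastName": "Row"}},
]
script = undo_script(ops)
for line in script:
    print(line)
```

The output is ordinary T-SQL text, which matches the point above: nothing special happens beyond generating a script with the opposite effect.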
