Data Guard / network failure: deadline before global crash

dataguard oracle-11g-r2

In a Maximum Availability configuration with one primary and one physical standby database, how long can we expect the system to run without problems when the standby site becomes unreachable (for example, a network error or a disaster at site B)?

The log_archive_dest_state_n parameter must be set to DEFER. But does that simply mean everything will run fine until the +REDO area on the primary database's local storage fills up?

Is a simple estimate enough to calculate the "autonomy" of the primary database, or is there a way to guarantee it?

Best Answer

If the standby site becomes unreachable, the primary site will not delete archivelogs, because they are still needed by the standby site. If you try to back up and delete these archivelogs, RMAN reports RMAN-08137 and skips them; eventually your archive area will fill up and your database will stall.

Setting log_archive_dest_state_n to DEFER lifts this "restriction". If the standby database becomes unreachable, you can manually set log_archive_dest_state_n to DEFER. This disables that destination, and the primary site will be able to delete the archivelogs.
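As a minimal sketch of the manual step described above, the statements below check the destination, defer it, and later re-enable it. The destination number 2 is an assumption (use whichever log_archive_dest_n points at your standby), and SCOPE = BOTH assumes the instance runs from an spfile.

```sql
-- Assumption: destination 2 is the standby destination in this configuration.
-- Check the current state and last error reported for that destination.
SELECT dest_id, status, error
FROM   v$archive_dest
WHERE  dest_id = 2;

-- Stop shipping redo to the unreachable standby.
ALTER SYSTEM SET log_archive_dest_state_2 = DEFER SCOPE = BOTH;

-- Once the standby is reachable again, re-enable the destination.
ALTER SYSTEM SET log_archive_dest_state_2 = ENABLE SCOPE = BOTH;
```

Keep in mind that any archivelogs deleted on the primary while the destination is deferred will have to be restored from backup before the standby can catch up.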

Another way to work around this is to write your backup scripts so that they handle this situation. With the FORCE option on a DELETE operation, RMAN skips the above check and deletes the specified archivelogs regardless of the state of the standby database. For example, add this line to the end of your backup script:

delete force noprompt archivelog until time 'sysdate-1' backed up 1 times to device type sbt_tape;

The above command deletes all archivelogs older than 1 day that have been backed up at least once to your backup server.
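To get a rough idea of how long the primary can keep running before the archive area fills up, you can compare the recent archivelog generation rate with the free space in the disk group. This is only a sketch: it assumes the local archive destination is dest_id 1, that the archivelogs live in an ASM disk group named REDO (as the +REDO in the question suggests), and that the last 7 days are representative of the workload.

```sql
-- Assumption: dest_id 1 is the local archive destination.
-- Average archived redo volume per day over the last 7 days, in MB.
SELECT ROUND(SUM(blocks * block_size) / 1024 / 1024 / 7) AS avg_mb_per_day
FROM   v$archived_log
WHERE  dest_id = 1
AND    completion_time > SYSDATE - 7;

-- Assumption: the archivelogs are stored in an ASM disk group named REDO.
-- usable_file_mb takes the disk group's redundancy into account.
SELECT name, total_mb, free_mb, usable_file_mb
FROM   v$asm_diskgroup
WHERE  name = 'REDO';
```

Dividing usable_file_mb by avg_mb_per_day gives an estimate in days. It is an estimate, not a guarantee: a batch job or bulk load can generate redo far faster than the average.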
