Firebird database recovery and protection

Tags: backup, firebird, recovery

I'm sure many companies are running critical Firebird database servers in production 24/7. Other than regular backups, what backup/recovery/protection techniques are available? I am also interested in backup policies for high availability databases.

Best Answer

  • Incremental backups, using the nbackup utility in FB 2.x

Backup is very fast (it just dumps changed pages) and can use cascading levels (database -> large monthly snapshots -> daily deltas from the last monthly -> hourly deltas from the last daily).

However, it does not optimize the database (by recreating it), and it detects no database errors other than top-level page-allocation faults (like an "orphan page").

OTOH the snapshotted pages are copied mostly intact, so in the case of a partially corrupted database they might still contain manually salvageable data, provided the corruption is noticed quickly.

In a sense, this amounts to safe, incremental copying of the database file(s), with all the pros and cons of that approach.
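A cascading nbackup schedule along those lines might look like this sketch (paths and credentials are placeholders; depending on your setup, -U/-P may be replaced by the ISC_USER/ISC_PASSWORD environment variables):

```sh
# Level 0: full snapshot (e.g. monthly)
nbackup -B 0 /data/mydb.fdb /backup/mydb-monthly.nbk -U SYSDBA -P masterkey
# Level 1: pages changed since the last level-0 (e.g. daily)
nbackup -B 1 /data/mydb.fdb /backup/mydb-daily.nbk -U SYSDBA -P masterkey
# Level 2: pages changed since the last level-1 (e.g. hourly)
nbackup -B 2 /data/mydb.fdb /backup/mydb-hourly.nbk -U SYSDBA -P masterkey

# Restore merges the whole chain, oldest level first, into a new file
nbackup -R /data/restored.fdb /backup/mydb-monthly.nbk \
    /backup/mydb-daily.nbk /backup/mydb-hourly.nbk
```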

  • Usual backup via "gbak"

gbak reads data in a regular SNAPSHOT transaction, so database errors affect it. Some errors manifest as read errors (e.g. if a DBA changed a column type in a way incompatible with the stored data), but others may result in some data being "invisible" and silently skipped.

The backup file is stripped of fast-access metadata and gets a lot smaller, which is good for archives (example: 1.1 GB raw database -> 380 MB FBK -> ~700 MB after restore).

In FB 2.x, gbak is known to work considerably slower over a plain TCP/IP connection than via a Firebird Service Manager connection. This is said to be addressed in the forthcoming FB 3.

Restore basically recreates the database, so it is slow. But it optimizes the database's internal layout and saves some space (no more half-empty pages in the middle of the file).

Because Firebird is very liberal about online schema changes (made during active user operations; the safe approach was enforced in 2.0 but reverted in 2.1 after an uproar), a traditional backup might turn out to be unrestorable. An attempt to restore the FBK file onto a spare free disk is therefore a must: until you have proven you can restore a backup, you should consider that you don't have one.
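A minimal gbak cycle including that "prove you can restore it" step might look like this sketch (host name, paths, and credentials are placeholders):

```sh
# Backup via the Service Manager (-se), which in FB 2.x is faster
# than a plain TCP/IP connection; the FBK path is on the server side
gbak -b -se myhost:service_mgr /data/mydb.fdb /backup/mydb.fbk \
    -user SYSDBA -password masterkey

# Test-restore onto spare disk space; -c (create) refuses to
# overwrite an existing database file
gbak -c /backup/mydb.fbk /spare/mydb-test.fdb \
    -user SYSDBA -password masterkey
```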

  • A tangential idea is garbage collection. A database usually has "peak hours", for example heavy use during the day and almost none at night. Sometimes a DBA turns off automatic garbage collection and reserves a lot of free space, so during the day the database may grow almost uncontrollably. Then at night gbak is run not only to make a copy, but to finally enforce garbage collection.
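For the nightly window, garbage collection can be forced either by the gbak run itself (gbak collects garbage as it reads by default; its -g switch would skip that) or by an explicit sweep, as in this sketch (paths and credentials are placeholders):

```sh
# Nightly gbak: produces a copy and performs garbage collection
# as a side effect of reading every record
gbak -b /data/mydb.fdb /backup/mydb-nightly.fbk \
    -user SYSDBA -password masterkey

# Alternative: sweep old record versions without producing a backup
gfix -sweep /data/mydb.fdb -user SYSDBA -password masterkey
```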

  • There is also built-in redundancy: SHADOWS, implemented back in the Borland days. A shadow does not protect from malicious users, viruses, program or server bugs, etc. But if one server goes down due to an OS or hardware fault, the other is ready to take over its mission instantly.

OTOH, if the main and shadow servers are in different networks (offices, cities, countries) and the link between them disappears, then one network will see this as a crash of the main server and the other as a crash of the shadow server. When the link is repaired, the two databases will contain new, different, conflicting data entered by users.
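Shadows are declared with a plain SQL statement; a sketch via isql (database path, shadow path, and credentials are placeholders):

```sh
isql -user SYSDBA -password masterkey /data/mydb.fdb <<'EOF'
-- Maintain a page-level mirror of the database; AUTO lets the main
-- database keep working if the shadow file becomes unavailable
CREATE SHADOW 1 AUTO '/mirror/mydb.shd';

-- Detach it again when no longer needed:
-- DROP SHADOW 1;
EOF
```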

  • This moves us to yet another idea: database replication. IOW, the system in general might be designed around the idea that there is more than a single database/server, all sharing the same schema and periodically exchanging data changes. That is a whole different topic, but to some extent it can be seen as a partial substitute for backups or crash defense.

PS. Additionally, you have to learn the difference between the SuperServer (targeted at small installations) and the Classic/SuperClassic servers. For running 24/7 the latter options are preferable, since the per-connection server instances are shut down when their users disconnect. So while the "server" as a concept keeps running 24/7, its actual executable processes get closed and restarted, easing potential problems like memory leaks in the server or in UDFs. OTOH the Classic server is more vulnerable to cache-synchronization issues, e.g. in the case of a crash during garbage collection, or of attempts at metadata (schema) changes while users are working.

In FB 3.x they promise to integrate those two approaches, turning the choice into a kind of sliding-scale option in firebird.conf.
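If that promise holds, the choice is expected to collapse into a single configuration line rather than separate server binaries; a sketch of such a firebird.conf fragment:

```
# firebird.conf (FB 3): pick the process model with one setting;
# expected values are Super, SuperClassic, or Classic
ServerMode = Classic
```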