Postgresql – Postgres reindex on crash folklore

crashindexpostgresql

I'm struggling with a problem of folk memory re Postgres.

The folklore

I am told that a long time ago in a galaxy far, far away (PostgreSQL 8.4 under Ubuntu 12.04), PostgreSQL frequently suffered from an issue where if the machine holding the server failed (e.g. kernel crash), when the machine was restarted, Postgres would start but the indexes on the database might be corrupted. For instance

SELECT foo FROM bar WHERE baz = 123

would fail because the index on bar.baz would be corrupted. Reindexing the table fixed the issue but as there was no way to detect this issue, and this was in an appliance environment (no DBA on hand), in essence to guard against this it would be necessary to reindex all tables on a reboot, which would be prohibitively expensive. Thus (the reasoning goes) without a DBA on hand it would not be sensible to use Postgres particularly in an environment where tables were large and computing indices expensive.

Because of this folklore, MySQL is preferred in such circumstances.

My theory

  1. I dimly recall seeing this issue happening under 8.4, so I think there's some truth in it.

  2. I am assured this also has happened on Postgres 9.3 under Ubuntu 14.0.4. I don't know whether it happens 'frequently'. The system it happened on is running with (admittedly old) enterprise grade SAS drives which claim write caching is turned off, so I don't think this is a write caching problem.

  3. Were this still the case, I think I'd be able to find more evidence of it on the web than I can find.

  4. Thus I suspect it may be fixed, or may be a result of something bad in postgresql.conf (though inspection suggests only the autovacuum settings are changed from the defaults).

Further information

  1. In the system I know did this recently (using 9.3) no LVM is in use. The storage system is ext4 on a physical drive mounted with default attributes. This drive is an (admittedly old) enterprise SAS drive plus SAS1068 PCI-X Fusion-MPT SAS controller which sdparm claims it doesn't have a write cache switched on, but this has (allegedly) been seen on multiple drives with different vendors etc., though always on ext4 on Ubuntu.

  2. fsync is in the default state (on I believe).

  3. No master/slave configuration – just a single master – (so it's not the multixact bug).

  4. Tables are created with default indexing (no fancy hash indexes).

  5. Postgres regularly updated to whatever the relevant Ubuntu LTS release has. The latest problem was on 14.04.

Questions

  1. Is manual rebuilding of indices an expected task after a system crash?

  2. If not, has it ever been, and when was that changed?

  3. Is it possible to avoid the need for manual reindexing?

Best Answer

TL;DR: Except for a specific bug in PostgreSQL 9.3 prior to 9.3.4 which it's vital that you patch, index corruption after crash is never expected and hasn't been since PostgreSQL 7.1 was released in 2001.

The advice/folklore you got was wrong.


I am told that a long time ago in a galaxy far, far away (PostgreSQL 8.4 under Ubuntu 12.04), PostgreSQL frequently suffered from an issue where if the machine holding the server failed (e.g. kernel crash), when the machine was restarted, Postgres would start but the indexes on the database might be corrupted.

That way predates 8.4 unless you're using hash indexes, which are documented not to be crash safe and are not recommended.

WAL logging of indexes was introduced in the deep dark distant past of mid-2001. PostgreSQL has been crash safe for an exceedingly long time. That advice was wrong.

MySQL switched to crash-safe by default (by defaulting to InnoDB tables) in MySQL 5.5, in 2009. Prior to that its default storage engine, MyISAM, isn't crash-safe and will happily corrupt indexes and tables. So your folklore isn't only wrong, it's backwards.

If you had corruption after a crash under 8.4, then either:

  • you hit a bug (not impossible, but not likely either) in PostgreSQL, the kernel, the filesystem, etc;
  • You were using LVM before write-barrier support was fixed, and had barriers enabled on your file system;
  • your storage subsystem lied to the kernel about fsync(); or
  • you disabled crash safety by setting fsync=off.

There have been a couple of index-related bugs over time, but all the others I know of have been minor and/or hard to trigger, and few have been related to crash recovery.

The issue with 9.3 is different. That's a very specific bug that we're all rather embarrassed about. It occurred when a replica PostgreSQL server was promoted to a master after certain other operations had happened in a particular order. It's a really bad bug because it affects the table contents, not just indexes, so it's vital that you patch to 9.3.5 or higher. To learn more about it than you ever wanted to search for "multixact bug".

So:

Is manual rebuilding of indices an expected task after a system crash?

Not unless you are using hash indexes, which the manual specifically wants you about.

If not, has it ever been, and when was that changed?

PostgreSQL 7.1, back in 2001.

If you read the release notes you'll find the occasional bug and bug fix, but other than the aforementioned multixact bugs in 9.3, nothing too serious.

Is it possible to avoid the need for manual reindexing?

Do nothing. Congratulations, you don't need manual reindexing.

Simply:

  • Leave fsync set to on, the default
  • Leave full_page_writes set to on, the default
  • Don't kill -9 the postmaster but no other processes then delete postmaster.pid and start PostgreSQL again. The latest releases even detect it if you do this and try to prevent damage.
  • Don't explicitly specify hash indexes when creating an index. PostgreSQL will never use a hash index unless you ask it to.

Bonus points if you:

  • Keep PostgreSQL reasonably up to date, i.e. apply point release patches
  • Use decent quality SSDs if you're going to use SSDs, otherwise any database will be prone to corruption. Make sure they have a supercapacitor or other power-loss protection.