PostgreSQL – Stripping OIDs from tables in preparation for pg_upgrade

postgresql

I have a Postgres database in RDS, approaching 1 TB in size. We started in 2005 with Ruby/ActiveRecord/Rails and have upgraded along the way to PG 9.6.

Our Rails migrations created tables with plain CREATE TABLE, never specifying whether to include OIDs, and we have never used them. So some of our oldest (and largest) tables have OIDs. At some point Postgres stopped creating tables with OIDs by default, so tables created more recently don't have this issue.

We're looking to upgrade from 9.6 to 12, ideally using pg_upgrade. This fails, complaining about tables having OIDs.
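To see which tables are affected before attempting the upgrade, a catalog query along these lines should work on 9.6 (pg_class.relhasoids exists through PG 11 and was removed in 12; schema names here are only the usual system schemas to exclude):

```sql
-- List user tables that still carry OIDs (PostgreSQL <= 11 only,
-- where pg_class.relhasoids still exists).
SELECT n.nspname AS schema, c.relname AS table_name
  FROM pg_catalog.pg_class c
  JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r'           -- ordinary tables only
   AND c.relhasoids
   AND n.nspname NOT IN ('pg_catalog', 'information_schema');
```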

We can run ALTER TABLE :table_name: SET WITHOUT OIDS, but for our larger tables that takes several hours while holding an exclusive lock on the table, effectively taking the database down. We would prefer to avoid that downtime if possible.

Is it expected that an ALTER TABLE :table_name: SET WITHOUT OIDS should rewrite all the rows of a table?

Is there a way to avoid this rewriting? (Ordinary columns can be dropped without rewriting all the rows, for example.)

On a toy database, I tried mucking with the catalog tables directly:

UPDATE pg_catalog.pg_class
   SET relhasoids = false
 WHERE oid = (
   SELECT c.oid
     FROM pg_catalog.pg_class c
     JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'public' AND c.relname = 'name_of_table'
);

It executed quickly and a cursory examination of the data didn't show any corruption.

But, this doesn't seem to be a documented approach.

Is this a terrible idea? Are there other approaches that don't require rewriting whole tables?

Best Answer

That UPDATE might work, but I am not certain, and I wouldn't want to do that with data I value. You don't have the option anyway, since you cannot get superuser access in a hosted database.

I can think of a safe, but more painful method:

  1. Briefly suspend data modification activity on the table.

  2. Create a trigger that records all data modification activity in another table.

  3. Create a new table that looks like the old one (but without OIDs), and start an INSERT INTO ... SELECT ... that copies the data.

Now normal operation can be resumed.

Once the copy is done, replay the recorded changes. Then:

  1. Start a transaction.

  2. LOCK the original table.

  3. Replay all the changes that happened since the last replay (should be few).

  4. DROP the original table.

  5. Rename the copy to the original table name.

Foreign keys will require extra attention.
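A rough sketch of the above, with hypothetical names (`big_table` is the table to convert; the replay logic itself is omitted and depends on the table's primary key):

```sql
-- 1. Change-log table and trigger to capture modifications.
CREATE TABLE big_table_changes (
    change_id bigserial PRIMARY KEY,
    op        text  NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
    row_data  jsonb NOT NULL
);

CREATE FUNCTION log_big_table_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO big_table_changes (op, row_data)
        VALUES (TG_OP, to_jsonb(OLD));
        RETURN OLD;
    ELSE
        INSERT INTO big_table_changes (op, row_data)
        VALUES (TG_OP, to_jsonb(NEW));
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_table_log
AFTER INSERT OR UPDATE OR DELETE ON big_table
FOR EACH ROW EXECUTE PROCEDURE log_big_table_change();

-- 2. Copy into a new table created without OIDs.
CREATE TABLE big_table_new (LIKE big_table INCLUDING ALL) WITHOUT OIDS;
INSERT INTO big_table_new SELECT * FROM big_table;

-- 3. Final cutover, after the bulk of big_table_changes has been replayed.
BEGIN;
LOCK TABLE big_table IN ACCESS EXCLUSIVE MODE;
-- replay the few remaining rows from big_table_changes here
DROP TABLE big_table;
ALTER TABLE big_table_new RENAME TO big_table;
COMMIT;
```

Replaying from the jsonb log needs the table's primary key to match rows (jsonb_populate_record can help turn the logged rows back into table rows). Any foreign keys referencing the old table will have to be dropped and re-created against the copy, and the INCLUDING ALL copies indexes too, which slows the bulk INSERT; creating indexes after the copy may be faster.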