According to your comment on a deleted post, you load all rows into a Java module and search there. But searching is better done in the database itself - that's what a database is good at. Only return the rows you actually need.
If you really need all rows, there are many small things that can make this faster. 1M rows will never be very fast, though.
Postgres 9.2 or later
You can make the index covering by appending fcv_id:
CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante, fcv_id);
This way, provided the table isn't updated too much, Postgres can retrieve results with an index-only scan.
The additional column comes last since it does not contribute to the sort order.
In Postgres 11 or later you can use the INCLUDE clause for this instead:
CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante) INCLUDE (fcv_id);
CLUSTER / pg_repack
I see you already found CLUSTER. You are aware that this is a one-time operation that should help your cause, but needs to be re-run after enough updates?
There is also the community tool pg_repack as a replacement for VACUUM FULL / CLUSTER.
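As a sketch, reusing the index name from the example above (the database name is a placeholder):

```sql
-- One-time physical rewrite of the table in index order.
-- Note: CLUSTER takes an ACCESS EXCLUSIVE lock for the duration.
CLUSTER factura_venta USING factura_venta_orden;
ANALYZE factura_venta;

-- Alternative without the long exclusive lock, using pg_repack's
-- --order-by option (run from the shell, not psql):
--   pg_repack -d mydb -t factura_venta -o fcv_fecha_comprobante,fcv_numero_comprobante
```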
work_mem
This line in your EXPLAIN output:
Sort Method: external merge Disk: 2928kB
tells us that sorting is not done in RAM, which is expensive. You could probably improve performance by tuning the setting work_mem accordingly:
work_mem (integer)
Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. ...
Setting this too high may have adverse effects. Read the manual carefully. Consider increasing the setting only for the transaction with the big query:
BEGIN;
SET LOCAL work_mem = '50MB';
SELECT ...;
COMMIT;
50 MB is an estimate based on your EXPLAIN ANALYZE output for 73k rows. Test with 1M rows to determine the actual amount you need.
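To verify that the bigger work_mem actually moves the sort into RAM, run the query under EXPLAIN ANALYZE in the same transaction and check the Sort Method line - it should report an in-memory sort (quicksort) instead of an external merge:

```sql
BEGIN;
SET LOCAL work_mem = '50MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;   -- your original query here
ROLLBACK;     -- EXPLAIN ANALYZE executed the query; nothing to keep
```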
The idle in transaction state shown in the pg_stat_activity output generally means that something started a transaction with BEGIN, did some work, and then never issued the COMMIT to finish the transaction. This leaves the ACCESS SHARE locks acquired by the SELECT statements hanging around, which blocks the ALTER TABLE statement you're trying to run from acquiring the ACCESS EXCLUSIVE lock it needs to change the column's type (a change that rewrites every row in the table).
They could only be considered "ghost queries" in the sense that they are hanging around waiting for a COMMIT or ROLLBACK that will probably never come. But they are problematic nonetheless, as you figured out.
There might be a bug somewhere in your application that has left all these connections open and waiting for a commit. I'd definitely check that out, and see if perhaps a code block doesn't have a terminating COMMIT.
More commonly (in my experience, YMMV of course), people start database sessions with a BEGIN, do some work, and then go to lunch without ever closing their session.
It's best to fix the problem at the source, but if you absolutely, positively need to do something right now, pg_cancel_backend and pg_terminate_backend are not bad ways to go.
Commenting on your answer, I'd specifically select for the state idle in transaction rather than LIKE '%idle%', because those are the ones causing your problems. Backends in the plain idle state are just waiting around patiently for new client commands and are not inside an open transaction.
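A minimal sketch of that, assuming Postgres 9.2 or later (where pg_stat_activity has the state column and the process ID column is named pid):

```sql
-- List sessions stuck in an open transaction, oldest first
SELECT pid, usename, xact_start, state_change, query
FROM   pg_stat_activity
WHERE  state = 'idle in transaction'
ORDER  BY xact_start;

-- Last resort: terminate one of them (sends SIGTERM)
SELECT pg_terminate_backend(12345);  -- replace 12345 with a pid from above
```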
You can look at the Explicit Locking documentation for more information about what blocks what. The description of idle in transaction is in the Standard Unix Tools section of the Monitoring chapter, for reference.
pg_cancel_backend and pg_terminate_backend issue SIGINT and SIGTERM signals, respectively, and the effects of those signals (and warnings about other signals) are documented in the Server Shutdown section.
Hope that helps explain what you saw and sheds some light on your mystery. =)
Best Answer
How to do it efficiently and how to do it without blocking everyone else for the duration are diametrically opposed goals. The efficient way is to schedule a downtime window in which to do it.
If that can't be done, then:
1. Add a new column of the desired type, with a NULL default value.
2. Change all the code that modifies the old column or inserts new rows so that it correctly populates the new column as well (but make sure it doesn't depend on the new value already being correct - it just makes it correct).
3. Create an index to efficiently find rows where the new column is still NULL.
4. Write a script that updates rows with a NULL value in the new column, setting it based on the value of the old column. Do this in chunks as large as feasible, given that the first row in the chunk will be locked for the duration of the processing of the chunk.
5. Once all chunks are done and verified, change all the code to rely on the new column rather than the old column.
6. Change all the code to stop trying to maintain the old column.
7. Drop the old column.
8. Drop the index that supports finding still-NULL values of the new column (unless it is still useful).
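The steps above might look like this, assuming for illustration a table t with primary key id and an integer column old_col being migrated to a bigint new_col (these names are placeholders, not from your schema):

```sql
-- Step 1: adding a nullable column is instant - no table rewrite
ALTER TABLE t ADD COLUMN new_col bigint;

-- Step 3: partial index to find rows not yet migrated
CREATE INDEX t_new_col_null ON t (id) WHERE new_col IS NULL;

-- Step 4: backfill in chunks; repeat until it reports UPDATE 0.
-- Step 2 (application code populating new_col on write) is assumed done.
UPDATE t
SET    new_col = old_col
WHERE  id IN (
   SELECT id
   FROM   t
   WHERE  new_col IS NULL
   LIMIT  10000
);
```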