Postgresql – Query hanging in ClientRead and blocking all others

lockingpostgresqlsequelize

This is related to Debugging a hanging session/lock

Every so often (~1-2 times a month) we have our database lock up because there is one query that doesnt finish, and that has requested locks that block all others. This is on a table that is core to our business, so it results in downtime.

The query in question is one of a couple of candidates, all of them are just simple UPDATEs running on just 1 row. The best lead I have so far (I am open to other avenues) is from RDS performance insights I see that while all other queries are locked on "tuple", there is always 1 query that is locked on "ClientRead". From my research, this appears to be Postgres waiting on the client lib to send it the bound parameters or similar. So my theory is that there is one query that randomly sometimes ends up in this state, and this blocks all others after it. Is this a reasonable evaluation?

If this is the case, how can I go into investigating and resolving it? We use sequelize 5.9.4 (node-js). We did upgrade it recently, ~1-2 months back, so I am wondering if there might be a bug in this version which might cause a error on the sequelize side when processing a query, is that possible? It could be a bug in node-postgres as well, this came up in my searching https://github.com/brianc/node-postgres/issues/1952.

I will try to update with a screenshot of RDS performance insights next time it happens, its retention was shorter than anticipated.

Best Answer

You have a problem with transaction management.

A session that waits for ClientRead is done processing the last query and waits for the client to send the next request. The only way that such a session can block anything is if its state is idle in transaction. All locks are held until the transaction ends, and no locks are held once the transaction finishes.

If sessions remain in state idle in transaction for a longer time, that is always an application bug where the application forgot to end the transaction. This is bad, not only for the reason you observe, but also because it keeps autovacuum from doing its essential job.

There are two things you can do:

  1. Fix the application so that it properly closes transactions. This is the only good solution.

  2. Prevent the worst from happening by setting idle_in_transaction_session_timeout so that these transactions are automatically rolled back by PostgreSQL after a while. This will keep locks from being held indefinitly, but the buggy application will receive an error.