Postgresql – Query hanging in ClientRead and blocking all others

lockingpostgresqlsequelize

This is related to Debugging a hanging session/lock

Every so often (~1-2 times a month) we have our database lock up because there is one query that doesnt finish, and that has requested locks that block all others. This is on a table that is core to our business, so it results in downtime.

The query in question is one of a couple of candidates, all of them are just simple UPDATEs running on just 1 row. The best lead I have so far (I am open to other avenues) is from RDS performance insights I see that while all other queries are locked on "tuple", there is always 1 query that is locked on "ClientRead". From my research, this appears to be Postgres waiting on the client lib to send it the bound parameters or similar. So my theory is that there is one query that randomly sometimes ends up in this state, and this blocks all others after it. Is this a reasonable evaluation?

If this is the case, how can I go into investigating and resolving it? We use sequelize 5.9.4 (node-js). We did upgrade it recently, ~1-2 months back, so I am wondering if there might be a bug in this version which might cause a error on the sequelize side when processing a query, is that possible? It could be a bug in node-postgres as well, this came up in my searching https://github.com/brianc/node-postgres/issues/1952.

I will try to update with a screenshot of RDS performance insights next time it happens, its retention was shorter than anticipated.

Best Answer

You have a problem with transaction management.

A session that waits for ClientRead is done processing the last query and waits for the client to send the next request. The only way that such a session can block anything is if its state is idle in transaction. All locks are held until the transaction ends, and no locks are held once the transaction finishes.

If sessions remain in state idle in transaction for a longer time, that is always an application bug where the application forgot to end the transaction. This is bad, not only for the reason you observe, but also because it keeps autovacuum from doing its essential job.

There are two things you can do:

Fix the application so that it properly closes transactions. This is the only good solution.
Prevent the worst from happening by setting idle_in_transaction_session_timeout so that these transactions are automatically rolled back by PostgreSQL after a while. This will keep locks from being held indefinitly, but the buggy application will receive an error.

Related Solutions

MySql Lock Tables Blocking Logic

Do not use LOCK TABLES with InnoDB unless you have a very special need.

Instead, learn about "transactions" and use BEGIN...COMMIT to effect most "locking" needed. See also SELECT ... FOR UPDATE....

Why Fast PostgreSQL Queries Appear in Slow Query Logs

What you see in the logs is a prepared statement execution. Using prepared statements is the usual way to interact with the database from an application layer.

For example, using Hibernate, one could write something like this (hopefully, the code snippet below is valid, did not test it):

String sql = "SELECT first_name, last_name FROM customer WHERE email = :customer_email";
...
query.setParameter("customer_email", Customer.email);

Among others, this is a good way to avoid SQL injection - unlike building a full query text, including the parameter, using concatenation.

One can do this from a PostgreSQL client, too:

PREPARE fetch_customer (text) AS
    SELECT first_name, last_name FROM customer WHERE email = $1;

[EXPLAIN ANALYZE] EXECUTE fetch_customer ('john@gmail.com');

On the other hand, and this is what you experience, a prepared query can result in inferior performance. A prepared statement cannot make an advantage from knowing what the values passed to it will be - simply because they are unknown at the preparation time. Thus, it has to choose a query plan that is generic enough to get any possible result in an acceptable time.

For example, imagine you had a partial index like

CREATE INDEX ON xxxx.campaign (campaign_id) WHERE campaign.id > 4000000000000;

(which does not make much sense in real life, but never mind).

When the planner knows it has 4000013578869 as campaign_id, it can opt for using this index, possibly significantly reducing the number of data pages to fetch. However, a prepared statement cannot do this, as some (most?) of the possible campaigns would be excluded.

Best Answer

Related Solutions

MySql Lock Tables Blocking Logic

Why Fast PostgreSQL Queries Appear in Slow Query Logs

Related Question