PostgreSQL – Slow UPDATE Queries with GIN Index

gin-index, index, performance, postgresql, postgresql-performance

The Setup

I am running PostgreSQL 9.4.15 on an SSD-based, quad-core Virtual Private Server (VPS) with Debian Linux 8. The relevant table has approximately 2 million records.

Records are inserted frequently and updated even more frequently (constantly, every few seconds at least). As far as I can tell, I have all the appropriate indexes in place for these operations to execute quickly, and the vast majority of the time they do execute almost instantly (in milliseconds).

The Problem

Every hour or so, however, one of the UPDATE queries takes an excessive amount of time: 10 seconds or more. When this happens, it usually affects a whole "batch" of queries that appear to be blocked and then all finish at roughly the same time. It's as if one of the queries, or some background operation (e.g., a vacuum), is blocking them all.

Schema

The table, items, has many columns, but I think the following are the only ones possibly relevant to the problem:

id INTEGER NOT NULL (primary key)
search_vector TSVECTOR
last_checkup_at TIMESTAMP WITHOUT TIME ZONE

And these are the relevant indexes:

items_pkey PRIMARY KEY, btree (id)
items_search_vector_idx gin (search_vector)
items_last_checkup_at_idx btree (last_checkup_at)

Likely Culprits

Finally, after rigging together a little script to dump the contents of pg_stat_activity (the list of all active Postgres connections/queries) whenever a "connection leak" warning is emitted to my log-file, I've narrowed down the possible culprit queries/columns (assuming the problem isn't external, like with a misbehaving VPS). These are, roughly, the kinds of queries that seem to appear again and again:

UPDATE items SET last_checkup_at = $1 WHERE items.id = 123245
UPDATE items SET search_vector = [..] WHERE items.id = 78901

Those are slightly paraphrased, but I truly doubt anything relevant is missing. Occasionally other queries (on other tables) appear as well, but those usually look like they were just "unlucky" to get caught in the mix.
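For reference, a snapshot of pg_stat_activity like the one described above can be taken with a query along these lines. This is only a sketch, not the actual script used here; the column selection is an assumption about what is useful, and the boolean waiting column exists in 9.4 but was replaced by wait_event columns in 9.6.

```sql
-- List every non-idle backend, oldest query first.
-- "waiting" is true when the backend is blocked on a heavyweight lock (9.4 semantics).
SELECT pid,
       state,
       waiting,
       now() - query_start AS running_for,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND pid <> pg_backend_pid()   -- skip the session running this query
ORDER BY query_start;
```

Running this at the moment a stall is detected shows which statements have been active longest and whether they are waiting on a lock.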

Now, even though the first query (setting last_checkup_at) tends to appear most of the time, the query that sets search_vector seems to appear every time. (In addition, the first query is probably issued far more often in general, making it more likely to be there just by happenstance.)

(I think I'm winnowing in on a solution here, but even if I have it in the bag I wanted to document the incident here for others… I've been mystified by this problem for months, before getting a chance to deep-dive.)

Best Answer

The problem seems to have been Postgres's "FASTUPDATE" mechanism.

FASTUPDATE is a storage parameter available on GIN indexes which, when enabled, causes changes to the index (from both INSERTs and UPDATEs) to be queued up in a "pending list" instead of being merged into the main index structure immediately. Once this list grows too large (in 9.4, larger than work_mem), or when the table is vacuumed, the pending entries are integrated into the GIN index in bulk.

The aim of FASTUPDATE is (no surprise) to speed up index updates, but it unfortunately leads to an occasional UPDATE query being exceptionally slow: the statement that pushes the pending list over its limit has to perform the whole cleanup itself. In my case, I found it preferable to take the hit up-front (mainly to avoid warnings of a "slow query" in my logs).
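To see whether the pending list is actually filling up before a stall, the contrib module pgstattuple provides pgstatginindex (available since 9.3). The sketch below assumes the extension can be installed on the server; the index name is the one from the question.

```sql
-- Assumption: the pgstattuple contrib package is installed on this server.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- pending_pages and pending_tuples show how much work is queued in the pending list.
SELECT * FROM pgstatginindex('items_search_vector_idx');

-- On 9.4 the pending list is flushed once it grows larger than work_mem
-- (9.5 and later use gin_pending_list_limit instead).
SHOW work_mem;
```

If pending_pages climbs steadily and drops to zero around the time of each slow batch, that would be consistent with the pending-list cleanup being the cause.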

FASTUPDATE is apparently enabled by default and available since PostgreSQL 8.4. I was able to disable it like this:

ALTER INDEX items_search_vector_idx SET (FASTUPDATE=OFF);
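Turning the option off only stops new entries from going into the pending list; entries that are already queued stay there until the next vacuum. A minimal follow-up, assuming the table and index names from the question, might look like this:

```sql
-- Flush whatever is still sitting in the pending list.
VACUUM items;

-- Confirm that the storage parameter is recorded on the index.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'items_search_vector_idx';
```

The second query should list fastupdate=off among the index's reloptions.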

At the time of writing, I have been running this way for almost a week with almost no slow queries. (Aside from one query that I expect to take a long time, I've noticed little else.)

You may also find more pertinent information in a related thread from the Postgres mailing list. Interestingly, one of the Postgres devs (Tom Lane) suggests processing of the FASTUPDATE pending items "was not supposed to block concurrent insertions", but I'm not sure if that's correct; in my case I would see several queries get "backed up" and then finish all at once.

ADJUSTMENT #1 : Bigger InnoDB Redo Logs

Since I do not see innodb_log_file_size, I assume you have the default of 5M. Since your innodb_buffer_pool_size is 4G, you need 1G redo logs.
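Rather than assuming the default, the current values can be read from a running server. This is a small sketch using standard MySQL statements; both values are reported in bytes.

```sql
-- Current redo log file size (per file), in bytes.
SHOW VARIABLES LIKE 'innodb_log_file_size';

-- Current buffer pool size, in bytes, for comparison.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```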

ADJUSTMENT #2 : Have InnoDB Use All CPUs

Out of the box, InnoDB does not use all CPUs. I wrote a post long ago about how InnoDB LEFT UNCONFIGURED may work faster in older versions. I have also written posts about multicore engagement for InnoDB.

With these things said, here are the adjustments to make. First, back up the current configuration:

cp /etc/my.cnf /etc/my.cnf_old

Add these settings to /etc/my.cnf:

[mysqld]
innodb_log_file_size = 1G
innodb_io_capacity = 20000
# 64 is the maximum value accepted for the I/O thread settings
innodb_read_io_threads = 64
innodb_write_io_threads = 64

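Next, run these steps to stop MySQL, move the old redo logs aside so they are recreated at the new size, and start it again: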

service mysql stop
mv /var/log/mysql/ib_logfile0 /var/log/mysql/ib_logfile0_old 
mv /var/log/mysql/ib_logfile1 /var/log/mysql/ib_logfile1_old
service mysql start

Now, all cores will assist InnoDB, and there is much more room for transaction isolation.

Give it a Try !!!

PostgreSQL – Slow queries on billions-rows-table // index used

This may reflect my MS SQL bias, but I'd try clustering the table by timestamp. If you're frequently pulling data for a specific time span, this will help because the data will be physically stored contiguously. The system can seek to the start point, scan to the end of the range, and be done. If you're querying for a specific hour, that's just 3,600,000 records.
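In PostgreSQL, clustering is done with the CLUSTER command. The sketch below uses hypothetical names (a table readings with a timestamp column ts), since the question's schema is not shown. Unlike a SQL Server clustered index, CLUSTER is a one-time physical reordering: rows inserted later are not kept in order, so it has to be repeated periodically. It also takes an exclusive lock on the table while it runs, which matters a great deal at this table size.

```sql
-- Hypothetical names: table "readings", timestamp column "ts".
CREATE INDEX readings_ts_idx ON readings (ts);

-- Rewrite the table in index order (locks the table for the duration).
CLUSTER readings USING readings_ts_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE readings;
```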

If your query (which is...?) is for a specific machine, Postgres will then need to filter out 99.9% of those 3.6 million records. If this one-in-a-thousand filter is more selective than a typical date range filter, you should use the more selective mac field as the first component of your index. It may still be worth clustering.
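A composite index with the more selective column first could be sketched as follows. The table name, column names, and literal values are all hypothetical, since the question does not show them.

```sql
-- Hypothetical names: table "readings", columns "mac" and "ts".
CREATE INDEX readings_mac_ts_idx ON readings (mac, ts);

-- A query for one machine over one hour can then use the index
-- for both the equality on mac and the range on ts.
EXPLAIN
SELECT *
FROM readings
WHERE mac = '00:11:22:33:44:55'
  AND ts >= '2018-01-01 10:00'
  AND ts <  '2018-01-01 11:00';
```

With the equality column leading, the matching index entries for one machine and time range are adjacent, so the scan touches only the roughly 3,600 rows of interest instead of the full 3.6 million for that hour.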

If that still doesn't do it, I'd partition by the same field you're indexing, either timestamp or mac.

You didn't give the data types. Are they appropriate to the data? Storing dates as text will needlessly bloat your table, for example.