Did Adding JSONB Indexes Bloat the PostgreSQL Database?

Tags: index, json, postgresql

We are running:

user@host:~$ psql -d database -c "SELECT version();"                                                                   
version
---------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 (Ubuntu 10.7-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
(1 row)

on:

user@host:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:    16.04
Codename:   xenial

and have the following setup:

database=# \d+ schema.table
                                                                  Table "schema.table"
           Column            |            Type             | Collation | Nullable |                     Default                     | Storage  | Stats target | Description
-----------------------------+-----------------------------+-----------+----------+-------------------------------------------------+----------+--------------+-------------
 column_1                    | bigint                      |           | not null | nextval('table_id_seq'::regclass)               | plain    |              |
 column_2                    | character varying           |           | not null |                                                 | extended |              |
 column_3                    | character varying           |           | not null |                                                 | extended |              |
 column_4                    | character varying           |           | not null |                                                 | extended |              |
 column_5                    | timestamp without time zone |           | not null |                                                 | plain    |              |
 column_6                    | timestamp without time zone |           |          |                                                 | plain    |              |
 column_7                    | character varying           |           | not null |                                                 | extended |              |
 column_8                    | jsonb                       |           | not null |                                                 | extended |              |
 column_9                    | jsonb                       |           |          |                                                 | extended |              |
 column_10                   | character varying           |           | not null |                                                 | extended |              |
 column_11                   | character varying           |           | not null |                                                 | extended |              |
 column_12                   | character varying           |           |          |                                                 | extended |              |
 column_13                   | character varying           |           |          |                                                 | extended |              |
 column_14                   | timestamp with time zone    |           | not null |                                                 | plain    |              |
 column_15                   | timestamp with time zone    |           | not null |                                                 | plain    |              |
Indexes:
    "table_pkey" PRIMARY KEY, btree ( column_1 )
    "table_idx_1" btree ( column_11)
    "table_idx_2" btree ( column_4, column_2, column_7, column_5, column_6 )
    "table_idx_3" btree ( column_7, column_11, column_15 )
    "table_idx_4" btree ( column_7, column_11, column_14 )
    "table_idx_5" btree ( column_7, column_11, column_5 )
    "table_idx_6" btree ( column_7, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER ), column_5 )
    "table_idx_7" btree ( column_15 )
    "table_idx_8" btree ( column_4, column_2, column_7, column_5, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER ) )
    "table_idx_9" btree ( column_4, column_2, column_7, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER) )
    "table_idx_a" btree ( column_7, column_4, column_2, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER), ( ( column_8 ->> 'string_2'::TEXT )::INTEGER ) ) WHERE column_7::TEXT = 'string_3'::TEXT
Check constraints:
    "table_check_constraints" CHECK ( lower( column_10::TEXT ) <> 'string_4'::TEXT OR column_9 IS NOT NULL AND column_6 IS NOT NULL )

Autovacuum is on and configured with:

user@host:~$ psql -d database -c "SELECT name, setting, pending_restart FROM pg_settings WHERE NAME ILIKE '%autovacuum%' ORDER BY name;"
                name                 |  setting              | pending_restart
-------------------------------------+-----------------------+-----------------
 autovacuum                          | on                    | f
 autovacuum_analyze_scale_factor     | 0.002                 | f
 autovacuum_analyze_threshold        | 10                    | f
 autovacuum_freeze_max_age           | 200000000             | f
 autovacuum_max_workers              | 5                     | f
 autovacuum_multixact_freeze_max_age | 400000000             | f
 autovacuum_naptime                  | 30                    | f
 autovacuum_vacuum_cost_delay        | 10                    | f
 autovacuum_vacuum_cost_limit        | 1000                  | f
 autovacuum_vacuum_scale_factor      | 0.001                 | f
 autovacuum_vacuum_threshold         | 25                    | f
 autovacuum_work_mem                 | -1                    | f
 log_autovacuum_min_duration         | 0 (env 1) /-1 (env 2) | f
(13 rows)

The following sequence of events took place in environment 1, during which autovacuum was on and configured as above:

  1. Nightly VACUUM (VERBOSE, ANALYZE) of the database is added.
  2. Some time passes during which bloat stays at its normal operational level.
  3. The nightly VACUUM (VERBOSE, ANALYZE) of the database is removed.
  4. Index table_idx_8, which includes an expression on a JSONB column, is added.
  5. Index table_idx_9, which includes an expression on a JSONB column, is added.
  6. A bloat growth spurt starts and continues for 2 days until it peaks.
  7. VACUUM (VERBOSE, FULL) of the table is run.
  8. Bloat returns to normal operational levels and stays there.

The database size (GB) looked like this in environment 1 during this sequence of events:

[image: environment 1 DB size (GB)]

And this is what bloat (GB) looked like in environment 1:

[image: environment 1 DB bloat (GB)]

The number of live rows in environment 1:

[image: environment 1 number of live rows]

The number of dead rows in environment 1:

[image: environment 1 number of dead rows]

The following sequence of events took place in environment 2, during all of which autovacuum was on and configured as above:

  1. Nightly VACUUM (VERBOSE, ANALYZE) of the database is added.
  2. Some time passes during which bloat stays at its normal operational level.
  3. The nightly VACUUM (VERBOSE, ANALYZE) of the database is removed.
  4. Index table_idx_8, which includes an expression on a JSONB column, is added.
  5. Index table_idx_9, which includes an expression on a JSONB column, is added.
  6. A bloat growth spurt starts and continues for 2 days until it peaks and brings the DB down (disk full).
  7. TRUNCATE TABLE schema.table is run.
  8. The schema.table table is repopulated.
  9. Bloat does not stabilise and grows until it peaks again.
  10. TRUNCATE TABLE schema.table is run before the disk fills up again.
  11. VACUUM (VERBOSE, FULL) of the database is run.
  12. The schema.table table is repopulated.
  13. Bloat continues to grow!

The database size (GB) in environment 2 looked like this during this sequence of events:

[image: environment 2 DB size (GB)]

And this is what bloat (GB) looked like in environment 2:

[image: environment 2 DB bloat (GB)]

The only difference between the two environments is that they are specced slightly differently (environment 2 being less powerful). During these sequences of events, write/read volumes were unchanged in each environment. We are using this query to measure the bloat in bytes.
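
For a more direct (though heavier) measurement on a single table, something like the pgstattuple contrib extension can also be used; this is only an illustrative sketch, not the linked estimation query:

-- Requires the pgstattuple contrib extension; table and index names are the anonymised placeholders.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- dead_tuple_len and free_space report dead and reusable space in the heap, in bytes.
SELECT * FROM pgstattuple('schema.table');

-- pgstatindex reports leaf density and fragmentation for a btree index.
SELECT * FROM pgstatindex('schema.table_idx_8');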

I have cross-checked the PostgreSQL logs, monitoring logs and commit logs (Git) and identified the addition of the two indexes as the trigger for the bloat, but:

  1. Is that right? Can adding an index trigger such a bloat growth spurt?
  2. Why did adding the indexes trigger the bloat, if it did?
  3. Why did environment 1 stabilise and environment 2 not?
  4. How can we stabilise environment 2?

Any help with answering these questions would be greatly appreciated, and needless to say, I'm happy to provide any further information I may have missed that could be helpful.

Best Answer

I don't think the archeological approach will be very useful here. There is just too much missing information and there are too many confounding variables. For example, people usually don't just add indexes for no reason. If a change in workload motivated the index creation, it could be that change in workload, independent of the indexes, that is causing the bloat.

There are a lot of theories that could explain what you see, but really no way to distinguish between them based on the history given. Each index gives vacuum more work to do, so your new indexes might have pushed it over the tipping point simply because it was already close, regardless of what the indexes contain. Or maybe a lot of work piled up while the table was locked for index creation, and the frenzy of activity once the lock was released pushed it over the edge.

It isn't just more indexes that create more work for vacuum; bloat does as well. This can lead to a vicious cycle where more bloat slows down vacuum, leading to yet more bloat. That could be why environment 1 stabilized after the VACUUM FULL: it broke the vicious cycle to the point where regular vacuums can now keep up. If environment 2 is less powerful, its tipping point might be lower, and the VACUUM FULL there might still leave it beyond that point.
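
One way to check whether regular vacuums are keeping up on that table is to watch its cumulative statistics over time; a minimal sketch, using the anonymised schema and table names from your output:

-- Dead-tuple count and vacuum history for the problem table.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum,
       vacuum_count, autovacuum_count
FROM   pg_stat_user_tables
WHERE  schemaname = 'schema' AND relname = 'table';

If n_dead_tup keeps climbing between autovacuum runs, autovacuum is not keeping up.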

 autovacuum_analyze_scale_factor     | 0.002                 | f
 autovacuum_analyze_threshold        | 10                    | f
 autovacuum_vacuum_scale_factor      | 0.001                 | f
 autovacuum_vacuum_threshold         | 25                    | f

These settings seem rather ridiculous at first glance. Is there a rationale for them? Autovacuum could be spending so much time vacuuming and analyzing tables that don't really need it that it can't keep up with the table that does need vacuuming (although if you have only one large table, that might not be much of a concern). Lowering the scale factors can make sense, but generally only in conjunction with an increase in the thresholds.
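
If the aggressive settings are really only needed for one large, busy table, a gentler option is to keep moderate global values and override them per table; the values below are purely illustrative, not a recommendation:

-- Per-table autovacuum storage parameters (illustrative values only).
ALTER TABLE schema.table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_vacuum_threshold     = 1000,
    autovacuum_analyze_scale_factor = 0.02,
    autovacuum_analyze_threshold    = 1000
);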

I routinely set vacuum_cost_page_hit and vacuum_cost_page_miss to zero. In my experience, the concurrent performance problems caused by autovacuum generally come from the writing it does, not the reading, so there is no point in throttling on the read side. This can be especially important when you have tables and indexes which are already bloated, since you can then have much more reading than writing going on.
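
A minimal sketch of that change (assumes superuser access; it only needs a configuration reload, not a restart, and the same values could instead go in postgresql.conf):

-- Stop charging vacuum cost for page reads, so (auto)vacuum is throttled only on writes.
ALTER SYSTEM SET vacuum_cost_page_hit = 0;
ALTER SYSTEM SET vacuum_cost_page_miss = 0;
SELECT pg_reload_conf();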

The output from setting log_autovacuum_min_duration = 0 could help distinguish between the various theories. Also, using pg_freespacemap to see how much free space (sum(avail)) PostgreSQL thinks the table has while it is in its bloated state can be informative.
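
A sketch of both checks (pg_freespacemap is a contrib extension and must be installed; run the free-space query while the table is in its bloated state; names are the anonymised placeholders):

-- Log every autovacuum run so you can see how long each takes and what it reclaims.
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();

-- Approximate free space PostgreSQL already knows about in the table,
-- i.e. space that is reusable without a VACUUM FULL.
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;
SELECT sum(avail) AS total_avail_bytes
FROM   pg_freespace('schema.table');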