Did Adding JSONB Indexes Bloat the PostgreSQL Database?

Tags: index, json, postgresql

We are running:

user@host:~$ psql -d database -c "SELECT version();"                                                                   
version
---------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.7 (Ubuntu 10.7-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
(1 row)

on:

user@host:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:    16.04
Codename:   xenial

and have the following setup:

database=# \d+ schema.table
                                                                  Table "schema.table"
           Column            |            Type             | Collation | Nullable |                     Default                     | Storage  | Stats target | Description
-----------------------------+-----------------------------+-----------+----------+-------------------------------------------------+----------+--------------+-------------
 column_1                    | bigint                      |           | not null | nextval('table_id_seq'::regclass)               | plain    |              |
 column_2                    | character varying           |           | not null |                                                 | extended |              |
 column_3                    | character varying           |           | not null |                                                 | extended |              |
 column_4                    | character varying           |           | not null |                                                 | extended |              |
 column_5                    | timestamp without time zone |           | not null |                                                 | plain    |              |
 column_6                    | timestamp without time zone |           |          |                                                 | plain    |              |
 column_7                    | character varying           |           | not null |                                                 | extended |              |
 column_8                    | jsonb                       |           | not null |                                                 | extended |              |
 column_9                    | jsonb                       |           |          |                                                 | extended |              |
 column_10                   | character varying           |           | not null |                                                 | extended |              |
 column_11                   | character varying           |           | not null |                                                 | extended |              |
 column_12                   | character varying           |           |          |                                                 | extended |              |
 column_13                   | character varying           |           |          |                                                 | extended |              |
 column_14                   | timestamp with time zone    |           | not null |                                                 | plain    |              |
 column_15                   | timestamp with time zone    |           | not null |                                                 | plain    |              |
Indexes:
    "table_pkey" PRIMARY KEY, btree ( column_1 )
    "table_idx_1" btree ( column_11)
    "table_idx_2" btree ( column_4, column_2, column_7, column_5, column_6 )
    "table_idx_3" btree ( column_7, column_11, column_15 )
    "table_idx_4" btree ( column_7, column_11, column_14 )
    "table_idx_5" btree ( column_7, column_11, column_5 )
    "table_idx_6" btree ( column_7, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER ), column_5 )
    "table_idx_7" btree ( column_15 )
    "table_idx_8" btree ( column_4, column_2, column_7, column_5, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER ) )
    "table_idx_9" btree ( column_4, column_2, column_7, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER) )
    "table_idx_a" btree ( column_7, column_4, column_2, ( ( column_8 ->> 'string_1'::TEXT )::INTEGER), ( ( column_8 ->> 'string_2'::TEXT )::INTEGER ) ) WHERE column_7::TEXT = 'string_3'::TEXT
Check constraints:
    "table_check_constraints" CHECK ( lower( column_10::TEXT ) <> 'string_4'::TEXT OR column_9 IS NOT NULL AND column_6 IS NOT NULL )

Autovacuum is on and configured with:

user@host:~$ psql -d database -c "SELECT name, setting, pending_restart FROM pg_settings WHERE NAME ILIKE '%autovacuum%' ORDER BY name;"
                name                 |  setting              | pending_restart
-------------------------------------+-----------------------+-----------------
 autovacuum                          | on                    | f
 autovacuum_analyze_scale_factor     | 0.002                 | f
 autovacuum_analyze_threshold        | 10                    | f
 autovacuum_freeze_max_age           | 200000000             | f
 autovacuum_max_workers              | 5                     | f
 autovacuum_multixact_freeze_max_age | 400000000             | f
 autovacuum_naptime                  | 30                    | f
 autovacuum_vacuum_cost_delay        | 10                    | f
 autovacuum_vacuum_cost_limit        | 1000                  | f
 autovacuum_vacuum_scale_factor      | 0.001                 | f
 autovacuum_vacuum_threshold         | 25                    | f
 autovacuum_work_mem                 | -1                    | f
 log_autovacuum_min_duration         | 0 (env 1) /-1 (env 2) | f
(13 rows)

The following sequence of events took place in environment 1, during which autovacuum was on and configured as above:

  1. Nightly VACUUM (VERBOSE, ANALYZE) of the database is added.
  2. Some time passes during which bloat stays at its normal operational level.
  3. The nightly VACUUM (VERBOSE, ANALYZE) of the database is removed.
  4. Index table_idx_8, which includes an expression on a JSONB column, is added.
  5. Index table_idx_9, which includes an expression on a JSONB column, is added.
  6. A bloat growth spurt starts and continues for 2 days until it peaks.
  7. VACUUM (VERBOSE, FULL) of the table is run.
  8. Bloat returns to normal operational levels and stays there.

The database size (GB) looked like this in environment 1 during this sequence of events:

[image: environment 1 DB size (GB)]

And this is what bloat (GB) looked like in environment 1:

[image: environment 1 DB bloat (GB)]

The number of live rows in environment 1:

[image: environment 1 number of live rows]

The number of dead rows in environment 1:

[image: environment 1 number of dead rows]

The following sequence of events took place in environment 2, during all of which autovacuum was on and configured as above:

  1. Nightly VACUUM (VERBOSE, ANALYZE) of the database is added.
  2. Some time passes during which bloat stays at its normal operational level.
  3. The nightly VACUUM (VERBOSE, ANALYZE) of the database is removed.
  4. Index table_idx_8, which includes an expression on a JSONB column, is added.
  5. Index table_idx_9, which includes an expression on a JSONB column, is added.
  6. A bloat growth spurt starts and continues for 2 days until it peaks and brings the DB down (disk full).
  7. TRUNCATE TABLE schema.table is run.
  8. The schema.table table is repopulated.
  9. Bloat does not stabilise and grows until it peaks again.
  10. TRUNCATE TABLE schema.table is run before the disk fills up again.
  11. VACUUM (VERBOSE, FULL) of the database is run.
  12. The schema.table table is repopulated.
  13. Bloat continues to grow!

The database size (GB) in environment 2 looked like this during this sequence of events:

[image: environment 2 DB size (GB)]

And this is what bloat (GB) looked like in environment 2:

[image: environment 2 DB bloat (GB)]

The only difference between the two environments is that they are specced slightly differently (environment 2 being less powerful). During these sequences of events, write/read volumes were unchanged in each environment. We are using this query to measure the bloat in bytes.
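
For a more direct (though heavier) measurement on a single table, something like the pgstattuple contrib extension can also be used; this is only an illustrative sketch, not the linked estimation query:

-- Requires the pgstattuple contrib extension; table and index names are the anonymised placeholders.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- dead_tuple_len and free_space report dead and reusable space in the heap, in bytes.
SELECT * FROM pgstattuple('schema.table');

-- pgstatindex reports leaf density and fragmentation for a btree index.
SELECT * FROM pgstatindex('schema.table_idx_8');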

I have cross-checked the PostgreSQL logs, monitoring logs and commit logs (Git) and identified the addition of the two indexes as the trigger for the bloat, but:

  1. Is that right? Can adding an index trigger such a bloat growth spurt?
  2. Why did adding the indexes trigger the bloat, if it did?
  3. Why did environment 1 stabilise and environment 2 not?
  4. How can we stabilise environment 2?

Any help with answering these questions would be greatly appreciated, and needless to say, I'm happy to provide any further information I may have missed that could be helpful.

Best Answer

I don't think the archeological approach will be very useful here. There is just too much missing information and there are too many confounding variables. For example, people usually don't just add indexes for no reason. If a change in workload motivated the index creation, it could be that change in workload, independent of the indexes, that is causing the bloat.

There are a lot of theories that could explain what you see, but really no way to distinguish between them based on the history given. Each index gives vacuum more work to do, so your new indexes might have pushed it over the tipping point simply because it was already close, regardless of what the indexes contain. Or maybe a lot of work piled up while the table was locked for index creation, and the frenzy of activity once the lock was released pushed it over the edge.

It isn't just more indexes that create more work for vacuum; bloat does as well. This can lead to a vicious cycle where more bloat slows down vacuum, leading to yet more bloat. That could be why environment 1 stabilized after the VACUUM FULL: it broke the vicious cycle to the point where regular vacuums can now keep up. If environment 2 is less powerful, its tipping point might be lower, and the VACUUM FULL there might still leave it beyond that point.
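
One way to check whether regular vacuums are keeping up on that table is to watch its cumulative statistics over time; a minimal sketch, using the anonymised schema and table names from your output:

-- Dead-tuple count and vacuum history for the problem table.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum,
       vacuum_count, autovacuum_count
FROM   pg_stat_user_tables
WHERE  schemaname = 'schema' AND relname = 'table';

If n_dead_tup keeps climbing between autovacuum runs, autovacuum is not keeping up.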

 autovacuum_analyze_scale_factor     | 0.002                 | f
 autovacuum_analyze_threshold        | 10                    | f
 autovacuum_vacuum_scale_factor      | 0.001                 | f
 autovacuum_vacuum_threshold         | 25                    | f

These settings seem rather ridiculous at first glance. Is there a rationale for them? Autovacuum could be spending so much time vacuuming and analyzing tables that don't really need it that it can't keep up with the table that does need vacuuming (although if you have only one large table, that might not be much of a concern). Lowering the scale factors can make sense, but generally only in conjunction with an increase in the thresholds.
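
If the aggressive settings are really only needed for one large, busy table, a gentler option is to keep moderate global values and override them per table; the values below are purely illustrative, not a recommendation:

-- Per-table autovacuum storage parameters (illustrative values only).
ALTER TABLE schema.table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_vacuum_threshold     = 1000,
    autovacuum_analyze_scale_factor = 0.02,
    autovacuum_analyze_threshold    = 1000
);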

I routinely set vacuum_cost_page_hit and vacuum_cost_page_miss to zero. In my experience, the concurrent performance problems caused by autovacuum generally come from the writing it does, not the reading, so there is no point in throttling on the read side. This can be especially important when you have tables and indexes which are already bloated, since you can then have much more reading than writing going on.
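
A minimal sketch of that change (assumes superuser access; it only needs a configuration reload, not a restart, and the same values could instead go in postgresql.conf):

-- Stop charging vacuum cost for page reads, so (auto)vacuum is throttled only on writes.
ALTER SYSTEM SET vacuum_cost_page_hit = 0;
ALTER SYSTEM SET vacuum_cost_page_miss = 0;
SELECT pg_reload_conf();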

The output from setting log_autovacuum_min_duration = 0 could help distinguish between the various theories. Also, using pg_freespacemap to see how much free space (sum(avail)) PostgreSQL thinks the table has while it is in its bloated state can be informative.
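
A sketch of both checks (pg_freespacemap is a contrib extension and must be installed; run the free-space query while the table is in its bloated state; names are the anonymised placeholders):

-- Log every autovacuum run so you can see how long each takes and what it reclaims.
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();

-- Approximate free space PostgreSQL already knows about in the table,
-- i.e. space that is reusable without a VACUUM FULL.
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;
SELECT sum(avail) AS total_avail_bytes
FROM   pg_freespace('schema.table');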