PostgreSQL auto-vacuum: “skipped frozen” pages causing massive bloating

dumppostgresql

I am having a problem with a table I am using for a forum where people can add their products.

Everytime a user loads a page, the "online timestamp" of the user gets updated (a PHP line prevents it from updating more than 1 time per hour), which triggers the "online timestamp" of his respective products to be updated as well (trigger). Users can add and edit products anytime. Here's the table definition:

CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  userid INT NOT NULL REFERENCES users(id),
  categoryid INT NOT NULL REFERENCES categories (id),
  regionid INT NOT NULL REFERENCES regions(id),
  title VARCHAR(255) NOT NULL,
  description TEXT NOT NULL,
  price DECIMAL(7,2) NOT NULL,
  create_timestamp BIGINT NOT NULL,
  modify_timestamp BIGINT NOT NULL,
  online_timestamp BIGINT NOT NULL
);

CREATE INDEX ON products (categoryid);
CREATE INDEX ON products (userid);
CREATE INDEX ON products (regionid);

There are around 100,000 users, with around 400,000 products. The table gets updated very often (around 2 times per second) due to the "online timestamp" needing to be updated.

Here's the problem: Every week, the database gets so bloated that I have to interrupt the website, dump the database to a .sql file, delete the database, and re-import the dump. The fresh database has a size of around 2 GB, and the bloated database can reach 80 – 100 GB before I have to dump + re-import. The WAL directory (pg_xlog) never goes over 1.1 GB so I don't think WAL has any issues.

Here's a typical query:

> EXPLAIN ANALYZE SELECT COUNT(*) FROM products WHERE categoryid = 4;
                                                                    QUERY PLAN                                                                
    ------------------------------------------------------------------------------------------------------------------------------------------
     Bitmap Heap Scan on products  (cost=6405.66..228100.41 rows=63642 width=1193) (actual time=6.100..26.484 rows=8913 loops=1)
       Recheck Cond: (categoryid = 132)
       Heap Blocks: exact=6802
       ->  Bitmap Index Scan on products_categoryid_idx  (cost=0.00..6389.74 rows=63642 width=0) (actual time=3.742..3.742 rows=8917 loops=1)
             Index Cond: (categoryid = 132)
             Heap Fetches: 900
     Planning time: 2.832 ms
     Execution time: 27.267 ms
    (7 rows)

In the above example, the "heap fetches" number increases as time pass by – at re-import time, the "heap fetches" is around 250,000 – 300,000. Needless to say, queries get slower and slower, and CPU usage increases.

Here's the weird thing in the logs with the vacuum happens:

2017-02-26 17:54:15.781 UTC > LOG:  automatic vacuum of table "my_db.public.products": index scans: 1
    pages: 0 removed, 2419432 remain, 1 skipped due to pins, 2173455 skipped frozen
    tuples: 170090 removed, 2553063 remain, 334 are dead but not yet removable
    buffer usage: 2871815 hits, 842794 misses, 131788 dirtied
    avg read rate: 42.389 MB/s, avg write rate: 6.628 MB/s
    system usage: CPU 6.05s/9.79u sec elapsed 155.33 sec

We can see the "skipped frozen" in the logs above, which indicated that it's not really getting vacuumed. Despite my research, I couldn't find anything about "frozen pages".

I'm unable to come with a solution fpr this. Here's my config (obviously, I'm using the latest version):

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The "=" is optional.)  Whitespace may be used.  Comments are introduced with
# "#" anywhere on a line.  The complete list of parameter names and allowed
# values can be found in the PostgreSQL documentation.
#
# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pg_ctl reload".  Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#
# Any parameter can also be given as a command-line option to the server, e.g.,
# "postgres -c log_connections=on".  Some parameters can be changed at run time
# with the "SET" SQL command.
#
# Memory units:  kB = kilobytes        Time units:  ms  = milliseconds
#                MB = megabytes                     s   = seconds
#                GB = gigabytes                     min = minutes
#                TB = terabytes                     h   = hours
#                                                   d   = days


#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'       # use data in another directory
                    # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
                    # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
                    # (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = ''         # write an extra PID file
                    # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost'     # what IP address(es) to listen on;
                    # comma-separated list of addresses;
                    # defaults to 'localhost'; use '*' for all
                    # (change requires restart)
#port = 5432                # (change requires restart)
max_connections = 500           # (change requires restart)
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directories = '/var/run/postgresql, /tmp'  # comma-separated list of directories
                    # (change requires restart)
#unix_socket_group = ''         # (change requires restart)
#unix_socket_permissions = 0777     # begin with 0 to use octal notation
                    # (change requires restart)
#bonjour = off              # advertise server via Bonjour
                    # (change requires restart)
#bonjour_name = ''          # defaults to the computer name
                    # (change requires restart)

# - Security and Authentication -

#authentication_timeout = 1min      # 1s-600s
#ssl = off              # (change requires restart)
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
                    # (change requires restart)
#ssl_prefer_server_ciphers = on     # (change requires restart)
#ssl_ecdh_curve = 'prime256v1'      # (change requires restart)
#ssl_cert_file = 'server.crt'       # (change requires restart)
#ssl_key_file = 'server.key'        # (change requires restart)
#ssl_ca_file = ''           # (change requires restart)
#ssl_crl_file = ''          # (change requires restart)
#password_encryption = on
#db_user_namespace = off
#row_security = on

# GSSAPI using Kerberos
#krb_server_keyfile = ''
#krb_caseins_users = off

# - TCP Keepalives -
# see "man 7 tcp" for details

#tcp_keepalives_idle = 0        # TCP_KEEPIDLE, in seconds;
                    # 0 selects the system default
#tcp_keepalives_interval = 0        # TCP_KEEPINTVL, in seconds;
                    # 0 selects the system default
#tcp_keepalives_count = 0       # TCP_KEEPCNT;
                    # 0 selects the system default


#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 64GB           # min 128kB
                    # (change requires restart)
huge_pages = try            # on, off, or try
                    # (change requires restart)
#temp_buffers = 8MB         # min 800kB
max_prepared_transactions = 0       # zero disables the feature
                    # (change requires restart)
# Caution: it is not advisable to set max_prepared_transactions nonzero unless
# you actively intend to use prepared transactions.
work_mem = 8MB              # min 64kB
maintenance_work_mem = 512MB        # min 1MB
#replacement_sort_tuples = 150000   # limits use of replacement selection sort
autovacuum_work_mem = -1        # min 1MB, or -1 to use maintenance_work_mem
max_stack_depth = 8MB           # min 100kB
dynamic_shared_memory_type = posix  # the default is the first option
                    # supported by the operating system:
                    #   posix
                    #   sysv
                    #   windows
                    #   mmap
                    # use none to disable dynamic shared memory

# - Disk -

temp_file_limit = -1            # limits per-process temp file space
                    # in kB, or -1 for no limit

# - Kernel Resource Usage -

#max_files_per_process = 1000       # min 25
                    # (change requires restart)
#shared_preload_libraries = ''      # (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0          # 0-100 milliseconds
#vacuum_cost_page_hit = 1       # 0-10000 credits
#vacuum_cost_page_miss = 10     # 0-10000 credits
#vacuum_cost_page_dirty = 20        # 0-10000 credits
#vacuum_cost_limit = 200        # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms         # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100        # 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0      # 0-10.0 multiplier on buffers scanned/round
#bgwriter_flush_after = 512kB       # measured in pages, 0 disables

# - Asynchronous Behavior -

#effective_io_concurrency = 1       # 1-1000; 0 disables prefetching
#max_worker_processes = 8       # (change requires restart)
#max_parallel_workers_per_gather = 0    # taken from max_worker_processes
#old_snapshot_threshold = -1        # 1min-60d; -1 disables; 0 is immediate
                    # (change requires restart)
#backend_flush_after = 0        # measured in pages, 0 disables


#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

#wal_level = minimal            # minimal, replica, or logical
                    # (change requires restart)
#fsync = on             # flush data to disk for crash safety
                        # (turning this off can cause
                        # unrecoverable data corruption)
#synchronous_commit = on        # synchronization level;
                    # off, local, remote_write, remote_apply, or on
#wal_sync_method = fsync        # the default is the first option
                    # supported by the operating system:
                    #   open_datasync
                    #   fdatasync (default on Linux)
                    #   fsync
                    #   fsync_writethrough
                    #   open_sync
#full_page_writes = on          # recover from partial page writes
#wal_compression = off          # enable compression of full-page writes
#wal_log_hints = off            # also do full page writes of non-critical updates
                    # (change requires restart)
#wal_buffers = -1           # min 32kB, -1 sets based on shared_buffers
                    # (change requires restart)
#wal_writer_delay = 200ms       # 1-10000 milliseconds
#wal_writer_flush_after = 1MB       # measured in pages, 0 disables

#commit_delay = 0           # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000

# - Checkpoints -

#checkpoint_timeout = 5min      # range 30s-1d
max_wal_size = 1GB
min_wal_size = 80MB
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
#checkpoint_flush_after = 256kB     # measured in pages, 0 disables
#checkpoint_warning = 30s       # 0 disables

# - Archiving -

#archive_mode = off     # enables archiving; off, on, or always
                # (change requires restart)
#archive_command = ''       # command to use to archive a logfile segment
                # placeholders: %p = path of file to archive
                #               %f = file name only
                # e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0        # force a logfile segment switch after this
                # number of seconds; 0 disables


#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------

# - Sending Server(s) -

# Set these on the master and on any standby that will send replication data.

#max_wal_senders = 0        # max number of walsender processes
                # (change requires restart)
#wal_keep_segments = 0      # in logfile segments, 16MB each; 0 disables
#wal_sender_timeout = 60s   # in milliseconds; 0 disables

#max_replication_slots = 0  # max number of replication slots
                # (change requires restart)
#track_commit_timestamp = off   # collect timestamp of transaction commit
                # (change requires restart)

# - Master Server -

# These settings are ignored on a standby server.

#synchronous_standby_names = '' # standby servers that provide sync rep
                # number of sync standbys and comma-separated list of application_name
                # from standby(s); '*' = all
#vacuum_defer_cleanup_age = 0   # number of xacts by which cleanup is delayed

# - Standby Servers -

# These settings are ignored on a master server.

#hot_standby = off          # "on" allows queries during recovery
                    # (change requires restart)
#max_standby_archive_delay = 30s    # max delay before canceling queries
                    # when reading WAL from archive;
                    # -1 allows indefinite delay
#max_standby_streaming_delay = 30s  # max delay before canceling queries
                    # when reading streaming WAL;
                    # -1 allows indefinite delay
#wal_receiver_status_interval = 10s # send replies at least this often
                    # 0 disables
#hot_standby_feedback = off     # send info from standby to prevent
                    # query conflicts
#wal_receiver_timeout = 60s     # time that receiver waits for
                    # communication from master
                    # in milliseconds; 0 disables
#wal_retrieve_retry_interval = 5s   # time to wait before retrying to
                    # retrieve WAL after a failed attempt


#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_indexonlyscan = on
#enable_material = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0            # measured on an arbitrary scale
#random_page_cost = 4.0         # same scale as above
#cpu_tuple_cost = 0.01          # same scale as above
#cpu_index_tuple_cost = 0.005       # same scale as above
#cpu_operator_cost = 0.0025     # same scale as above
#parallel_tuple_cost = 0.1      # same scale as above
#parallel_setup_cost = 1000.0   # same scale as above
#min_parallel_relation_size = 8MB
effective_cache_size = 32GB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5            # range 1-10
#geqo_pool_size = 0         # selects default based on effort
#geqo_generations = 0           # selects default based on effort
#geqo_selection_bias = 2.0      # range 1.5-2.0
#geqo_seed = 0.0            # range 0.0-1.0

# - Other Planner Options -

#default_statistics_target = 100    # range 1-10000
#constraint_exclusion = partition   # on, off, or partition
#cursor_tuple_fraction = 0.1        # range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8        # 1 disables collapsing of explicit
                    # JOIN clauses
#force_parallel_mode = off


#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - Where to Log -

log_destination = 'stderr'      # Valid values are combinations of
                    # stderr, csvlog, syslog, and eventlog,
                    # depending on platform.  csvlog
                    # requires logging_collector to be on.

# This is used when logging to stderr:
logging_collector = on          # Enable capturing of stderr and csvlog
                    # into log files. Required to be on for
                    # csvlogs.
                    # (change requires restart)

# These are only used if logging_collector is on:
log_directory = 'pg_log'        # directory where log files are written,
                    # can be absolute or relative to PGDATA
log_filename = 'postgresql-%a.log'  # log file name pattern,
                    # can include strftime() escapes
#log_file_mode = 0600           # creation mode for log files,
                    # begin with 0 to use octal notation
log_truncate_on_rotation = on       # If on, an existing log file with the
                    # same name as the new log file will be
                    # truncated rather than appended to.
                    # But such truncation only occurs on
                    # time-driven rotation, not on restarts
                    # or size-driven rotation.  Default is
                    # off, meaning append to existing files
                    # in all cases.
log_rotation_age = 1d           # Automatic rotation of logfiles will
                    # happen after that time.  0 disables.
log_rotation_size = 0           # Automatic rotation of logfiles will
                    # happen after that much log output.
                    # 0 disables.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
#syslog_sequence_numbers = on
#syslog_split_messages = on

# This is only relevant when logging to eventlog (win32):
#event_source = 'PostgreSQL'

# - When to Log -

#client_min_messages = notice       # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   log
                    #   notice
                    #   warning
                    #   error

#log_min_messages = warning     # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic

#log_min_error_statement = error    # values in order of decreasing detail:
                    #   debug5
                    #   debug4
                    #   debug3
                    #   debug2
                    #   debug1
                    #   info
                    #   notice
                    #   warning
                    #   error
                    #   log
                    #   fatal
                    #   panic (effectively off)

#log_min_duration_statement = -1    # -1 is disabled, 0 logs all statements
                    # and their durations, > 0 logs only
                    # statements running at least this number
                    # of milliseconds


# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_error_verbosity = default      # terse, default, or verbose messages
#log_hostname = off
log_line_prefix = '< %m > '         # special values:
                    #   %a = application name
                    #   %u = user name
                    #   %d = database name
                    #   %r = remote host and port
                    #   %h = remote host
                    #   %p = process ID
                    #   %t = timestamp without milliseconds
                    #   %m = timestamp with milliseconds
                    #   %n = timestamp with milliseconds (as a Unix epoch)
                    #   %i = command tag
                    #   %e = SQL state
                    #   %c = session ID
                    #   %l = session line number
                    #   %s = session start timestamp
                    #   %v = virtual transaction ID
                    #   %x = transaction ID (0 if none)
                    #   %q = stop here in non-session
                    #        processes
                    #   %% = '%'
                    # e.g. '<%u%%%d> '
#log_lock_waits = off           # log lock waits >= deadlock_timeout
#log_statement = 'none'         # none, ddl, mod, all
#log_replication_commands = off
#log_temp_files = -1            # log temporary files equal or larger
                    # than the specified size in kilobytes;
                    # -1 disables, 0 logs all temp files
log_timezone = 'UTC'


# - Process Title -

#cluster_name = ''          # added to process titles if nonempty
                    # (change requires restart)
#update_process_title = on


#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------

# - Query/Index Statistics Collector -

#track_activities = on
#track_counts = on
#track_io_timing = off
#track_functions = none         # none, pl, all
#track_activity_query_size = 1024   # (change requires restart)
#stats_temp_directory = 'pg_stat_tmp'


# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off


#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------

autovacuum = on             # Enable autovacuum subprocess?  'on'
                    # requires track_counts to also be on.
log_autovacuum_min_duration = 0     # -1 disables, 0 logs all actions and
                    # their durations, > 0 logs only
                    # actions running at least this number
                    # of milliseconds.
autovacuum_max_workers = 6      # max number of autovacuum subprocesses
                    # (change requires restart)
autovacuum_naptime = 60s        # time between autovacuum runs
autovacuum_vacuum_threshold = 25    # min number of row updates before
                    # vacuum
autovacuum_analyze_threshold = 10   # min number of row updates before
                    # analyze
autovacuum_vacuum_scale_factor = 0.1    # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.05  # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum
                    # (change requires restart)
#autovacuum_multixact_freeze_max_age = 400000000    # maximum multixact age
                    # before forced vacuum
                    # (change requires restart)
autovacuum_vacuum_cost_delay = 10ms # default vacuum cost delay for
                    # autovacuum, in milliseconds;
                    # -1 means use vacuum_cost_delay
autovacuum_vacuum_cost_limit = 1000 # default vacuum cost limit for
                    # autovacuum, -1 means use
                    # vacuum_cost_limit


#------------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '"$user", public'    # schema names
#default_tablespace = ''        # a tablespace name, '' uses the default
#temp_tablespaces = ''          # a list of tablespace names, '' uses
                    # only default tablespace
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#default_transaction_deferrable = off
#session_replication_role = 'origin'
#statement_timeout = 0          # in milliseconds, 0 is disabled
#lock_timeout = 0           # in milliseconds, 0 is disabled
#idle_in_transaction_session_timeout = 0        # in milliseconds, 0 is disabled
#vacuum_freeze_min_age = 50000000
#vacuum_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_freeze_table_age = 150000000
#bytea_output = 'hex'           # hex, escape
#xmlbinary = 'base64'
#xmloption = 'content'
#gin_fuzzy_search_limit = 0
#gin_pending_list_limit = 4MB

# - Locale and Formatting -

datestyle = 'iso, mdy'
#intervalstyle = 'postgres'
timezone = 'UTC'
#timezone_abbreviations = 'Default'     # Select the set of available time zone
                    # abbreviations.  Currently, there are
                    #   Default
                    #   Australia (historical usage)
                    #   India
                    # You can create your own file in
                    # share/timezonesets/.
#extra_float_digits = 0         # min -15, max 3
#client_encoding = sql_ascii        # actually, defaults to database
                    # encoding

# These settings are initialized by initdb, but they can be changed.
lc_messages = 'en_US.UTF-8'         # locale for system error message
                    # strings
lc_monetary = 'en_US.UTF-8'         # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'          # locale for number formatting
lc_time = 'en_US.UTF-8'             # locale for time formatting

# default configuration for text search
default_text_search_config = 'pg_catalog.english'

# - Other Defaults -

#dynamic_library_path = '$libdir'
#local_preload_libraries = ''
#session_preload_libraries = ''


#------------------------------------------------------------------------------
# LOCK MANAGEMENT
#------------------------------------------------------------------------------

#deadlock_timeout = 1s
#max_locks_per_transaction = 64     # min 10
                    # (change requires restart)
#max_pred_locks_per_transaction = 64    # min 10
                    # (change requires restart)


#------------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#------------------------------------------------------------------------------

# - Previous PostgreSQL Versions -

#array_nulls = on
#backslash_quote = safe_encoding    # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = on
#lo_compat_privileges = off
#operator_precedence_warning = off
#quote_all_identifiers = off
#sql_inheritance = on
#standard_conforming_strings = on
#synchronize_seqscans = on

# - Other Platforms and Clients -

#transform_null_equals = off


#------------------------------------------------------------------------------
# ERROR HANDLING
#------------------------------------------------------------------------------

#exit_on_error = off            # terminate session on any error?
#restart_after_crash = on       # reinitialize after backend crash?


#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------

# These options allow settings to be loaded from files other than the
# default postgresql.conf.

#include_dir = 'conf.d'         # include files ending in '.conf' from
                    # directory 'conf.d'
#include_if_exists = 'exists.conf'  # include file only if it exists
#include = 'special.conf'       # include file


#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

# Add settings for extensions here

My server has 256 GB of RAM. I have also tried temporarly deactivating the timestamp update trigger, but I still get that "frozen" message. Anybody has a clue what is going on?

Best Answer

Those "frozen" messages refer to pages that are known from the visibility map's freeze bits to contain only frozen tuples. A frozen tuple is one that all current and future transactions can "see", i.e. it committed before the start of any still-running xact. These shouldn't be removed, and are not the problem; they're the data that doesn't change.

Vacuum is doing useful work, see:

tuples: 170090 removed ...

It isn't truncating the tables (removing pages), but it doesn't need to... and it shouldn't. Because your load pattern means you'll always need at least twice the real data space to keep track of dead rows from all your UPDATE churn.

However, the degree of bloat you encounter is surprising. It's clear that something isn't working right. Maybe you found an issue in the new freeze map code, but I'd be looking for other explanations first.

Do you have lots of long-running transactions? If your app fails to close transactions you'll see lots of messages about "non-removable" rows.

Can you supply an EXPLAIN (BUFFERS, ANALYZE, VERBOSE) from a problem query when the system is getting bloated? As well as a VACUUM (FULL, VERBOSE) on one or more problem tables? (This will lock the table while it's rewriting it).

If you enable autovacuum logging, does anything interesting show up?

Is it tables or indexes that get bloated? Just a couple of tables or all of them equally? Just the busy ones? Some indexes worse than others? Etc. Look into this.


Separately, your design is pretty bad TBH. You should really move that activity data to a pair of side-tables that joins on the user and product tables respectively. Update the activity row there, so you're not constantly rewriting your main tables full of data that doesn't change much. It'll help with disk I/O, cache hit rates, and more.