Index
First and foremost, for your type of query this is the much better index:
CREATE INDEX de_tt_priceinfo_station_id_received_idx
ON public.de_tt_priceinfo (station_id, received); -- note the reversed column order
Since the combination is supposed to be unique (I assume), I suggest a UNIQUE constraint on (station_id, received) instead:
ALTER TABLE de_tt_priceinfo ADD CONSTRAINT de_tt_priceinfo_station_id_received
UNIQUE (station_id, received);
The index index_station_id is mostly superseded and can probably be dropped now. The index de_tt_priceinfo_received_station_id_idx may still have its use.
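If you decide to drop it, that's a one-liner (a sketch, using the index name from your question):
DROP INDEX index_station_id; -- superseded by the UNIQUE constraint's index on (station_id, received)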
Be sure to understand the logic behind all this.
Query
I would also consider the basic DISTINCT ON query:
SELECT DISTINCT ON (station_id)
station_id, e5, e10, diesel, received
FROM de_tt_priceinfo
WHERE received <= '2014-09-25 08:45:12'::TIMESTAMPTZ
AND station_id = ANY ('{0C91A93A-a-b-c-d, 578C44BB-a-b-c-d, 6F2F48A8-a-b-c-d
, 9982BE74-a-b-c-d, A24C612B-a-b-c-d, BEC3EF55-a-b-c-d
, F5137488-a-b-c-d}'::varchar[])
ORDER BY station_id, received DESC;
But since you seem to have a lot of rows per station, that's not going to shine. Instead:
SELECT *
FROM (
VALUES
('0C91A93A-a-b-c-d'::varchar)
, ('578C44BB-a-b-c-d')
, ('6F2F48A8-a-b-c-d')
, ('9982BE74-a-b-c-d')
, ('A24C612B-a-b-c-d')
, ('BEC3EF55-a-b-c-d')
, ('F5137488-a-b-c-d')
) s(station_id)
LEFT JOIN LATERAL (
SELECT e5, e10, diesel, received
FROM de_tt_priceinfo
WHERE station_id = s.station_id
AND received <= '2014-09-25 08:45:12'::TIMESTAMPTZ
ORDER BY received DESC
LIMIT 1 -- latest row per station at or before the given timestamp
) p ON TRUE
This one should be dynamite in combination with the above UNIQUE constraint (or an equivalent index).
Table definition
For a table with millions of rows it pays to optimize storage while that's still easily possible. It makes everything smaller and faster.
This is how I would design it:
CREATE TABLE station (
station_id serial PRIMARY KEY
, station text
, CHECK (length(station) < 61) -- ?? optional, you decide
);
CREATE TABLE priceinfo (
priceinfo_id serial PRIMARY KEY
, station_id integer NOT NULL REFERENCES station ON UPDATE CASCADE
, received timestamptz NOT NULL DEFAULT now()
, e5 integer -- price in 0.1 Cent
, e10 integer -- price in 0.1 Cent
, diesel integer -- price in 0.1 Cent
, CONSTRAINT priceinfo_station_id_received UNIQUE (station_id, received)
);
CREATE INDEX priceinfo_received_idx ON public.priceinfo (received);
The row size in priceinfo would be 60 bytes (24 bytes heap tuple header incl. null bitmap; 32 bytes data; 4 bytes item identifier), as compared to 94 bytes (24 + 66 + 4) in your original table. That's assuming a 16-character string like in your example. Everything will be ~ 36 % smaller (or more?) and considerably faster.
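If you want to sanity-check column sizes yourself, pg_column_size() is handy (my own quick check, not part of the original design):
SELECT pg_column_size(station_id) AS station_id_bytes -- 4 for integer
     , pg_column_size(received)   AS received_bytes   -- 8 for timestamptz
FROM priceinfo
LIMIT 1;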
The crucial index on (station_id, received) is down to 16 bytes of data per index tuple (integer 4 + alignment padding 4 + timestamptz 8) instead of 32 bytes or even much more (!), each plus index tuple overhead. In addition, handling integer numbers for station_id is generally faster than text with a COLLATION on top of it.
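To put numbers on it, you can compare index sizes before and after (my own check; the UNIQUE constraint's backing index carries the constraint's name):
SELECT pg_size_pretty(pg_relation_size('priceinfo_station_id_received'));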
The query would fetch station_id from the station table first, which is cheap.
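For instance, with a hypothetical station name:
SELECT station_id FROM station WHERE station = 'Some Station Name';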
Prices are stored as integer numbers signifying 0.1 Cent (4 bytes instead of 10 bytes for your original numeric(4,3)). Multiply by 0.1 to get Cent or by 0.001 to get € for display. Very simple and fast.
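For illustration, a made-up pump price of 1.339 € would be stored as 1339:
SELECT 1339 * 0.001 AS price_eur  -- 1.339
     , 1339 * 0.1   AS price_cent; -- 133.9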
UUID
The string in the error message looks considerably longer and actually like a regular UUID number:
871828b4-37e5-419c-b7a5-cdbe1e1c0148
If so, use the uuid data type, whether you adopt my design or keep your old one. At least switch to the uuid data type for a big overall gain in every aspect.
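A minimal sketch of that switch, assuming station_id is currently varchar and every value is a valid UUID string:
ALTER TABLE de_tt_priceinfo
  ALTER COLUMN station_id TYPE uuid USING station_id::uuid; -- rewrites the table and its indexes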
Best Answer
I guess you only want to test whether the restoration of the dump works and nothing more, which means you can make some unsafe configuration changes. Let's start with your settings.
These ones are a good call. These three are only important if you are using replication, and because you already set wal_level to minimal, you are not using it, so they don't matter. You have a lot of RAM which won't be used for anything else, so I would bump this one up even more.
I would leave wal_buffers at its default and bump shared_buffers instead; wal_buffers will then be calculated automatically.
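For example (illustrative values of my own; size shared_buffers to your RAM):
ALTER SYSTEM SET shared_buffers = '16GB'; -- takes effect after a server restart
ALTER SYSTEM SET wal_buffers = -1;        -- the default: 1/32 of shared_buffers, capped at 16MB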
What you should concentrate on is having no checkpoints during your restore, or as few as you can. Checkpoints are controlled by max_wal_size and checkpoint_timeout. First off, bump checkpoint_timeout to something like 20h, so that a timed checkpoint never happens while you restore.
Then you can set max_wal_size to a value as high as your disk space allows. If your restored DB is 200GB and your disk is 500GB, you should be safe setting max_wal_size to 100GB, because Postgres can keep up to two checkpoints' worth of WAL (which is 2x max_wal_size). min_wal_size won't matter that much in your case, but you can probably bump it to 10GB.
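Putting the checkpoint advice together, a sketch with the values from above (ALTER SYSTEM is my choice of mechanism; editing postgresql.conf works just as well):
ALTER SYSTEM SET checkpoint_timeout = '20h'; -- timed checkpoints effectively never fire during the restore
ALTER SYSTEM SET max_wal_size = '100GB';     -- assumes the 500GB disk / 200GB database from above
ALTER SYSTEM SET min_wal_size = '10GB';
SELECT pg_reload_conf();                     -- these three only need a reload, not a restart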
I would also recommend running pg_restore with --jobs=NUM, where NUM would probably be the number of CPU cores, but it also depends on the speed of your SSD, so you can play with this parameter.
Besides Postgres settings, I would also recommend, if possible, adding an additional SATA drive (7200 RPM will be fine) and symlinking the pg_wal directory to that SATA disk. That is the directory where Postgres saves WAL, and because WAL is written append-only, a SATA disk is fast enough for it. It will reduce the load on the SSD, but it also means you will be able to bump max_wal_size even more (depending on the size of the SATA disk).
Finally, don't forget to restore your settings to sane values after the dump is restored.