PostgreSQL – Postgres performance tuning – what to try next

Tags: performance, postgresql, postgresql-performance

I have a simple (but large) postgres database.

I have a table defined like this:

CREATE TABLE TABLE_A (
  MessID      SERIAL UNIQUE,
  rawData     CHAR(250),
  loadedTime  TIMESTAMP
);

And a process that continuously loads millions of records (each one a variable length ASCII string) into the table. There’s an index on loadedTime.
That’s it: no triggers, nothing fancy. All day, records get loaded in; at night they get processed. The night-time processing only takes a few hours, and since I’ve got all night I’m not worried about performance then.

What is of critical importance is the rate at which I can insert records during the day.

I’ve managed to get the time down from about 10 ms/record to about 1.5 ms/record. I’d like to improve performance further (maybe 1 ms/record, or quicker?).

The following changes have made improvements:

  • Partitioned my data by date/time range.
  • Kept ‘current’ data on an SSD tablespace, and ‘archive’ data on a magnetic drive
  • Tweaked both the number of records added in a single INSERT statement, and the number of INSERTs in each transaction
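To illustrate the batching mentioned above, here is a rough sketch of the kind of multi-row INSERT I mean (written in Python with psycopg2-style `%s` placeholders purely for illustration – my actual loader is a PowerShell script, and the function name is made up):

```python
# Hypothetical sketch: build one parameterized multi-row INSERT so each
# round-trip to the server carries a batch of rows instead of one.
def build_batch_insert(rows):
    """rows: list of (rawdata, loadedtime) tuples -> (sql, flat_params)."""
    placeholders = ",".join(["(%s, %s)"] * len(rows))
    sql = f"INSERT INTO table_a (rawdata, loadedtime) VALUES {placeholders}"
    params = [value for row in rows for value in row]
    return sql, params

sql, params = build_batch_insert([("abc", "2020-01-01 00:00:00"),
                                  ("def", "2020-01-01 00:00:01")])
# sql    -> INSERT INTO table_a (rawdata, loadedtime) VALUES (%s, %s),(%s, %s)
# params -> ["abc", "2020-01-01 00:00:00", "def", "2020-01-01 00:00:01"]
```

The batch size, and how many such statements go into one transaction, are the two knobs I’ve been tuning empirically.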

I’ve also tried tweaking postgresql.conf using pgtune.leopard.in.ua. To be honest, none of the changes to this file have made any noticeable difference. The current settings are:

# DB Version: 10
# OS Type: windows
# DB Type: oltp
# Total Memory (RAM): 16 GB
# CPUs num: 1
# Connections num: 10
# Data Storage: ssd

max_connections = 10
shared_buffers = 512MB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
work_mem = 541764kB
min_wal_size = 2GB
max_wal_size = 4GB
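One setting pgtune doesn’t suggest, but which (as I understand it) targets exactly the per-COMMIT WAL flush that dominates small transactions on a single connection, is `synchronous_commit`:

```
# postgresql.conf – candidate setting for commit latency (not in the
# pgtune output above). With synchronous_commit = off, COMMIT returns
# without waiting for the WAL flush to disk; a crash can lose the most
# recent transactions but leaves the database itself consistent.
synchronous_commit = off
```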

My questions are:

  1. Why haven’t the hardware and config changes made any noticeable difference?
  2. Are there any other areas for performance tuning I should be looking at?
  3. If I were to upgrade the server the database is on, what factors should I look for in a new one – bearing in mind that increasing the memory doesn’t seem to make the slightest bit of difference?

Other information that may be relevant:

  • I’m only interested in the speed at which records get inserted during the day.
  • During daytime operations there is a single connection to the database (doing the INSERTs) and no other processing.
  • The existing server is a 16 GByte Windows pro single processor with Postgres 10 installed on an SSD and archive tablespaces on a magnetic drive.
  • TABLE_A is partitioned by day. Today’s data is held on a tablespace on the SSD; at night it’s archived to a tablespace on the magnetic disk.
  • Incoming records are added using a PowerShell script. The INSERT statement is wrapped in a stopwatch function, which lets me accurately record how long each INSERT takes and work out the time per record.
  • Incoming records are arriving one-by-one from an external server, over which I have no control, so using COPY wouldn't help.
  • Spooling the records to a temporary file prior to loading hasn't worked either since it seems to take longer to append to a file than it does to insert into a database.
  • It currently takes 1.5 ms to insert a record, so the system can cope with a maximum of between 600 and 700 records/sec. The problem is that the source can, at busy periods, generate 800–900 records per second.
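Checking the arithmetic on those figures:

```python
# Back-of-envelope check of the rates quoted above.
per_record_ms = 1.5                  # measured cost of one INSERT
max_rate = 1000 / per_record_ms      # sustainable records/sec (~667)
peak_demand = 900                    # records/sec at busy periods
required_ms = 1000 / peak_demand     # per-record budget to keep up (~1.11 ms)
```

So getting below roughly 1.1 ms/record would be enough to absorb the 900 records/sec peak.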

UPDATE in response to horse_with_no_name. Is it the number of processors, or the number of cores, that affects the number of simultaneous INSERTs that Postgres can cope with? My processor is (I quote) "Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, 2601 Mhz, 4 Core(s), 8 Logical Processor(s)"

Having 4 cores, does that mean I ought to be able to support 4 simultaneous processes doing INSERTs at the same time? Could I rewrite my PowerShell script to round-robin the incoming records across 2 (or more) connections and improve INSERT performance that way?
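If round-robin dispatch is worth trying, the logic I have in mind is something like this (sketched in Python rather than PowerShell; the worker threads, each of which would hold its own database connection and run its own batched INSERTs, are omitted):

```python
from itertools import cycle
from queue import Queue

def make_dispatcher(n_workers):
    """One queue per worker; each worker would own a DB connection."""
    queues = [Queue() for _ in range(n_workers)]
    rr = cycle(queues)
    def dispatch(record):
        next(rr).put(record)   # round-robin: worker i gets every n-th record
    return queues, dispatch

queues, dispatch = make_dispatcher(2)
for rec in ["r0", "r1", "r2", "r3"]:
    dispatch(rec)
# queue 0 now holds r0, r2; queue 1 holds r1, r3
```

Each incoming record goes to the next connection in turn, so two connections should each see roughly half the 800–900 records/sec peak.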

Best Answer