PostgreSQL – How to optimize a database for heavy I/O from updates (software and hardware)

hardware, performance, postgresql, update

The situation
I have a PostgreSQL 9.2 database which is quite heavily updated all the time. The system is hence I/O bound, and I'm currently considering making another upgrade; I just need some direction on where to start improving.

Here is a picture of how the situation looked the past 3 months:

[Graph: disk utilization by operation type over the past 3 months]

As you can see, update operations account for most of the disk utilization. Here is another picture of how the situation looks in a more detailed 3-hour window:

[Graph: read/write rates over a 3-hour window]

As you can see, the peak write rate is around 20 MB/s.

Software
The server is running Ubuntu 12.04 and PostgreSQL 9.2.
The updates are small, typically on individual rows identified by ID, e.g. UPDATE cars SET price=some_price, updated_at = some_time_stamp WHERE id = some_id.
I have removed and optimized indexes as much as I think is possible, and the server's configuration (both the Linux kernel and postgresql.conf) is pretty well optimized as well.
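
For reference, a quick way to see whether these single-row updates end up as HOT updates (i.e. they avoid touching the indexes at all) is the standard statistics view below. The fillfactor change is only a sketch: the table name comes from the example above and 85 is an illustrative value, not something I have settled on.

-- Which heavily updated tables are getting HOT updates (no index writes)?
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;

-- Leave some free space in each page so updated row versions can stay on the
-- same page and qualify as HOT. This only affects pages filled after the
-- change (or after a table rewrite); 85 is just an illustrative value.
ALTER TABLE cars SET (fillfactor = 85);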

Hardware
The hardware is a dedicated server with 32 GB ECC RAM, 4x 600 GB 15,000 rpm SAS disks in a RAID 10 array controlled by an LSI RAID controller with BBU, and an Intel Xeon E3-1245 quad-core processor.

Questions

  • Is the performance seen in the graphs reasonable for a system of this caliber (reads/writes)?
  • Should I therefore focus on a hardware upgrade, or dig deeper into the software (kernel tweaking, configuration, queries etc.)?
  • If doing a hardware upgrade, is the number of disks key to performance?

Update

I have now upgraded my database server with four Intel 520 SSDs in place of the old 15k SAS disks, using the same RAID controller. Things have improved quite a lot: as you can see from the graph below, peak I/O performance has improved around 6-10 times, and that's great!
[Graph: I/O performance after the SSD upgrade]
However, I was expecting something more like a 20-50 times improvement based on the answers and the I/O capabilities of the new SSDs. So here is another question.

New question
Is there something in my current configuration that is limiting the I/O performance of my system (where is the bottleneck)?
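
One check I'm planning to run (a sketch using the standard statistics view, nothing specific to my schema) is whether the flushing is being done by checkpoints and the background writer or by the backends themselves, since a lot of backend writes and fsyncs would point at the checkpoint settings rather than at the drives:

-- How much dirty-buffer flushing is done by checkpoints, the background
-- writer, and ordinary backends. High buffers_backend / buffers_backend_fsync
-- relative to the others suggests the checkpointer/bgwriter cannot keep up.
SELECT checkpoints_timed,
       checkpoints_req,
       buffers_checkpoint,
       buffers_clean,
       buffers_backend,
       buffers_backend_fsync
FROM pg_stat_bgwriter;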

My configurations:

/etc/postgresql/9.2/main/postgresql.conf

data_directory = '/var/lib/postgresql/9.2/main'
hba_file = '/etc/postgresql/9.2/main/pg_hba.conf'
ident_file = '/etc/postgresql/9.2/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/9.2-main.pid'
listen_addresses = '192.168.0.4, localhost'
port = 5432
unix_socket_directory = '/var/run/postgresql'
wal_level = hot_standby
synchronous_commit = on
checkpoint_timeout = 10min
archive_mode = on
archive_command = 'rsync -a %p postgres@192.168.0.2:/var/lib/postgresql/9.2/wals/%f </dev/null'
max_wal_senders = 1
wal_keep_segments = 32
hot_standby = on
log_line_prefix = '%t '
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
default_statistics_target = 100
maintenance_work_mem = 1920MB
checkpoint_completion_target = 0.7
effective_cache_size = 22GB
work_mem = 160MB
wal_buffers = 16MB
checkpoint_segments = 32
shared_buffers = 7680MB
max_connections = 400 

/etc/sysctl.conf

# sysctl config
#net.ipv4.ip_forward=1
net.ipv4.conf.all.rp_filter=1
net.ipv4.icmp_echo_ignore_broadcasts=1
# ipv6 settings (no autoconfiguration)
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_dad=0
net.ipv6.conf.default.accept_ra=0
net.ipv6.conf.default.accept_ra_defrtr=0
net.ipv6.conf.default.accept_ra_rtr_pref=0
net.ipv6.conf.default.accept_ra_pinfo=0
net.ipv6.conf.default.accept_source_route=0
net.ipv6.conf.default.accept_redirects=0
net.ipv6.conf.default.forwarding=0
net.ipv6.conf.all.autoconf=0
net.ipv6.conf.all.accept_dad=0
net.ipv6.conf.all.accept_ra=0
net.ipv6.conf.all.accept_ra_defrtr=0
net.ipv6.conf.all.accept_ra_rtr_pref=0
net.ipv6.conf.all.accept_ra_pinfo=0
net.ipv6.conf.all.accept_source_route=0
net.ipv6.conf.all.accept_redirects=0
net.ipv6.conf.all.forwarding=0
# Updated according to postgresql tuning
vm.dirty_ratio = 10
vm.dirty_background_ratio = 1
vm.swappiness = 0
vm.overcommit_memory = 2
kernel.sched_autogroup_enabled = 0
kernel.sched_migration_cost = 50000000

/etc/sysctl.d/30-postgresql-shm.conf

# Shared memory settings for PostgreSQL
# Note that if another program uses shared memory as well, you will have to
# coordinate the size settings between the two.
# Maximum size of shared memory segment in bytes
#kernel.shmmax = 33554432
# Maximum total size of shared memory in pages (normally 4096 bytes)
#kernel.shmall = 2097152
kernel.shmmax = 8589934592
kernel.shmall = 17179869184
# Updated according to postgresql tuning

Output of MegaCli64 -LDInfo -LAll -aAll

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 446.125 GB
Sector Size         : 512
Is VD emulated      : No
Mirror Data         : 446.125 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives per span:2
Span Depth          : 2
Default Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Is VD Cached: No

Best Answer

If doing a hardware upgrade, is the number of disks key to performance?

Yes, because a hard disc - even a SAS one - has a head that takes time to move.
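
Rough rule-of-thumb arithmetic (illustrative figures, not measurements): a 15,000 rpm disc averages about 2 ms of rotational latency (half a revolution, 60 s / 15000 / 2) plus roughly 3-4 ms of seek time, so each random I/O costs around 5-6 ms:

    1 / 0.0055 s ≈ ~180 random IOPS per 15k drive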

Want a HUGE upgrade?

Kill the SAS discs, go to SATA. Plug in SATA SSDs - enterprise level, like the Samsung 843T.

Result? You can do around 60,000 (that is 60 thousand) IOPS per drive.

This is the reason SSDs are killers in the DB space, and so much cheaper per IOPS than any SAS drive. A physical spinning disc just cannot keep up with the IOPS capabilities of an SSD.

Your SAS discs were a mediocre choice to start with for a heavily used database - they are too large to get a lot of IOPS for their capacity (more, smaller discs would mean a lot more IOPS) - but at the end of the day, SSDs are the game changer here.
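
Back-of-envelope comparison with the same rule-of-thumb figure of ~180 IOPS per 15k drive (illustrative only): in a 4-disc RAID 10, every write has to land on both halves of a mirror, so the array tops out at very roughly

    random reads:  4 x ~180      ≈ ~700 IOPS
    random writes: (4 x ~180)/2  ≈ ~360 IOPS

compared to tens of thousands of IOPS from a single enterprise SSD - which is why adding a few more spindles barely moves the needle, while switching media does.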

Regarding software / kernel: any decent database will do a lot of IOPS and flush its buffers. The log file needs to be WRITTEN for the basic ACID guarantees to hold. The only kernel tunes you could do would weaken your transactional integrity - partially you CAN get away with that. The RAID controller in write-back mode already does this - it confirms the write as flushed even if it is not - but it can do so because the BBU is assumed to save the day when power fails. Anything you do higher up in the kernel - better make sure you can live with the negative side effects.
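
A concrete example of that durability trade-off inside PostgreSQL itself (a sketch, not something prescribed above; the table and placeholders are the ones from the question): synchronous_commit can be switched off per transaction or per session, so a commit returns before its WAL record is flushed. A crash can then lose the most recently committed transactions (a window of roughly three times wal_writer_delay), but unlike turning off fsync it can never corrupt the database:

-- Relax commit durability only for the high-churn single-row updates.
BEGIN;
SET LOCAL synchronous_commit TO off;
UPDATE cars SET price = some_price, updated_at = some_time_stamp WHERE id = some_id;
COMMIT;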

In the end, databases need IOPS, and you may be surprised to see how tiny your setup is compared to some others here. I have seen databases with 100+ discs just to get the IOPS they needed. But really, today, you buy SSDs and go for size on them - they are so superior in IOPS capability that it makes no sense to fight this game with SAS drives.

And yes, your IOPS numbers do not look bad for the hardware; they are within what I would expect.