PostgreSQL – How to optimize a database for heavy I/O from updates (software and hardware)

hardware, performance, postgresql, update

The situation
I have a PostgreSQL 9.2 database which is quite heavily updated all the time. The system is hence I/O bound, and I'm currently considering making another upgrade; I just need some direction on where to start improving.

Here is a picture of how the situation looked the past 3 months:

[Graph: disk utilization by operation type over the past 3 months]

As you can see, update operations account for most of the disk utilization. Here is another picture of how the situation looks in a more detailed 3-hour window:

[Graph: read/write rates over a 3-hour window]

As you can see, the peak write rate is around 20 MB/s.

Software
The server is running Ubuntu 12.04 and PostgreSQL 9.2.
The updates are small, typically on individual rows identified by ID, e.g. UPDATE cars SET price=some_price, updated_at = some_time_stamp WHERE id = some_id.
I have removed and optimized indexes as much as I think is possible, and the server's configuration (both the Linux kernel and postgresql.conf) is pretty well optimized as well.
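
For reference, a quick way to see whether these single-row updates end up as HOT updates (i.e. they avoid touching the indexes at all) is the standard statistics view below. The fillfactor change is only a sketch: the table name comes from the example above and 85 is an illustrative value, not something I have settled on.

-- Which heavily updated tables are getting HOT updates (no index writes)?
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;

-- Leave some free space in each page so updated row versions can stay on the
-- same page and qualify as HOT. This only affects pages filled after the
-- change (or after a table rewrite); 85 is just an illustrative value.
ALTER TABLE cars SET (fillfactor = 85);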

Hardware
The hardware is a dedicated server with 32 GB ECC RAM, 4x 600 GB 15,000 rpm SAS disks in a RAID 10 array controlled by an LSI RAID controller with BBU, and an Intel Xeon E3-1245 quad-core processor.

Questions

  • Is the performance seen in the graphs reasonable for a system of this caliber (reads/writes)?
  • Should I therefore focus on a hardware upgrade, or dig deeper into the software (kernel tweaking, configuration, queries etc.)?
  • If doing a hardware upgrade, is the number of disks key to performance?

Update

I have now upgraded my database server with four Intel 520 SSDs in place of the old 15k SAS disks, using the same RAID controller. Things have improved quite a lot: as you can see from the graph below, peak I/O performance has improved around 6-10 times, and that's great!
[Graph: I/O performance after the SSD upgrade]
However, I was expecting something more like a 20-50 times improvement based on the answers and the I/O capabilities of the new SSDs. So here is another question.

New question
Is there something in my current configuration that is limiting the I/O performance of my system (where is the bottleneck)?
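
One check I'm planning to run (a sketch using the standard statistics view, nothing specific to my schema) is whether the flushing is being done by checkpoints and the background writer or by the backends themselves, since a lot of backend writes and fsyncs would point at the checkpoint settings rather than at the drives:

-- How much dirty-buffer flushing is done by checkpoints, the background
-- writer, and ordinary backends. High buffers_backend / buffers_backend_fsync
-- relative to the others suggests the checkpointer/bgwriter cannot keep up.
SELECT checkpoints_timed,
       checkpoints_req,
       buffers_checkpoint,
       buffers_clean,
       buffers_backend,
       buffers_backend_fsync
FROM pg_stat_bgwriter;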

My configurations:

/etc/postgresql/9.2/main/postgresql.conf

data_directory = '/var/lib/postgresql/9.2/main'
hba_file = '/etc/postgresql/9.2/main/pg_hba.conf'
ident_file = '/etc/postgresql/9.2/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/9.2-main.pid'
listen_addresses = '192.168.0.4, localhost'
port = 5432
unix_socket_directory = '/var/run/postgresql'
wal_level = hot_standby
synchronous_commit = on
checkpoint_timeout = 10min
archive_mode = on
archive_command = 'rsync -a %p postgres@192.168.0.2:/var/lib/postgresql/9.2/wals/%f </dev/null'
max_wal_senders = 1
wal_keep_segments = 32
hot_standby = on
log_line_prefix = '%t '
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
default_statistics_target = 100
maintenance_work_mem = 1920MB
checkpoint_completion_target = 0.7
effective_cache_size = 22GB
work_mem = 160MB
wal_buffers = 16MB
checkpoint_segments = 32
shared_buffers = 7680MB
max_connections = 400 

/etc/sysctl.conf

# sysctl config
#net.ipv4.ip_forward=1
net.ipv4.conf.all.rp_filter=1
net.ipv4.icmp_echo_ignore_broadcasts=1
# ipv6 settings (no autoconfiguration)
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_dad=0
net.ipv6.conf.default.accept_ra=0
net.ipv6.conf.default.accept_ra_defrtr=0
net.ipv6.conf.default.accept_ra_rtr_pref=0
net.ipv6.conf.default.accept_ra_pinfo=0
net.ipv6.conf.default.accept_source_route=0
net.ipv6.conf.default.accept_redirects=0
net.ipv6.conf.default.forwarding=0
net.ipv6.conf.all.autoconf=0
net.ipv6.conf.all.accept_dad=0
net.ipv6.conf.all.accept_ra=0
net.ipv6.conf.all.accept_ra_defrtr=0
net.ipv6.conf.all.accept_ra_rtr_pref=0
net.ipv6.conf.all.accept_ra_pinfo=0
net.ipv6.conf.all.accept_source_route=0
net.ipv6.conf.all.accept_redirects=0
net.ipv6.conf.all.forwarding=0
# Updated according to postgresql tuning
vm.dirty_ratio = 10
vm.dirty_background_ratio = 1
vm.swappiness = 0
vm.overcommit_memory = 2
kernel.sched_autogroup_enabled = 0
kernel.sched_migration_cost = 50000000

/etc/sysctl.d/30-postgresql-shm.conf

# Shared memory settings for PostgreSQL
# Note that if another program uses shared memory as well, you will have to
# coordinate the size settings between the two.
# Maximum size of shared memory segment in bytes
#kernel.shmmax = 33554432
# Maximum total size of shared memory in pages (normally 4096 bytes)
#kernel.shmall = 2097152
kernel.shmmax = 8589934592
kernel.shmall = 17179869184
# Updated according to postgresql tuning

Output of MegaCli64 -LDInfo -LAll -aAll

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 446.125 GB
Sector Size         : 512
Is VD emulated      : No
Mirror Data         : 446.125 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives per span:2
Span Depth          : 2
Default Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Is VD Cached: No

Best Answer

If doing a hardware upgrade, is the number of disks key to performance?

Yes, because a hard disc - even a SAS one - has a head that takes time to move.
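
Rough rule-of-thumb arithmetic (illustrative figures, not measurements): a 15,000 rpm disc averages about 2 ms of rotational latency (half a revolution, 60 s / 15000 / 2) plus roughly 3-4 ms of seek time, so each random I/O costs around 5-6 ms:

    1 / 0.0055 s ≈ ~180 random IOPS per 15k drive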

Want a HUGE upgrade?

Kill the SAS discs, go to SATA. Plug in SATA SSDs - enterprise level, like the Samsung 843T.

Result? You can do around 60,000 (that is 60 thousand) IOPS per drive.

This is the reason SSDs are killers in the DB space, and so much cheaper per IOPS than any SAS drive. A physical spinning disc just cannot keep up with the IOPS capabilities of an SSD.

Your SAS discs were a mediocre choice to start with for a heavily used database - they are too large to get a lot of IOPS for their capacity (more, smaller discs would mean a lot more IOPS) - but at the end of the day, SSDs are the game changer here.
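
Back-of-envelope comparison with the same rule-of-thumb figure of ~180 IOPS per 15k drive (illustrative only): in a 4-disc RAID 10, every write has to land on both halves of a mirror, so the array tops out at very roughly

    random reads:  4 x ~180      ≈ ~700 IOPS
    random writes: (4 x ~180)/2  ≈ ~360 IOPS

compared to tens of thousands of IOPS from a single enterprise SSD - which is why adding a few more spindles barely moves the needle, while switching media does.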

Regarding software / kernel: any decent database will do a lot of IOPS and flush its buffers. The log file needs to be WRITTEN for the basic ACID guarantees to hold. The only kernel tunes you could do would weaken your transactional integrity - partially you CAN get away with that. The RAID controller in write-back mode already does this - it confirms the write as flushed even if it is not - but it can do so because the BBU is assumed to save the day when power fails. Anything you do higher up in the kernel - better make sure you can live with the negative side effects.
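
A concrete example of that durability trade-off inside PostgreSQL itself (a sketch, not something prescribed above; the table and placeholders are the ones from the question): synchronous_commit can be switched off per transaction or per session, so a commit returns before its WAL record is flushed. A crash can then lose the most recently committed transactions (a window of roughly three times wal_writer_delay), but unlike turning off fsync it can never corrupt the database:

-- Relax commit durability only for the high-churn single-row updates.
BEGIN;
SET LOCAL synchronous_commit TO off;
UPDATE cars SET price = some_price, updated_at = some_time_stamp WHERE id = some_id;
COMMIT;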

In the end, databases need IOPS, and you may be surprised to see how tiny your setup is compared to some others here. I have seen databases with 100+ discs just to get the IOPS they needed. But really, today, you buy SSDs and go for size on them - they are so superior in IOPS capability that it makes no sense to fight this game with SAS drives.

And yes, your IOPS numbers do not look bad for the hardware; they are within what I would expect.