PostgreSQL – Many queries are slower on new hardware and newest version of PostgreSQL

optimization, performance, postgresql, postgresql-9.6, query-performance

Note: I have significantly re-worked this question based on the direction the commentators have taken me and what I have learned on the way.

I am setting up a new development database server. The new one is a 6th-gen i7-6700 with 16GB DDR4 memory and an Intel 750 SSD, running PostgreSQL 9.6. The old one is a 4th-gen i5-4570 with 16GB DDR3 memory and a Crucial SSD RAID 0 array, running PostgreSQL 9.2. I benchmarked the 750 and it is much faster than the RAID overall; however, its read performance on transfers below 256KB is slower.

I used pg_dump and pg_restore to move the data and I ran vacuum analyze on all databases.

The new computer should be quite a bit faster than the old one, and for some types of queries it is, but for others it is several times slower. I also installed 9.2 on the new computer to try to isolate the database version from the hardware. I kept postgresql.conf identical where possible (9.6 changes some WAL settings).

I set up pg_stat_statements to get some statistics to narrow down the problem. I have 500+ schemas and 3000+ tables. The transformations I am running are creating new tables from some of these tables and updating those new tables from these 3000 other tables. I am bouncing all over this 50GB of data.
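Roughly, the kind of query I am running against pg_stat_statements looks like this (a sketch only; it assumes the extension is already listed in shared_preload_libraries and created in the database):

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Slowest statements by average time; total_time is reported in milliseconds.
SELECT query,
       calls,
       total_time,
       total_time / calls AS avg_ms,
       shared_blks_hit,
       shared_blks_read,
       shared_blks_dirtied
FROM pg_stat_statements
ORDER BY total_time / calls DESC
LIMIT 20;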

On the new server with 9.6, create table, copy, vacuum, analyze, drop table, updates that affect many rows, create index, and create table as are usually faster, up to twice as fast. Occasionally, though, they are several times slower. The new server running 9.2 shows a similar pattern, but the performance gains are not nearly as pronounced. Interestingly, when 9.6 is slow on some particular copy or vacuum, 9.2 is usually similarly slow.

Drop index, drop table, select queries that return few rows, select count(*), and update queries that affect a single row are all usually 50%-300% slower on the new server with 9.6. 9.2 on the new server is also slower, but usually only by 20%-30%. I do not know exactly how PostgreSQL accesses the disk, but I suspect the Intel 750's slower performance on small reads is having an effect, as suggested by the slower 9.2 numbers. That does not, however, explain the 5x-10x performance degradation between 9.2 and 9.6 on the same server.

So my conclusion is 9.6 is significantly slower for certain types of queries.

I have run explain on many of the queries in question and the plans are identical.
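For a closer comparison than plan shape and costs alone, I can also run the queries with EXPLAIN (ANALYZE, BUFFERS) on both servers and compare the actual times and buffer counts, for example (a sketch using one of the queries listed below; the siteid value is just a placeholder):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM work_allustis_2016_q3.allustis_details
WHERE siteid = 12345;  -- placeholder siteid

The "actual time" and "Buffers: shared hit/read" lines are what I compare between the two servers, rather than the estimated costs.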

Here are some representative examples (from pg_stat_statements) of the queries that are slow. Note that the tables involved all have primary keys and indexes, have been vacuumed and analyzed, and are otherwise identical between the two databases. Also, I only selected queries that have at least 50 calls.

Update finalal Set jsondone=?, json=? Where id=? // 20% slower
SELECT * FROM archive_natrilis_2015_q3.natrilis_releases WHERE siteid = ? ORDER BY reporting_year ASC LIMIT ? // 404% slower
SELECT * FROM work_narcra00_2016_q3.narcra_handler WHERE siteid = ? ORDER BY handler_sequence_number ASC LIMIT ? // 182% slower
select * from listtypedesc // small 20 row table, 173% slower.
select * from INFORMATION_SCHEMA.schemata where schema_name not like ? and schema_name not like ? AND schema_name like ? Order by schema_name Desc // 52% slower
Select count(*) From work_allustis_2016_q3.allustis_details Where siteid = ? // 234% slower
SELECT * FROM archive_naerns14_2015_q2.naerns14_material_involved WHERE siteid = ?  LIMIT ? // 287% slower

Is my conclusion correct? And if so is there a 9.6 specific way to optimize these slow queries? If not, are there other debug actions to recommend?

Best Answer

You have changed too many things at once. First run 9.2 on your new hardware and check the performance. Then upgrade to 9.6 once you get that sorted out.

How did you move the data? Logically, by using pg_dump and then a restore, or physically by using pg_basebackup (or a cold copy) and then pg_upgrade?

Update finalal Set jsondone=1 Where id=1

The old server runs this at an average of 0.2 ms over 72 iterations. The new server averages 1.43 ms.

72 iterations of the same id=? value, or consecutive ones, or random ones? And all in one transaction, or each a separate one?

The new server is showing half the cost (but more startup costs for some reason) so I assume it would be faster, especially on faster hardware.

Those costs are not general time estimates; they are internal accounting used to make reasonable planner choices. Things for which there are no choices (there is only one way to update a row) are not costed at all, but they obviously take actual time to do. And even with that in mind, you have to use great caution comparing them across versions. They just aren't meant for that purpose.
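To illustrate: the cost numbers are expressed in multiples of abstract planner parameters such as the following, not in milliseconds, so a lower cost on one server is not a prediction of a shorter runtime there.

SHOW seq_page_cost;     -- 1.0 by default; the unit the other costs are scaled to
SHOW random_page_cost;  -- 4.0 by default
SHOW cpu_tuple_cost;    -- 0.01 by default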

I did notice that shared_blks_read and shared_blks_dirtied were about twice as high on the new server. shared_blks_hit was only slightly higher. Maybe there is a clue in that, but I do not really know what this means.

I think what this means is that the data on your new server is packed more tightly than it was on the old server, probably because you used pg_dump to move the data over rather than a physical copy. The default fillfactor for tables is 100%, so as many rows as will fit are packed into each page.

So on the old server, when you do an update there is room to put the new version of the updated row in the same page as the old version. On the new server there is no room on the page, so it has to seek out a different page to put the new version of the row on (and then both pages get dirtied), which also means it has to update the indexes so they know where to find the new version.
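If that extra write and index maintenance turns out to be the problem, one thing you could try (a sketch only, using the finalal table from your example) is lowering the fillfactor so updated row versions can stay on the same page and qualify as HOT updates, which avoid touching the indexes:

-- Leave ~10% of each heap page free so new row versions can stay on the same page.
ALTER TABLE finalal SET (fillfactor = 90);

-- Rewrite the table so existing pages honor the new fillfactor.
VACUUM FULL finalal;

-- Later, check how many updates were HOT (same-page, no index maintenance).
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'finalal';

Whether that helps depends on whether the updated columns are indexed; HOT updates only apply when no indexed column changes.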