Why Performance is Non-Linear with Many UPDATEs in Single Transaction – PostgreSQL

performance, postgresql, transaction

An older question covers some reasons why the performance of multiple INSERTS within a single transaction is non-linear as the INSERT count grows.

Following some of the advice there, I've been trying to optimise running many UPDATEs in a single transaction. In the real scenario, we're batch processing data from another system, but I've got a smaller scenario for testing.

Given this table on postgresql 9.5.1:

\d+ foo
                                         Table "public.foo"
 Column |  Type   |                    Modifiers                     | Storage | Stats target | Description 
--------+---------+--------------------------------------------------+---------+--------------+-------------
 id     | bigint  | not null default nextval('foo_id_seq'::regclass) | plain   |              | 
 count  | integer | not null                                         | plain   |              | 
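
For reference, this is roughly the DDL behind that output; the original setup isn't shown, so the starting row and its count value below are assumptions:

-- Hypothetical setup matching the \d+ output above (the original DDL isn't shown).
CREATE TABLE foo (
    id    bigserial,        -- bigint NOT NULL DEFAULT nextval('foo_id_seq'::regclass)
    count integer NOT NULL
);

-- Seed the single row that the UPDATE statements target (initial count assumed).
INSERT INTO foo (id, count) VALUES (1, 0);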

I have the following test files: 100.sql, 1000.sql, 10000.sql, 50000.sql and 100000.sql. Each contains the following lines, with UPDATE repeated according to the filename:

BEGIN;
UPDATE foo SET count=count+1 WHERE id=1;
...
UPDATE foo SET count=count+1 WHERE id=1;
COMMIT;  
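
The question doesn't show how the files were produced; one way to generate them from psql using generate_series (the file name and statement count here are illustrative; repeat for the other sizes):

-- Sketch: write 100.sql from psql; \t and \pset suppress headers and alignment
-- so that only the raw statements end up in the file.
\pset format unaligned
\t on
\o 100.sql
SELECT 'BEGIN;';
SELECT 'UPDATE foo SET count=count+1 WHERE id=1;' FROM generate_series(1, 100);
SELECT 'COMMIT;';
\o
\t off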

When I benchmark loading each file, the results look like this:

              user     system      total        real   ms/update
100       0.000000   0.010000   0.040000 (  0.044277)  0.44277
1000      0.000000   0.000000   0.040000 (  0.097175)  0.09717
10000     0.020000   0.020000   0.230000 (  1.717170)  0.17171
50000     0.160000   0.130000   1.840000 ( 30.991350)  0.61982
100000    0.440000   0.380000   5.320000 (149.199524)  1.49199

The average time per UPDATE increases as the transaction includes more statements, suggesting that the performance is non-linear.

The earlier question I linked to suggests that indexes can be an issue; however, this table has no indexes and contains only a single row.

Is this just a case of "that's how it works", or are there some settings I can tune to improve the situation?

Update

Based on a theory in the current answer, I've run an additional test. The table structure is the same, but the UPDATEs all change different rows. The input files now look like this:

BEGIN;
UPDATE foo SET count=count+1 WHERE id=1;
UPDATE foo SET count=count+1 WHERE id=2;
...
UPDATE foo SET count=count+1 WHERE id=n;
COMMIT; 
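
For this variant the table also needs rows with ids 1..n before the file is run; a sketch of the setup and of generating the varying statements (n = 100 here is illustrative):

-- Sketch: start from an empty table and seed rows 1..n.
TRUNCATE foo;
INSERT INTO foo (id, count) SELECT g, 0 FROM generate_series(1, 100) AS g;

-- The statements can be generated as before, varying the id instead of repeating id=1:
SELECT format('UPDATE foo SET count=count+1 WHERE id=%s;', g)
FROM generate_series(1, 100) AS g;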

When I benchmark loading these files, the results look like this:

              user     system      total        real   ms/update
100       0.000000   0.000000   0.030000 (  0.044876)  0.44876
1000      0.010000   0.000000   0.050000 (  0.102998)  0.10299
10000     0.000000   0.040000   0.140000 (  0.666050)  0.06660
50000     0.070000   0.140000   0.550000 (  3.150734)  0.06301
100000    0.130000   0.280000   1.110000 (  6.458655)  0.06458

From 10,000 UPDATEs and up (once setup costs are amortised) the performance is linear.

Best Answer

(Note: this test scenario is artificial, so it is not an appropriate way to evaluate PostgreSQL's general performance.)

I suspect this is caused by PostgreSQL's MVCC mechanism.

As is well known, PostgreSQL's MVCC does not overwrite rows in place: each UPDATE marks the old tuple version as dead and inserts a new one. I'll show a concrete example using the pageinspect extension bundled in the contrib subdirectory.
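
To reproduce the inspection below, the extension has to be installed in the database first; a minimal sketch:

-- pageinspect ships with the contrib modules and must be installed per database
-- before heap_page_items()/get_raw_page() can be used.
CREATE EXTENSION pageinspect;

-- In the output below, t_xmin is the xid of the inserting transaction, t_xmax the
-- xid of the deleting/updating transaction (0 = none yet), and t_ctid points to the
-- current version of the row (a tuple that is the latest version points to itself).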

First, I start a transaction and do the first UPDATE statement:

BEGIN;

SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid FROM heap_page_items(get_raw_page('foo', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid 
-------+--------+--------+-------+--------
     1 |   2755 |      0 |     0 | (0,1)
(1 row)

UPDATE foo SET count=count+1 WHERE id=1;

SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid FROM heap_page_items(get_raw_page('foo', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid 
-------+--------+--------+-------+--------
     1 |   2755 |   2756 |     0 | (0,2)
     2 |   2756 |      0 |     0 | (0,2)
(2 rows)

When updating the row, PostgreSQL reads the first tuple, updates its header fields (t_xmax and t_ctid), and then inserts the new (second) tuple.

Next, I do the second UPDATE statement:

UPDATE foo SET count=count+1 WHERE id=1;

SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid FROM heap_page_items(get_raw_page('foo', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid 
-------+--------+--------+-------+--------
     1 |   2755 |   2756 |     0 | (0,2)
     2 |   2756 |   2756 |     0 | (0,3)
     3 |   2756 |      0 |     1 | (0,3)
(3 rows)

After reading the first tuple, PostgreSQL follows its t_ctid field to the second tuple (0,2), then updates the second tuple's header fields and inserts the third tuple.

In this way, when many UPDATE statements are issued in a single transaction, PostgreSQL must read through the old tuple versions and update their header fields every time it inserts a new one.
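
One way to see this growth directly (a sketch, run inside the same open transaction): after k UPDATEs of the same row there are k+1 tuple versions on the page, chained together through their t_ctid fields.

-- Number of tuple versions accumulated on page 0 of foo.
SELECT count(*) AS tuple_versions
FROM heap_page_items(get_raw_page('foo', 0));

-- The t_ctid chain from the oldest version to the newest.
SELECT lp, t_ctid
FROM heap_page_items(get_raw_page('foo', 0))
ORDER BY lp;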

This is my hypothesis. A weakness of this hypothesis is that it predicts O(n^2) processing time, so it may be wrong (the benchmark results don't seem to fit O(n^2)).

In any case, doing many UPDATE statements in a single transaction is not a good approach, because it creates many pages that contain only dead tuples, so you have to run VACUUM FULL (not just VACUUM) to reclaim the space.
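
A rough sketch of checking and reclaiming that space once the transaction has committed (exact sizes will vary):

-- The old tuple versions remain in the table as dead tuples after COMMIT.
SELECT pg_size_pretty(pg_relation_size('foo')) AS size_before;

-- Plain VACUUM marks the dead tuples as reusable but normally does not shrink the
-- file; VACUUM FULL rewrites the table and returns the space to the operating system.
VACUUM FULL foo;

SELECT pg_size_pretty(pg_relation_size('foo')) AS size_after;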