Why Performance is Non-Linear with Many UPDATEs in Single Transaction – PostgreSQL

performance, postgresql, transaction

An older question covers some reasons why the performance of multiple INSERTS within a single transaction is non-linear as the INSERT count grows.

Following some of the advice there, I've been trying to optimise running many UPDATEs in a single transaction. In the real scenario, we're batch processing data from another system, but I've got a smaller scenario for testing.

Given this table on postgresql 9.5.1:

\d+ foo
                                         Table "public.foo"
 Column |  Type   |                    Modifiers                     | Storage | Stats target | Description 
--------+---------+--------------------------------------------------+---------+--------------+-------------
 id     | bigint  | not null default nextval('foo_id_seq'::regclass) | plain   |              | 
 count  | integer | not null                                         | plain   |              | 
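
For reference, this is roughly the DDL behind that output; the original setup isn't shown, so the starting row and its count value below are assumptions:

-- Hypothetical setup matching the \d+ output above (the original DDL isn't shown).
CREATE TABLE foo (
    id    bigserial,        -- bigint NOT NULL DEFAULT nextval('foo_id_seq'::regclass)
    count integer NOT NULL
);

-- Seed the single row that the UPDATE statements target (initial count assumed).
INSERT INTO foo (id, count) VALUES (1, 0);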

I have the following test files: 100.sql, 1000.sql, 10000.sql, 50000.sql and 100000.sql. Each contains the following lines, with UPDATE repeated according to the filename:

BEGIN;
UPDATE foo SET count=count+1 WHERE id=1;
...
UPDATE foo SET count=count+1 WHERE id=1;
COMMIT;  
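
The question doesn't show how the files were produced; one way to generate them from psql using generate_series (the file name and statement count here are illustrative; repeat for the other sizes):

-- Sketch: write 100.sql from psql; \t and \pset suppress headers and alignment
-- so that only the raw statements end up in the file.
\pset format unaligned
\t on
\o 100.sql
SELECT 'BEGIN;';
SELECT 'UPDATE foo SET count=count+1 WHERE id=1;' FROM generate_series(1, 100);
SELECT 'COMMIT;';
\o
\t off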

When I benchmark loading each file, the results look like this:

              user     system      total        real   ms/update
100       0.000000   0.010000   0.040000 (  0.044277)  0.44277
1000      0.000000   0.000000   0.040000 (  0.097175)  0.09717
10000     0.020000   0.020000   0.230000 (  1.717170)  0.17171
50000     0.160000   0.130000   1.840000 ( 30.991350)  0.61982
100000    0.440000   0.380000   5.320000 (149.199524)  1.49199

The average time per UPDATE increases as the transaction includes more statements, suggesting that the performance is non-linear.

The earlier question I linked to suggests that indexes can be an issue; however, this table has no indexes and contains only a single row.

Is this just a case of "that's how it works", or are there some settings I can tune to improve the situation?

Update

Based on a theory in the current answer, I've run an additional test. The table structure is the same, but the UPDATEs all change different rows. The input files now look like this:

BEGIN;
UPDATE foo SET count=count+1 WHERE id=1;
UPDATE foo SET count=count+1 WHERE id=2;
...
UPDATE foo SET count=count+1 WHERE id=n;
COMMIT; 
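
For this variant the table also needs rows with ids 1..n before the file is run; a sketch of the setup and of generating the varying statements (n = 100 here is illustrative):

-- Sketch: start from an empty table and seed rows 1..n.
TRUNCATE foo;
INSERT INTO foo (id, count) SELECT g, 0 FROM generate_series(1, 100) AS g;

-- The statements can be generated as before, varying the id instead of repeating id=1:
SELECT format('UPDATE foo SET count=count+1 WHERE id=%s;', g)
FROM generate_series(1, 100) AS g;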

When I benchmark loading these files, the results look like this:

              user     system      total        real   ms/update
100       0.000000   0.000000   0.030000 (  0.044876)  0.44876
1000      0.010000   0.000000   0.050000 (  0.102998)  0.10299
10000     0.000000   0.040000   0.140000 (  0.666050)  0.06660
50000     0.070000   0.140000   0.550000 (  3.150734)  0.06301
100000    0.130000   0.280000   1.110000 (  6.458655)  0.06458

From 10,000 UPDATEs and up (once setup costs are amortised) the performance is linear.

Best Answer

(Note: this test scenario is artificial, so it is not an appropriate way to evaluate PostgreSQL's general performance.)

I suspect this is caused by PostgreSQL's MVCC mechanism.

As is well known, PostgreSQL's MVCC does not overwrite rows in place: each UPDATE marks the old tuple version as dead and inserts a new one. I'll show a concrete example using the pageinspect extension bundled in the contrib subdirectory.
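
To reproduce the inspection below, the extension has to be installed in the database first; a minimal sketch:

-- pageinspect ships with the contrib modules and must be installed per database
-- before heap_page_items()/get_raw_page() can be used.
CREATE EXTENSION pageinspect;

-- In the output below, t_xmin is the xid of the inserting transaction, t_xmax the
-- xid of the deleting/updating transaction (0 = none yet), and t_ctid points to the
-- current version of the row (a tuple that is the latest version points to itself).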

First, I start a transaction and do the first UPDATE statement:

BEGIN;

SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid FROM heap_page_items(get_raw_page('foo', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid 
-------+--------+--------+-------+--------
     1 |   2755 |      0 |     0 | (0,1)
(1 row)

UPDATE foo SET count=count+1 WHERE id=1;

SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid FROM heap_page_items(get_raw_page('foo', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid 
-------+--------+--------+-------+--------
     1 |   2755 |   2756 |     0 | (0,2)
     2 |   2756 |      0 |     0 | (0,2)
(2 rows)

When updating the row, PostgreSQL reads the first tuple, updates its header fields (t_xmax and t_ctid), and then inserts the new (second) tuple.

Next, I do the second UPDATE statement:

UPDATE foo SET count=count+1 WHERE id=1;

SELECT lp as tuple, t_xmin, t_xmax, t_field3 as t_cid, t_ctid FROM heap_page_items(get_raw_page('foo', 0));
 tuple | t_xmin | t_xmax | t_cid | t_ctid 
-------+--------+--------+-------+--------
     1 |   2755 |   2756 |     0 | (0,2)
     2 |   2756 |   2756 |     0 | (0,3)
     3 |   2756 |      0 |     1 | (0,3)
(3 rows)

After reading the first tuple, PostgreSQL follows its t_ctid field to the second tuple (0,2), then updates the second tuple's header fields and inserts the third tuple.

In this way, when many UPDATE statements are issued in a single transaction, PostgreSQL must read through the old tuple versions and update their header fields every time it inserts a new one.
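
One way to see this growth directly (a sketch, run inside the same open transaction): after k UPDATEs of the same row there are k+1 tuple versions on the page, chained together through their t_ctid fields.

-- Number of tuple versions accumulated on page 0 of foo.
SELECT count(*) AS tuple_versions
FROM heap_page_items(get_raw_page('foo', 0));

-- The t_ctid chain from the oldest version to the newest.
SELECT lp, t_ctid
FROM heap_page_items(get_raw_page('foo', 0))
ORDER BY lp;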

This is my hypothesis. A weakness of this hypothesis is that it predicts O(n^2) processing time, so it may be wrong (the benchmark results don't seem to fit O(n^2)).

In any case, doing many UPDATE statements in a single transaction is not a good approach, because it creates many pages that contain only dead tuples, so you have to run VACUUM FULL (not just VACUUM) to reclaim the space.
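
A rough sketch of checking and reclaiming that space once the transaction has committed (exact sizes will vary):

-- The old tuple versions remain in the table as dead tuples after COMMIT.
SELECT pg_size_pretty(pg_relation_size('foo')) AS size_before;

-- Plain VACUUM marks the dead tuples as reusable but normally does not shrink the
-- file; VACUUM FULL rewrites the table and returns the space to the operating system.
VACUUM FULL foo;

SELECT pg_size_pretty(pg_relation_size('foo')) AS size_after;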