Postgresql – Could someone explain bizarre behavior executing millions of UPDATES

postgresql

Could someone explain this behavior to me? I ran the following on Postgres 9.3, running natively on OS X. I was trying to simulate some behavior where the index size could grow much larger than the table size, and instead found something even more bizarre.

CREATE TABLE test(id int);
CREATE INDEX test_idx ON test(id);

CREATE FUNCTION test_index(batch_size integer, total_batches integer) RETURNS void AS $$
DECLARE
  current_id integer := 1;
BEGIN
FOR i IN 1..total_batches LOOP
  INSERT INTO test VALUES (current_id);
  FOR j IN 1..batch_size LOOP
    UPDATE test SET id = current_id + 1 WHERE id = current_id;
    current_id := current_id + 1;
  END LOOP;
END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT test_index(500, 10000);

I let this run for about an hour on my local machine before I started getting disk-space warnings from OS X. I noticed that Postgres was writing to my local disk at about 10 MB/s, and that the database had consumed a grand total of 30 GB. I ended up cancelling the query. Even so, Postgres did not return the disk space, and when I queried the database for usage statistics I got the following result:

test=# SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_relation_size(C.oid)) AS "size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY pg_relation_size(C.oid) DESC
  LIMIT 20;

           relation            |    size
-------------------------------+------------
 public.test                   | 17 GB
 public.test_idx               | 14 GB

However, selecting from the table yielded no results.

test=# select * from test limit 1;
 id
----
(0 rows)

Running 10,000 batches of 500 is 5,000,000 updates, which should yield a pretty small table/index size (on the order of megabytes). I suspect that Postgres is keeping a new version of each row for every INSERT/UPDATE that happens inside the function, but this seems strange. The entire function runs in a single transaction, and the table was empty to start.
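
As a sanity check on that suspicion, the ctid and xmin system columns make the versioning visible on a throwaway table (the table name below is just for illustration, not part of the test):

CREATE TABLE mvcc_demo(id int);
INSERT INTO mvcc_demo VALUES (1);
SELECT ctid, xmin, id FROM mvcc_demo;  -- physical location and creating transaction of the tuple
UPDATE mvcc_demo SET id = 2;
SELECT ctid, xmin, id FROM mvcc_demo;  -- ctid and xmin change: a new row version was written
DROP TABLE mvcc_demo;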

Any thoughts on why I'm seeing this behavior?

Specifically, I have two questions: why has this space not yet been reclaimed by the database, and why did the database require this much space in the first place? 30 GB seems like a lot even when accounting for MVCC.

Best Answer

Short version

Your algorithm looks O(n*m) at first glance (n = batch_size, m = total_batches), but it effectively grows as O(n * m^2), because all rows end up with the same id and every UPDATE therefore rewrites every row inserted so far. Instead of 5M row versions, you are generating more than 25 billion.

Long version

Your function runs inside a single implicit transaction. That's why you see no data after cancelling the query, and also why Postgres has to keep a distinct version of every inserted/updated tuple for the duration of both loops. After the rollback those versions are all dead, but the space is only made reusable by VACUUM, and it is only reliably returned to the operating system by VACUUM FULL (or by dropping/truncating the table), which is why the 30 GB was not handed back automatically.
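
A rough way to confirm where the space went and to get it back (assuming the table and index names from the question; the dead-tuple figure from the statistics collector is only an estimate):

SELECT n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'test';          -- expect ~0 live tuples and a huge dead-tuple estimate

VACUUM VERBOSE test;              -- marks the dead space as reusable
VACUUM FULL test;                 -- rewrites the table and its index, returning the space to the OS

SELECT pg_size_pretty(pg_relation_size('test')),
       pg_size_pretty(pg_relation_size('test_idx'));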

Additionally, I suspect there is a bug in your logic, or at least that you are underestimating the number of updates being made.

First iteration of the outer loop: current_id starts at 1, one row is inserted, and then the inner loop updates that same row 500 times, leaving the single row with an id of 501 and current_id at 501. Because the transaction is not finished, all 501 versions of that row (the insert plus 500 updates) are still kept.

Second iteration of the outer loop: current_id is 501, so a new row is inserted with id 501. Now you have 2 rows with the same "id", and 502 versions in total (501 of the first one, 1 of the second one). Then the inner loop updates BOTH rows 500 times, creating 1,000 new versions, getting to 1,502 tuple versions so far...

Third iteration of the outer loop: current_id is 1001, and a new row is inserted with id 1001. You have 3 rows, all with the same "id" 1001, and 1,503 row versions so far. Then you perform 500 updates of 3 rows each, creating 1,500 new versions, now 3,003...

...

(If your disk space had allowed it) the 10,000th iteration of the outer loop would perform 500 updates of each of 10,000 rows, i.e. 5M updates in that iteration alone.
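
If you want to watch this happen, a lightly instrumented variant of the function (the name test_index_traced and the NOTICE output are just for illustration, not part of the original code) shows the row count of each UPDATE growing with every batch:

CREATE FUNCTION test_index_traced(batch_size integer, total_batches integer) RETURNS void AS $$
DECLARE
  current_id integer := 1;
  rows_hit   bigint;
BEGIN
  FOR i IN 1..total_batches LOOP
    INSERT INTO test VALUES (current_id);
    FOR j IN 1..batch_size LOOP
      UPDATE test SET id = current_id + 1 WHERE id = current_id;
      GET DIAGNOSTICS rows_hit = ROW_COUNT;  -- how many rows this single UPDATE touched
      current_id := current_id + 1;
    END LOOP;
    -- rows_hit equals i: every UPDATE rewrites every row inserted so far
    RAISE NOTICE 'batch %: each UPDATE now rewrites % rows', i, rows_hit;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

-- e.g. SELECT test_index_traced(500, 20);  -- small run, prints 1, 2, 3, ..., 20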

As you can see, instead of your expected 5M updates, a full run would perform 500 + 1,000 + 1,500 + ... + 4,999,500 + 5,000,000 updates (plus the inserts), which is 500 * (1+2+3+...+9999+10000), over 25 billion updates. And of course a row version is not just the size of your int; it carries roughly 24 bytes of tuple header plus other bookkeeping, which is why the table and index had already swallowed some 30 GB by the time you cancelled, long before the function could finish.
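
A quick back-of-the-envelope check of that total (pure arithmetic, no table access needed):

SELECT 500::bigint * (10000::bigint * 10001 / 2) AS total_updates;
-- total_updates = 25002500000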
