Postgresql – Update table with values from another table runs really slow

join;postgresqlupdate

I have a PostgreSQL table (tab_A) with about 32 million records, and I have a second PostgreSQL table (tab_B) with about 4000 records. tab_B contains 3 fields whose values I intend to pass on to tab_A (field1, field2, field3).

tab_A

  • 32 million records

  • Unique identifier (id)

  • BTREE index

tab_B

  • 4000 records

  • Unique identifier (id)

  • BTREE index

I am trying to do the job with the following query (previous to do this I have already created the 3 new fields in tab_A to host the values):

UPDATE tab_A
SET field1 = t2.field1, field2 = t2.field2, field3 = t2.field3
FROM tab_A t1 JOIN tab_B t2
ON t1.uprn = t2.uprn;

This query runs on and on for more than 5 hours and I eventually need to stop it because it doesn't seem to me it should be taking that long (my understanding is that if both tables contain an index this should be pretty fast).

Any ideas on whether I am missing something here? Perhaps it's normal it takes that long taking into account tab_A contains 32 million records? Any other approach to run this more efficiently?

Best Answer

Don't repeat the target table in the FROM clause:

UPDATE tab_A
   SET field1 = t2.field1, field2 = t2.field2, field3 = t2.field3
FROM tab_B t2
WHERE tab_A.uprn = t2.uprn;

Quote from the manual

Note that the target table must not appear in the from_list, unless you intend a self-join

(emphasis mine)