I have a PostgreSQL table (tab_A) with about 32 million records, and I have a second PostgreSQL table (tab_B) with about 4000 records. tab_B contains 3 fields whose values I intend to pass on to tab_A (field1, field2, field3).
tab_A
-
32 million records
-
Unique identifier (id)
-
BTREE index
tab_B
-
4000 records
-
Unique identifier (id)
-
BTREE index
I am trying to do the job with the following query (previous to do this I have already created the 3 new fields in tab_A to host the values):
UPDATE tab_A
SET field1 = t2.field1, field2 = t2.field2, field3 = t2.field3
FROM tab_A t1 JOIN tab_B t2
ON t1.uprn = t2.uprn;
This query runs on and on for more than 5 hours and I eventually need to stop it because it doesn't seem to me it should be taking that long (my understanding is that if both tables contain an index this should be pretty fast).
Any ideas on whether I am missing something here? Perhaps it's normal it takes that long taking into account tab_A contains 32 million records? Any other approach to run this more efficiently?
Best Answer
Don't repeat the target table in the FROM clause:
Quote from the manual
(emphasis mine)