I need to populate a foreign key column based on the contents of another table, and both methods I've tried are prohibitively slow.
I have the below tables (some columns omitted):
Table "public.interface_events"
Column | Type | Modifiers
----------------------------+---------+--------------------
device_id | integer | not null
interface_id | integer |
Foreign-key constraints:
"interface_events_device_id_fkey" FOREIGN KEY (device_id) REFERENCES device(id) ON UPDATE CASCADE ON DELETE RESTRICT
"interface_events_interface_id_fkey" FOREIGN KEY (interface_id) REFERENCES interface(id) ON UPDATE CASCADE ON DELETE RESTRICT
Table "public.device"
Column | Type | Modifiers
----------------------+-----------------------+-----------------------------------------------------
id | integer | not null default nextval('device_id_seq'::regclass)
interface_id | integer | not null
Indexes:
"device_pkey" PRIMARY KEY, btree (id)
"device_interface_id_idx" btree (interface_id)
Foreign-key constraints:
"device_interface_id_fkey" FOREIGN KEY (interface_id) REFERENCES interface(id) ON UPDATE CASCADE ON DELETE RESTRICT
Column interface_id in table interface_events needs updating from table device, so currently I have:
UPDATE interface_events SET interface_id = device.interface_id
FROM device
WHERE device.id = interface_events.device_id
I should point out that the relationship between device and interface_events is one to many, so there may be thousands of events per device. Is the UPDATE FROM method suitable in this situation? I'd earlier tried:
UPDATE interface_events SET interface_id = (SELECT device.interface_id FROM device WHERE device.id = interface_events.device_id);
But that took forever also.
There are 30 million rows in table interface_events, 32,000 in table device.
The update running currently (using the FROM syntax) has been active for 21 hours, although this is on a low spec test machine (2 x 3G cores, 3G RAM, spindle disk – it's disk I/O that's killing it, constant 99%).
It occures to me as I write I should have applied the foreign key constaint after the update, since I know it's integrity will be ok thanks to the column I'm copying from already being constrained, so I guess this question is more about the best query to use?
Best Answer
The fastest way will probably be to create a new table based on a join of the existing tables, create indexes and foreign keys constraints, and then drop the old "interface_events" and rename the new table into place. Views and foreign keys into this table can complicate this.
But the
interface_events.*
notation relies on interface_id not already existing in interface_events. Since it already exists, you would have to list out all columns in interface_events except for interface_id.