I'm trying to find if there would be anything violating the foreign key constraint when deleting a row, to check if I can delete it or not (can't use ON DELETE CASCADE
)
For that, I have a piece of code that will give me a list of all (table, column) referencing that row's id. Then all I want to know is if any of those (table, column) have my row's id there, which would prevent its deletion.
I found 3 ways to do that, and all producing similar results on EXPLAIN ANALYZE on a very large database. I'll paste here some real examples:
The first one is to make a UNION of each (table, column) and check if the id is there (if it's null at the end, there's no references):
(SELECT client_category_price.sellable_id
FROM client_category_price
WHERE client_category_price.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT commission_source.sellable_id
FROM commission_source
WHERE commission_source.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT loan_item.sellable_id
FROM loan_item
WHERE loan_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT product.sellable_id
FROM product
WHERE product.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT product_history.sellable_id
FROM product_history
WHERE product_history.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT purchase_item.sellable_id
FROM purchase_item
WHERE purchase_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT receiving_order_item.sellable_id
FROM receiving_order_item
WHERE receiving_order_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT returned_sale_item.sellable_id
FROM returned_sale_item
WHERE returned_sale_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT sale_item.sellable_id
FROM sale_item
WHERE sale_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT service.sellable_id
FROM service
WHERE service.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT stock_decrease_item.sellable_id
FROM stock_decrease_item
WHERE stock_decrease_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT transfer_order_item.sellable_id
FROM transfer_order_item
WHERE transfer_order_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT work_order.sellable_id
FROM work_order
WHERE work_order.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
UNION
(SELECT work_order_item.sellable_id
FROM work_order_item
WHERE work_order_item.sellable_id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1)
The second one is to make a cross join and return a "big tuple" containing those same ids, and then test if all of them are null:
SELECT client_category_price.sellable_id, commission_source.sellable_id, loan_item.sellable_id, product.sellable_id, product_history.sellable_id, purchase_item.sellable_id,
receiving_order_item.sellable_id, returned_sale_item.sellable_id, sale_item.sellable_id, service.sellable_id, stock_decrease_item.sellable_id, transfer_order_item.sellable_id,
work_order.sellable_id, work_order_item.sellable_id
FROM sellable
LEFT JOIN client_category_price ON client_category_price.sellable_id = sellable.id
LEFT JOIN commission_source ON commission_source.sellable_id = sellable.id
LEFT JOIN loan_item ON loan_item.sellable_id = sellable.id
LEFT JOIN product ON product.sellable_id = sellable.id
LEFT JOIN product_history ON product_history.sellable_id = sellable.id
LEFT JOIN purchase_item ON purchase_item.sellable_id = sellable.id
LEFT JOIN receiving_order_item ON receiving_order_item.sellable_id = sellable.id
LEFT JOIN returned_sale_item ON returned_sale_item.sellable_id = sellable.id
LEFT JOIN sale_item ON sale_item.sellable_id = sellable.id
LEFT JOIN service ON service.sellable_id = sellable.id
LEFT JOIN stock_decrease_item ON stock_decrease_item.sellable_id = sellable.id
LEFT JOIN transfer_order_item ON transfer_order_item.sellable_id = sellable.id
LEFT JOIN work_order ON work_order.sellable_id = sellable.id
LEFT JOIN work_order_item ON work_order_item.sellable_id = sellable.id
WHERE sellable.id = '9bc202ca-f7c1-11e2-a751-062b1fc90460' LIMIT 1
Note that the database is so big that if I don't put the LIMIT 1
at the end, the query would take a very long time to execute, but with the limit, it take as long as the UNION above.
And there's the third option: I'm actually executing this from python. So I could execute a COUNT on each of those UNION queries and if one of them returns something > 0, I already know there exists a reference for the row. On the best case I would execute one COUNT and at the worst case all of those COUNTS.
Since all those options produce very insignificant differences (both take +/- 0.300ms with about 1 million rows divided between those related tables), which one would be better to use? I mean, which would scale better?
Best Answer
I suggest your first option, with two improvements and some simplifications.
Given that all you want to know is
You don't need a full list of violating rows. Stop searching at the first one. All you need to do is add another
LIMIT 1
at the end of the query. This way, Postgres skips rest of the query as soon as the first row is found. You probably don't needLIMIT 1
for eachSELECT
, just the one at the end. Test without, it may produce different query plans.Use
UNION ALL
instead ofUNION
. Faster.Some other simplifications.
Related answer on SO: