Postgresql – Query between tables to delete similar records from another table

deletepostgresqlsubstring

I am writing an SQL query on two tables products01 and products02 in a PostgreSQL database.

For the entire set of records in products02 take the substring (first 10 characters of volumeId and use those substrings to find all the corresponding records in products01. There can be more than one hit in products01, for instance finding older legacy record of the same item, just a different edition. The eleventh character of volumeId makes it unique.

For instance, in products01 I can have volumeId's like:

record45671
record45672
record45673

And in products02 I would be using substring(volumeId for 10) or 'record456' to find those similar records in products01.

Also I'd like to display the results so I can verify my query is accurate.

Here is my initial attempt:

    SELECT substring(volumeId, for 10) FROM temporary.products01 t01
    WHERE EXISTS (
        SELECT volumeId FROM temporary.products02 t02
        WHERE t01.volumeId LIKE t02.volumeId);

Best Answer

To find all rows from table products01 where the 10 leading characters of volumeId match any products02.volumeId:

SELECT t1.*
FROM   products01 t1
JOIN   products02 t2 ON left(t1.volumeId, 10) = t2.volumeId;

To delete:

DELETE FROM products01 t1
USING  products02 t2
WHERE  left(t1.volumeId, 10) = t2.volumeId;

You can use this equivalent syntax to get both commands more in line:

SELECT DISTINCT t1.* FROM products01 t1, -- see below
-- DELETE FROM products01 t1 USING  
products02 t2
WHERE  left(t1.volumeId, 10) = t2.volumeId;

Only add DISTINCT if t2.volumeId is not fully deterministic (t1.volumeId can match multiple t2.volumeId).

Now you can either select the whole statement to select or starting with DELETE to delete. See: