After your query runs, you'll have a set of copied rows (with configid=41) and an identical set of pasted rows (except for the configid=76 and the auto-created id).
Since these ids are not known in advance, you'll need another way to identify rows of the config table, e.g. a unique key (besides the auto-incrementing one), so you can match (join) the newly created rows with the old ones.
If, for example, (configid, optionname) is unique, then the following would work:
INSERT INTO pricing
( relid, price, ... ) -- relid and all the other columns,
-- except any autoincrement you may have
SELECT pasted.id, p.price, .... -- and the same columns here
FROM
pricing AS p
JOIN
tblproductconfigoptionssub AS copied
ON copied.id = p.relid
AND copied.configid = 41
JOIN
tblproductconfigoptionssub AS pasted
ON pasted.optionname = copied.optionname
AND pasted.configid = 76 ;
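To make the matching step concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere. The two-column pricing table, the sample option names and the helper names copy_pricing and demo are assumptions for illustration only; your real tables have more columns.

```python
import sqlite3


def copy_pricing(conn, src_configid, dst_configid):
    """Copy pricing rows from the source config's options to the pasted ones.

    Old and new option rows are matched on optionname, which is assumed to be
    unique within a configid.
    """
    conn.execute(
        """
        INSERT INTO pricing (relid, price)
        SELECT pasted.id, p.price
        FROM pricing AS p
        JOIN tblproductconfigoptionssub AS copied
          ON copied.id = p.relid AND copied.configid = ?
        JOIN tblproductconfigoptionssub AS pasted
          ON pasted.optionname = copied.optionname AND pasted.configid = ?
        """,
        (src_configid, dst_configid),
    )


def demo():
    conn = sqlite3.connect(":memory:")
    # Assumed, heavily simplified schema.
    conn.executescript(
        """
        CREATE TABLE tblproductconfigoptionssub (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            configid INTEGER NOT NULL,
            optionname TEXT NOT NULL,
            UNIQUE (configid, optionname)
        );
        CREATE TABLE pricing (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            relid INTEGER NOT NULL,
            price REAL NOT NULL
        );
        INSERT INTO tblproductconfigoptionssub (configid, optionname)
        VALUES (41, 'small'), (41, 'large');
        INSERT INTO pricing (relid, price) VALUES (1, 5.0), (2, 9.0);
        -- The paste step: same option names, new configid, new ids.
        INSERT INTO tblproductconfigoptionssub (configid, optionname)
        SELECT 76, optionname
        FROM tblproductconfigoptionssub
        WHERE configid = 41;
        """
    )
    copy_pricing(conn, 41, 76)
    return conn.execute(
        """
        SELECT s.configid, s.optionname, p.price
        FROM pricing AS p
        JOIN tblproductconfigoptionssub AS s ON s.id = p.relid
        ORDER BY s.configid, s.optionname
        """
    ).fetchall()
```

Calling demo() returns one priced row per option for both configid 41 and configid 76, which is a quick way to check that every pasted option found its original.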
In MySQL there is a multi-table DELETE syntax. Your first DELETE will delete rows only from the players table. If you want to delete from multiple tables you have to use something like:
DELETE FROM players, stats, photos
USING players
LEFT JOIN stats
ON players.id = stats.player_id
LEFT JOIN photos
ON players.id = photos.player_id
WHERE players.born_on < '1950-01-01'
This doesn't address the problem with the long-running DELETE statement though. In fact, the above query should take even more time, because now it would actually delete rows from the stats and photos tables as well. The workaround you could use is to split the large DELETE into smaller ones. Since you have a nice WHERE condition, you could manually split the deletes on that (for example one DELETE for each ten years of players.born_on) and run them in ascending order, that is:
DELETE FROM players, stats, photos
USING players
LEFT JOIN stats
ON players.id = stats.player_id
LEFT JOIN photos
ON players.id = photos.player_id
WHERE players.born_on < '1930-01-01';
DELETE FROM players, stats, photos
USING players
LEFT JOIN stats
ON players.id = stats.player_id
LEFT JOIN photos
ON players.id = photos.player_id
WHERE players.born_on < '1940-01-01';
DELETE FROM players, stats, photos
USING players
LEFT JOIN stats
ON players.id = stats.player_id
LEFT JOIN photos
ON players.id = photos.player_id
WHERE players.born_on < '1950-01-01';
If this is too coarse (i.e. it takes too long to execute each query), you should make the WHERE conditions even more fine-grained (perhaps delete one year per chunk).
There is also a --purge option for pt-archiver from Percona Toolkit, which would split the data to be deleted into chunks automatically, but it doesn't seem to support the multi-table case. See example usage of pt-archiver in this presentation.
From your table definitions I see that you don't have an index on players.birth_date (I suppose that this is the column you refer to as born_on in your example queries). This makes the decade-chunks approach useless, since every query would have to scan the whole players table.
If you can't afford a long table lock while the DELETE finishes, you most likely can't afford to create an index on the birth_date column either.
You could split the data on another column; the PRIMARY KEY is a good bet. You can write a script which would process all the players in chunks of 10000 (or fewer or more, depending on how long a single DELETE statement takes):
DELETE FROM players, stats, photos
USING players
LEFT JOIN stats
ON players.id = stats.player_id
LEFT JOIN photos
ON players.id = photos.player_id
WHERE players.born_on < '1950-01-01'
AND players.id BETWEEN n*10000+1 AND (n+1)*10000
Here n is a parameter ranging from 0 to MAX(players.id)/10000. This way each statement is limited to a primary key range and you avoid a full table scan (which certainly is painful for a 100M-row table).
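A rough sketch of such a script, again in Python with the standard-library sqlite3 module standing in for MySQL. SQLite has no multi-table DELETE, so this sketch removes the child rows first and then the players, inside one short transaction per chunk; against MySQL you would instead run the multi-table DELETE shown above with the same id bounds. The function names id_chunks and delete_in_chunks are made up for illustration, and the players(id, born_on), stats(player_id) and photos(player_id) columns are assumed.

```python
import sqlite3


def id_chunks(max_id, chunk_size=10000):
    """Yield inclusive (low, high) id ranges that cover 1..max_id."""
    for low in range(1, max_id + 1, chunk_size):
        yield low, min(low + chunk_size - 1, max_id)


def delete_in_chunks(conn, cutoff, chunk_size=10000):
    """Delete players born before `cutoff`, plus their stats and photos.

    Works through the primary key in ranges so that each transaction stays
    short. Returns the number of players rows deleted.
    """
    max_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM players"
    ).fetchone()[0]
    # Players in the current id range that match the date condition.
    doomed = "SELECT id FROM players WHERE born_on < ? AND id BETWEEN ? AND ?"
    deleted = 0
    for low, high in id_chunks(max_id, chunk_size):
        params = (cutoff, low, high)
        with conn:  # commits (or rolls back) once per chunk
            conn.execute(
                f"DELETE FROM stats WHERE player_id IN ({doomed})", params
            )
            conn.execute(
                f"DELETE FROM photos WHERE player_id IN ({doomed})", params
            )
            cur = conn.execute(
                "DELETE FROM players "
                "WHERE born_on < ? AND id BETWEEN ? AND ?",
                params,
            )
            deleted += cur.rowcount
    return deleted
```

Committing after every chunk keeps locks short and lets you stop and resume the script; the chunk size is the knob to turn if a single chunk still takes too long.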
You could also try to estimate the DELETE complexity with an EXPLAIN SELECT instead of DELETE:
EXPLAIN SELECT *
FROM players
LEFT JOIN stats
ON players.id = stats.player_id
LEFT JOIN photos
ON players.id = photos.player_id
WHERE players.born_on < '1950-01-01'
AND players.id BETWEEN 1 AND 10000
Best Answer
It very much depends on the details of your database.
Factors include:
If I needed to delete most of the records in a table, I have definitely had times where it was faster to copy the records to be kept to another table, TRUNCATE the first table, and then copy those records back to the first table (which keeps all its indexes, constraints, etc.; the other table would usually be stripped down to just the raw data). That said, if you're keeping notably more records than you're deleting, and if you don't have a lot of indexes on the original table, it's entirely possible that deleting would be faster.
NOTE: If you don't need to keep all columns, then the INSERT method is almost certainly your best bet.
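The copy, empty, copy-back approach can be sketched as follows, once more in Python with sqlite3 as a stand-in. SQLite has no TRUNCATE, so an unconditional DELETE takes its place here; in MySQL you would use TRUNCATE TABLE, which causes an implicit commit and so cannot be rolled back the way this sketch can. The helper name rebuild_keeping and the scratch table keep_rows are hypothetical, and the table name and condition are interpolated into the SQL, so they must come from trusted code, never from user input.

```python
import sqlite3


def rebuild_keeping(conn, table, keep_where):
    """Keep only the rows of `table` matching `keep_where`.

    Copies the survivors to a scratch table, empties the original, then
    copies them back. `table` and `keep_where` are trusted SQL fragments.
    """
    with conn:  # one transaction in SQLite
        # 1. Stash the rows to keep in a bare scratch table (no indexes).
        conn.execute(
            f"CREATE TEMP TABLE keep_rows AS "
            f"SELECT * FROM {table} WHERE {keep_where}"
        )
        # 2. Empty the original table (TRUNCATE TABLE in MySQL).
        conn.execute(f"DELETE FROM {table}")
        # 3. Copy the survivors back; the original keeps its indexes
        #    and constraints.
        conn.execute(f"INSERT INTO {table} SELECT * FROM keep_rows")
        conn.execute("DROP TABLE keep_rows")
```

Whether this beats a plain DELETE depends on the factors above, so it is worth timing both on a copy of the real data.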