Your statement seems to be correct. It could also be written like this:
DELETE ii
FROM inventoryitems AS ii
WHERE ii.itemid = 2340000
AND EXISTS
( SELECT *
FROM characters AS c
WHERE c.characterid = ii.characterid
AND EXISTS
( SELECT *
FROM accounts AS a
WHERE a.id = c.accountid
AND a.banned = 1
)
) ;
One thing that may be causing this is if you have a character related to many accounts and one of them has banned = 1
while the other have banned = 0
. I assume you want the deletion to happen (not with just one but) only if all the related accounts have banned = 1
. We can modify the above code to:
DELETE ii
FROM inventoryitems AS ii
WHERE ii.itemid = 2340000
AND EXISTS
( SELECT *
FROM characters AS c
WHERE c.characterid = ii.characterid
AND EXISTS
( SELECT *
FROM accounts AS a
WHERE a.id = c.accountid
AND a.banned = 1
)
AND NOT EXISTS
( SELECT *
FROM accounts AS a
WHERE a.id = c.accountid
AND a.banned = 0
)
) ;
or simpler to:
DELETE ii
FROM inventoryitems AS ii
WHERE ii.itemid = 2340000
AND EXISTS
( SELECT *
FROM characters AS c
JOIN accounts AS a
ON a.id = c.accountid
WHERE c.characterid = ii.characterid
HAVING MIN(a.banned) = 1
) ;
After the clarifications, all the above are void. The problem was that characters
table does not have characterid
column but only id
. So the statement used by the OP was translated/parsed as:
DELETE FROM inventoryitems
WHERE characterid IN
(SELECT inventoryitems.characterid -- notice this
from characters
WHERE accountid IN
(SELECT id
from accounts
WHERE banned = '1'
)
)
AND itemid = '2340000';
which makes the subquery uncorrelated and means "delete all rows with itemid = 2340000
" if there exists at least one row (any row, not necessarily related) in the accounts with banned=1
"
That's one reason why it's good to always (*) write columns with their full name as tablename.columnname
or tablealias.columnname
(an error would have been thrown if you had done this and problem would have been solved faster.)
(*) Unless one wants this behaviour to occur, which is a rather extreme case.
In MySQL there is a multi-table DELETE syntax. Your first DELETE will delete rows only from the players
TABLE. If you want to delete from multiple tables you have to use something like:
DELETE FROM "players","stats","photos"
USING "players"
LEFT JOIN "stats"
ON "players".id = "stats".player_id
LEFT JOIN "photos"
ON "players".id = "photos".player_id
WHERE "players".born_on < "1950-01-01"
This doesn't address the problem with the long-running DELETE statement though. In fact above query should take even more time, because now it actually would delete rows from stats
and photos
tables. The workaround you could use is to split the large DELETE into smaller ones. Since you have a nice WHERE condition, you could manually split the deletes on that (for example one DELETE for each ten years of players.born_on
) and run them in ascending order, that is:
DELETE FROM "players","stats","photos"
USING "players"
LEFT JOIN "stats"
ON "players".id = "stats".player_id
LEFT JOIN "photos"
ON "players".id = "photos".player_id
WHERE "players".born_on < "1930-01-01";
DELETE FROM "players","stats","photos"
USING "players"
LEFT JOIN "stats"
ON "players".id = "stats".player_id
LEFT JOIN "photos"
ON "players".id = "photos".player_id
WHERE "players".born_on < "1940-01-01";
DELETE FROM "players","stats","photos"
USING "players"
LEFT JOIN "stats"
ON "players".id = "stats".player_id
LEFT JOIN "photos"
ON "players".id = "photos".player_id
WHERE "players".born_on < "1950-01-01";
It this is too coarse (i.e. it takes too long do execute each query) you should make the WHERE conditions even more fine-grained (perhaps delete one year each chunk).
Also there is a --purge
option for pt-archiver from Percona Toolkit which would split the data to be deleted in chunks automatically, but it doesn't seem to support the multi table case. See example usage of pt-archiver in this presentation
From your table definitions I see that you don't have an index on players.birth_date
(I suppose that this is the column you relate to as born_at
in your example queries). This makes the decade chunks approach useless, since every query would have to scan all players
table.
If you can't afford having a long table lock for the DELETE to finish, you most likely can't afford to create an index on the birth_date
column as well.
You could split the data on another column, PRIMARY KEY is a good bet. You can write a script which would process all the players in chunks of 10000 (or less or more, depending on the length of a single DELETE statement):
DELETE FROM "players","stats","photos"
USING "players"
LEFT JOIN "stats"
ON "players".id = "stats".player_id
LEFT JOIN "photos"
ON "players".id = "photos".player_id
WHERE "players".born_on < "1950-01-01"
AND "players".id BETWEEN n*10000+1 AND (n+1)*10000
Where n would be a parameter ranging from 0 to MAX(players.id)/10000 . This way you will avoid a full table scan (which certainly is painful for a 100M table)
You could also try to estimate the DELETE complexity with an EXPLAIN SELECT instead of DELETE:
EXPLAIN SELECT *
FROM "players"
LEFT JOIN "stats"
ON "players".id = "stats".player_id
LEFT JOIN "photos"
ON "players".id = "photos".player_id
WHERE "players".born_on < "1950-01-01"
AND "players".id BETWEEN 1 AND 10000
Best Answer
Yes, you can use
IN()
in theWHERE
clause of aDELETE
statement.There is no "
IN
clause" --IN()
is just a comparison operator, perhaps slightly disguised as something more sophisticated.http://dev.mysql.com/doc/refman/5.6/en/comparison-operators.html#function_in
To illustrate what
IN()
is really doing, consider these examples in isolation....and, as @ypercube has pointed out, the "
where_condition
is an expression that evaluates to true for each row to be deleted".