The question is a unclear on how the tables are to actually related to each other.
Since the query you've posted is not working, it's difficult to tell which element of the logic contains the error, and with a column called "oldID" it's hard to understand the necessary join logic.
I have attempted to interpret and rewrite your query based on trying to understand it from the innermost query to the outermost query, it seems like this might be what you are trying to accomplish.
My table "ta" is doing the same thing as your table "ta" seems to be trying to do, and my table "a" is doing the same thing your table "a" seems to be trying to do. They are reversed in their order of appearance in the query because, as I mentioned, if I'm understanding your query correctly, the original query seems to be written inside-out.
UPDATE Taxonomy t
JOIN TaxonomyMapping ta ON ta.oldID = t.oldID
JOIN CategoryMapping cm ON cm.language_ID = ta.oldID
JOIN TaxonomyMapping a ON a.oldID = cm.en_id
SET t.sourceId = a.id
WHERE t.id > -1;
However, if your "TaxonomyMapping" and "Taxonomy" tables are an "exact copy" (containing the same data) then table "ta" is not necessary, and all you actually need is this:
UPDATE Taxonomy t
JOIN CategoryMapping cm ON cm.language_ID = t.oldID
JOIN TaxonomyMapping a ON a.oldID = cm.en_id
SET t.sourceId = a.id
WHERE t.id > -1;
In fact, it seems like you could even do it without a permanent copy.
UPDATE Taxonomy t
JOIN CategoryMapping cm ON cm.language_ID = t.oldID
JOIN (SELECT * FROM Taxonomy) x ON x.oldID = cm.en_ID
SET t.sourceId = x.id
WHERE t.id > -1;
That last example works because the server materializes a temporary copy of "Taxonomy" called table "x" before executing the rest of the query. By doing that, it's able to update the table with data selected from elsewhere in the same table, which you normally can't do... but since a copy is technically a different table, it works. The temporary copy is usually created in memory and then moved to disk if it turns out to be large, but either way, it's automatically deleted as soon as the query is finished. The down-side is that the new table won't have any indexes so this approach won't infinitely scale... but should still be faster than your original correlated subquery.
Of course, none of these will work if I have failed to understand the necessary logic.
OPTIMIZE TABLE
is essentially never needed for InnoDB.
New with 5.6: ALTER TABLE
uses ALGORITHM=INPLACE
for many actions (add/drop column/index, etc.). I would expect 5.6 to be faster.
What is that list of commands trying to do? Something to do with Normalization? Seems like it could be done in a couple of queries, without needing the extra table.
innodb_buffer_pool_size
should be set to about 70% of available RAM. That will improve performance on any version.
Don't index flags (deleted
), it will slow down INSERTs
, without speeding up SELECTs
.
Don't use 4-byte integers (INT
) for flags; use the 1-byte TINYINT
.
No currency in the world needs 10 decimal places: price decimal(23,10)
. And I doubt if room for 10 trillion 'dollars' is needed.
What's the value of autocommit during the UPDATEs
?
Best Answer
The Query cache is a hindrance. (In general, it should be OFF for production systems.) All entries in the QC for that table will be purged by the Update.
Millions of rows -- all of the table? A small fraction of a billion row table, but no good index? What?
Let's dissect the query...
Scan through the entire table. This takes time for the following reasons:
ROLLBACK
or crash)The size of "undo" may necessitate a less efficient method, adding to the overall time.
The affected rows will be locked; well you may as well plan on the 'entire' table being blocked from other usage.
Batching...
Why do you need to do such a big Update? Perhaps the column you are changing should be in a separate, parallel, table to avoid this much impact? Or 'computed' by subsequent
SELECTs
?Config changes? Assuming the innodb_buffer_pool_is big enough, nothing useful to do.
For chunking, see advice on chunking -- that talks about deleting, but can be adapted to updating.