There is no direct way to tell MySQL to work on a single query "in chunks," since SQL is (primarily, at least) a declarative language ("platform, I need you to accomplish this end result") not a procedural one ("platform, I need you to process this data using these steps: for each...").
Memory might be your issue -- which storage engine do these tables use? ...but really, it seems like there are some more likely possibilities:
According to the EXPLAIN SELECT you posted, no index is being used to join the maps table (key is NULL), even though the mapExpireTime index is available. When the optimizer determines that the index doesn't have sufficient selectivity, it won't be used -- and the more records in the maps table your @mapExpireTime would not eliminate, the less likely the optimizer is to select that index; it will instead opt for a full table scan, which may explain why performance falls off. On shorter date ranges, does the index appear in key for the maps table in EXPLAIN? If so, that's the short answer to "why does it slow down?"
It's possible that ANALYZE TABLE maps; might build some better index stats and improve the query performance. Conventional wisdom is that the constant "shiftiness" of InnoDB index stats makes this assertion untrue, but I've seen too many times where the stats on an InnoDB table are in a state that biases the optimizer towards a phenomenally bad query plan and ANALYZE TABLE cleans it right up. (I think the reason the conventional wisdom on this is so widespread is the fact that SHOW TABLE STATUS actually triggers a random index stats dive, so the act of looking makes what you intended to look at different than what it was before you looked, but I digress).
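A minimal sketch of the stats refresh described above, assuming the table is named maps as in the question. The original query is not reproduced here, so the last step is left as a comment rather than guessed at.

```sql
-- Rebuild the index statistics for the maps table.
ANALYZE TABLE maps;

-- Inspect the per-index cardinality estimates the optimizer works from.
-- These are estimates, not exact counts.
SHOW INDEX FROM maps;

-- Then re-run EXPLAIN on the original query and check whether the
-- "key" column for maps is still NULL.
```

If the plan changes after ANALYZE TABLE, stale or skewed statistics were at least part of the problem.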
Another possibility would be that you could add an index (ID,mapExpireTime) on the maps table, which seems at first glance quite redundant, but might be used as a covering index... which would be far better than the full table scan and join buffer that's getting used now.
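As a sketch, the composite index suggested above could be added like this. The index name idx_id_mapExpireTime is hypothetical; the column names ID and mapExpireTime are taken from the discussion.

```sql
-- Composite index that may let MySQL satisfy the join and the
-- expiry filter from the index alone (a covering index).
-- The index name is an illustrative assumption.
ALTER TABLE maps
  ADD INDEX idx_id_mapExpireTime (ID, mapExpireTime);
```

After adding it, re-run EXPLAIN on the query. If the new index is used as a covering index, the Extra column for the maps table should show "Using index".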
As @Razor commented, the number reported by EXPLAIN is only an estimate, and it can be off from your true row count by +/- 10% or even more.
MySQL prefers not to use an index at all if the portion of the table selected by the query conditions is too large. Think of the index at the back of a book. Why don't they index the word "the"? Because it would just list every page in the book. If the word occurs on too many pages in the book, it's actually easier to just read the book cover-to-cover instead of using an index.
Similarly, in MySQL if the optimizer estimates that the values you are searching for are found on a significantly high number of rows (in my experience "significant" is at least 16-18%) then MySQL skips the use of the index and just does a table-scan instead.
Sometimes this is the wrong choice, and you can give MySQL a hint that a table-scan is actually much more expensive than the preferred index choice:
EXPLAIN SELECT * FROM playdays FORCE INDEX (realdate) WHERE realdate >= DATE('2010-01-01');
Be careful about using index hints too liberally, because you effectively hard-code your index choices into your queries by using them, and that may become the wrong strategy as your data changes over time. They should be used in exceptional cases, when you can demonstrate that the optimizer makes the wrong choice.
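One way to check whether a hint is warranted, sketched here using the playdays table and realdate index from the example above: measure what fraction of the table the condition actually matches, then compare the plans with and without the hint.

```sql
-- Fraction of rows matched by the condition. In MySQL a comparison
-- evaluates to 1 or 0, so SUM() counts the matching rows.
SELECT SUM(realdate >= '2010-01-01') / COUNT(*) AS matching_fraction
FROM playdays;

-- Plan the optimizer picks on its own.
EXPLAIN SELECT * FROM playdays
WHERE realdate >= '2010-01-01';

-- Plan with the index forced, for comparison.
EXPLAIN SELECT * FROM playdays FORCE INDEX (realdate)
WHERE realdate >= '2010-01-01';
```

EXPLAIN only shows estimates, so timing both versions of the real query against realistic data is the more convincing demonstration.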
Best Answer
A 10-row alter will be so fast that the "Algorithm" won't make a noticeable difference. You probably won't see a diff for 1000 rows.
Some history:
Originally ALTER had exactly one way of being performed under the hood. This made the code easy, but the performance not as good as it could be.
Beginning with 5.6, there was a concerted effort to optimize specific cases -- mostly to avoid the full table copy. This led to a variety of syntaxes and options. Confusing.
Fortunately, if you ask for an algorithm that is not applicable to the action in question, it will spit at you. Simply change the algorithm and try again.
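A sketch of that "try, then fall back" approach. The table my_table and column note are hypothetical names for illustration. ALGORITHM=INSTANT assumes MySQL 8.0.12 or later; INPLACE and COPY are available from 5.6.

```sql
-- These are alternatives: run them one at a time and stop at the
-- first one that succeeds. Table and column names are hypothetical.

-- 1. Ask for the cheapest algorithm (MySQL 8.0.12+).
ALTER TABLE my_table ADD COLUMN note VARCHAR(100), ALGORITHM=INSTANT;

-- 2. If that is rejected, ask for an in-place change without
--    blocking concurrent reads and writes.
ALTER TABLE my_table ADD COLUMN note VARCHAR(100), ALGORITHM=INPLACE, LOCK=NONE;

-- 3. Last resort: the original full table copy.
ALTER TABLE my_table ADD COLUMN note VARCHAR(100), ALGORITHM=COPY;
```

When the requested algorithm does not apply to the operation, the statement fails with an error instead of silently falling back, so nothing is changed by the rejected attempts.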
I think, without proof, that ALTER TABLE will use the best Algorithm available for the use case in hand. And the Algorithm options are there in case something goes wrong and you need to force a different algorithm.
So why have ALGORITHM? I think it is like a lot of VARIABLES -- if something goes wrong, the end-user has the ability to "fix" the problem by turning off something. This does come into play in various evaluation optimizations. A visible case is FORCE INDEX.
Alas, the documentation of ALGORITHM is wimpy.