OBSERVATION #1
You have this:
CREATE TABLE mytable (id1 int, id2 int, score float) ENGINE=MyISAM;
LOAD DATA INFILE '50-billion-records.txt' INTO mytable (id1, id2, score);
ALTER TABLE mytable ADD INDEX id1_index (id1);
This is how you should load the MyISAM table:
CREATE TABLE mytable (id1 int, id2 int, score float,key (id1)) ENGINE=MyISAM;
ALTER TABLE mytable DISABLE KEYS;
LOAD DATA INFILE '50-billion-records.txt' INTO mytable (id1, id2, score);
ALTER TABLE mytable ENABLE KEYS;
When you do disable keys, it stops the .MYI
file from updating nonunique indexes.
During the LOAD DATA INFILE
, the .MYI
file will not grow because it contains no unique indexes or a primary key. This will promarily focus on loading the .MYD
.
The ENABLE KEYS
phase will do a read pass through the .MYD
and linearly build all the nonunique indexes. In your case, it will build the id1
index.
OBSERVATION #2
As for the bug you seem to be experiencing, think of this:
- 25 billions rows
- 4 byte integer for id1
- 100 billion bytes = 93.13G
- That's bigger than the 80G myisam_sort_buffer_size
IMPLICATIONS
- It is possible that incomplete or aborted indexing operations could have lost index pages that should have been written to the
.MYI
.
- It is also possible that doing
DISABLE KEYS
makes your query run because the id1
index is somehow being ignored. This could be the case since the EXPLAIN plan says type ALL
and Possible Keys NULL
. That's a full table scan. The SELECT query will work.
- With the keys enabled and incompletely written, some nonleaf node info being missing would lead to SELECT queries that are destined to fail because some of the
id1
values are not among the nonleaf node in the .MYI
.
SUGGESTIONS
Try running the LOAD DATA INFILE with my proposed code
CREATE TABLE mytable (id1 int, id2 int, score float,key (id1)) ENGINE=MyISAM;
ALTER TABLE mytable DISABLE KEYS;
LOAD DATA INFILE '50-billion-records.txt' INTO mytable (id1, id2, score);
ALTER TABLE mytable ENABLE KEYS;
Also, raise myisam_sort_buffer_size to 100G
Give it a Try !!!
UPDATE 2013-08-12 21:37 EDT
Repair By KeyCache is the result of hitting the max sort_buffer_size and myisam_sort_file_size. In turn, MySQL elects to sift through keys in the keycache.
There are three(3) options you could further adjust
OPTION #1
Set the sort_buffer_size to 16K. That's the minimum value allowed.
OPTION #2
Set the tmp_table_size to 1K. That's the minimum value allowed.
OPTION #3
Set the myisam_sort_buffer_size to its max value of 9223372036854775807
This should further prevent the Repair By KeyCache problem
Give it a Try !!!
SQL Server does not save previous index definitions for reuse at a later time. So query plans are based on the most recent statistics that were used for a compile.
Although stored procedures do not recompile constantly, they will in time recompile. You can also individually recompile procedures. The "sp_recompile" can be used to make your stored procedures recompile the next time they are used.
I see that you mention recreating statistics and recreating indexes, both of which help to get a correct set of statistics. You do not mention doing index reorganization, which also can be part maintaining the health of the indexes.
However, if the definition of the index has been changed to an earlier version, then it means that someone or some process changed them.
Best Answer
You can use the admin command:
To re-enable, use the same command with
true
.