You need to monitor that progress outside of all mysql client programs.
For this example, let's say your MyISAM table is named workingset in the database mydata and the datadir is the default, /var/lib/mysql.
Sidenote: If you are using Windows, run this SQL command to get your datadir:
SHOW VARIABLES LIKE 'datadir';
You will have three files representing the table:
/var/lib/mysql/mydata/workingset.frm
/var/lib/mysql/mydata/workingset.MYD
/var/lib/mysql/mydata/workingset.MYI
Try a temp table approach to loading the table workingset
use mydata
DROP TABLE IF EXISTS workingsetload;
CREATE TABLE workingsetload LIKE workingset;
ALTER TABLE workingsetload DISABLE KEYS;
Perform 7 million row insert into workingsetload here, preferably LOAD DATA INFILE
ALTER TABLE workingsetload ENABLE KEYS;
ALTER TABLE workingset RENAME workingsetold;
ALTER TABLE workingsetload RENAME workingset;
DROP TABLE workingsetold;
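The bulk-insert placeholder in the sequence above would typically be a LOAD DATA INFILE. Here is a minimal sketch; the CSV path and delimiters are assumptions (adjust both to your data), and the statement is only assembled and printed so you can pipe it to the mysql client:

```shell
# Hypothetical CSV path and delimiters -- not from the original post.
LOADFILE=/tmp/workingset.csv
stmt="LOAD DATA INFILE '$LOADFILE'
INTO TABLE mydata.workingsetload
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';"
printf '%s\n' "$stmt"
# in real use: printf '%s\n' "$stmt" | mysql -u youruser -p
```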
During the load of the table workingsetload, there will be a temp table whose first five (5) characters are "#sql-". You can visually monitor the progress of the load of workingsetload from the Linux OS like this:
cd /var/lib/mysql/mydata
watch -n 1 "ls -l workingsetload.MY[ID] *sql-*.MY[ID] | awk '{print \$5,\$9}'"
You could script this in linux like this:
cd /var/lib/mysql/mydata
ls -l workingsetload.MY[ID] *sql-*.MY[ID] | awk '{print $5,$9}' > /tmp/IndexRebuild.txt
You could then parse the columns and subtract the .MYD sizes to know when the data portion of the MyISAM table is complete, and likewise subtract the .MYI sizes to know when the index portion is complete.
For Windows, you could script this using Perl (ActivePerl for Windows) or just have a DOS Batch loop over and over again.
Regardless of OS, once the "ls -l #sql-*.MY[ID]" or "if not exist #sql-*.MY*" comes back with nothing, the reload and indexing are complete.
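To turn that visual check into a number, compare the temp table's .MYD size against the finished .MYD, as the monitoring idea above describes. This is a self-contained sketch that simulates the two files in a scratch directory; in production, point DIR at your real schema folder (e.g. /var/lib/mysql/mydata) and drop the truncate lines:

```shell
DIR=$(mktemp -d)   # stand-in for /var/lib/mysql/mydata
# Simulated files -- in production these already exist.
truncate -s 1000 "$DIR/workingsetload.MYD"   # source data file (1000 bytes)
truncate -s 250  "$DIR/#sql-1a2b.MYD"        # temp table, 250 bytes so far
target=$(stat -c %s "$DIR/workingsetload.MYD")
temp=$(stat -c %s "$DIR"/*sql-*.MYD)
pct=$(( temp * 100 / target ))
echo "data rebuild progress: ${pct}%"
rm -rf "$DIR"
```

Once the *sql-* glob stops matching anything, the loop (or watch) can report completion, per the check described above.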
Give it a Try !!!
UPDATE 2011-07-20 16:55 EDT
Monty has successfully implemented this feature in MariaDB. Check it out !!!
Right now, you are in a very fortunate position. I noticed you have big-tables defined. This is preventing you from experiencing "Table is Full" errors. Why is this good?
Whenever you get "Repair With Keycache" as a status, you have no free space to do file sorting. Making sort_buffer_size bigger isn't necessarily the answer since temp tables become disk files immediately.
You have two options
OPTION #1 : Increase Diskspace for datadir
The data volume where /var/lib/mysql (or whatever datadir is) resides may not have enough room to house a materialized temp table on disk. I would suggest increasing the disk volume to at least twice its current size.
DRAWBACK : A one-time maintenance to move the database to the bigger disk.
OPTION #2 : Separate Disk for Temp Tables
Perhaps a separate disk volume whose sole purpose in life is to house temp tables should be set up. Try this:
- Step 01 : Install a disk with the same size as the home of datadir
- Step 02 : mkdir /tmptables
- Step 03 : Mount the new disk volume to the folder /tmptables
- Step 04 : chown mysql:mysql /tmptables
- Step 05 : Add this to my.cnf:
  [mysqld]
  tmpdir=/tmptables
- Step 06 : service mysql restart
Once you make the disk and add tmpdir, you should have more elbow room.
DRAWBACK : Transferring the temp tables' contents from /tmptables back to the home of datadir for certain SQL commands (such as DDL).
UPDATE 2013-05-21 00:10 EDT
You simply don't have enough memory. All the cores in the world cannot help MyISAM.
SUGGESTION #1 : Shorten your keys
I can tell from the keys that you are trying to retrieve all data from the indexes and avoid touching the table. All well and good IF IT WEREN'T FOR THE BILLIONS OF ROWS.
Here is something worth considering: Shorten the keys for each index
CREATE TABLE `research_storage1` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`word1` mediumint(8) unsigned NOT NULL,
`word2` mediumint(8) unsigned NOT NULL,
`origyear` smallint(5) unsigned NOT NULL,
`cat` tinyint(3) unsigned NOT NULL,
`pibn` int(10) unsigned NOT NULL,
`page` smallint(5) unsigned NOT NULL,
`pos` smallint(5) unsigned NOT NULL,
`num` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `pibnpage` (`pibn`,`page`),
KEY `word21` (`word2`,`word1`),
KEY `cat1` (`cat`,`word1`),
KEY `year1` (`origyear`,`word1`),
KEY `catyear1` (`cat`,`origyear`),
KEY `pibn` (`pibn`,`word1`)
) ENGINE=MyISAM DEFAULT CHARSET=ascii COLLATE=ascii_bin
DATA DIRECTORY='/storage/researchdb/'
INDEX DIRECTORY='/storage/researchdb/';
SUGGESTION #2 : Stop using ALTER TABLE...ENABLE KEYS;
You should try to get the MyISAM data imported into a new table without using ENABLE KEYS.
CREATE TABLE `research_storagenew` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`word1` mediumint(8) unsigned NOT NULL,
`word2` mediumint(8) unsigned NOT NULL,
`origyear` smallint(5) unsigned NOT NULL,
`cat` tinyint(3) unsigned NOT NULL,
`pibn` int(10) unsigned NOT NULL,
`page` smallint(5) unsigned NOT NULL,
`pos` smallint(5) unsigned NOT NULL,
`num` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `pibnpage` (`pibn`,`page`,`word2`,`word1`),
KEY `word21pibn` (`word2`,`word1`,`pibn`,`num`),
KEY `word12num` (`word1`,`word2`,`num`),
KEY `cat1` (`cat`,`word1`),
KEY `year1` (`origyear`,`word1`),
KEY `catyear1` (`cat`,`origyear`,`word1`),
KEY `pibn` (`pibn`,`word1`)
) ENGINE=MyISAM DEFAULT CHARSET=ascii COLLATE=ascii_bin
DATA DIRECTORY='/storage/researchdb/'
INDEX DIRECTORY='/storage/researchdb/';
INSERT INTO `research_storagenew` SELECT * FROM `research_storage1`;
DROP TABLE `research_storage1`;
ALTER TABLE `research_storagenew` RENAME `research_storage1`;
SUMMARY
Look at the table definition again. There are 18 bytes for an index entry on just the pibnpage index. That's 18G per billion rows. The same goes for word21pibn. You just don't have enough room. It is imperative to try one of my newest suggestions to bypass the need to sort all these keys.
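The 18-byte figure can be checked by adding the key columns' on-disk widths to MyISAM's 6-byte default row pointer. A quick back-of-envelope sketch (column widths read from the DDL above; the 6-byte pointer size is the MyISAM default):

```shell
PTR=6   # MyISAM default data pointer size, in bytes
# pibnpage in the widened table: pibn(4) + page(2) + word2(3) + word1(3)
wide=$(( 4 + 2 + 3 + 3 + PTR ))
# pibnpage shortened to just pibn(4) + page(2), as in SUGGESTION #1
short=$(( 4 + 2 + PTR ))
echo "wide pibnpage entry: ${wide} bytes; shortened: ${short} bytes"
echo "per billion rows: ${wide}G vs ${short}G"
```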
UPDATE 2013-05-22 12:15 EDT
You asked:
About shortening the keys though: what kind of performance hit will this result in? Are we talking twice as long, 10x as long, 1000x as long?
I cannot say for sure what the running-time impact will be. However, I can say this: there may be some additional disk I/O, because the indexes will no longer contain all the needed column info; queries will now have to turn to the .MYD file to retrieve the additional columns. Please keep in mind that MyISAM is suitable for heavy-read queries.
The only further tuning I can recommend, if there are supposed to be INSERTs and DELETEs in the middle of the day during heavy-read periods, would be to enable concurrent INSERTs:
[mysqld]
concurrent_insert=2
This will allow INSERTs at the end of a MyISAM table without checking for free blocks within the table. This may make the table grow a little faster.
As far as maintenance goes, you should use SUGGESTION #2 so as not to rely so much on ALTER TABLE ... ENABLE KEYS; on such a bloated table. Perhaps you should also think of moving /storage/researchdb/ to SSD.
Best Answer
For this example:
First, collect all the table names in mydb that are to be bulk loaded (a one-time setup). For your on-demand bulk load, set up a table that tracks which tables have been loaded. Then change the style of your script to do the following:
Step01 : DISABLE KEYS on all tables
Step02 : Populate the tables
Step03 : Collect the list of tables to process: load every table_name in mydb that needs bulk ENABLE KEYS into mydb.TablesLeftToEnableKeys
Step04 : In a separate script, for each table_name from Step03, run the ENABLE KEYS and remove the table_name from the list once it completes
What is so special about doing this? After Step03, you have a list of tables that still need ALTER TABLE ... ENABLE KEYS executed. In the event of a crash, just rerun Step04's separate script, because it remembers which tables did not execute, or did not complete, ENABLE KEYS.
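Since the original statements for the tracking table are not shown, here is a hedged sketch of the Step03/Step04 bookkeeping. The name mydb.TablesLeftToEnableKeys comes from the text; its single table_name column and the two sample table names are assumptions. The script only generates the SQL; in real use each statement would be piped to the mysql client:

```shell
# Stand-in for the result of: SELECT table_name FROM mydb.TablesLeftToEnableKeys;
pending="tableA tableB"
sql=""
for t in $pending; do
    # rebuild the indexes, then check the table off the list
    sql="${sql}ALTER TABLE mydb.${t} ENABLE KEYS;
DELETE FROM mydb.TablesLeftToEnableKeys WHERE table_name = '${t}';
"
done
printf '%s' "$sql"
```

Because a table is only deleted from the list after its ENABLE KEYS finishes, rerunning the script after a crash naturally resumes where it left off.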
Give it a Try !!!
UPDATE 2012-03-15 16:04 EDT
Unfortunately, there does not seem to be any API or direct SQL in MySQL to check whether there is a state or flag within a MyISAM table indicating that its keys are in a disabled state. I am sure some MySQL Internals developer has that answer. Oh, that's right: Oracle owns MySQL now. Have fun looking for a developer.
If you are interested in an ugly, but workable, method to speed up ENABLE KEYS, there is something you can do with my answer. For each table you plan to populate, do the following (for this example, let's use the MyISAM table mydb.mytable):
CREATE TABLE mydb.mytable_copy LIKE mydb.mytable;
ALTER TABLE mydb.mytable_copy DISABLE KEYS;
Linux commands
FLUSH TABLES;
REPAIR TABLE mydb.mytable;
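The "Linux commands" step is not spelled out; judging from the surrounding SQL, the idea is presumably to overwrite the populated table's .MYI with the empty copy's keys-disabled .MYI, so that REPAIR TABLE rebuilds the keys from the intact .MYD. That reconstruction is an assumption; the sketch below only simulates it in a scratch directory (in production you would operate on the real datadir while mysqld has the tables closed):

```shell
DATADIR=$(mktemp -d)   # stand-in for the real datadir schema folder
printf 'populated-index-image' > "$DATADIR/mytable.MYI"   # big, keys enabled
printf 'empty' > "$DATADIR/mytable_copy.MYI"              # empty, keys disabled
# overwrite the live index file with the empty keys-disabled one;
# REPAIR TABLE would then rebuild the keys from the intact .MYD
cp "$DATADIR/mytable_copy.MYI" "$DATADIR/mytable.MYI"
size=$(stat -c %s "$DATADIR/mytable.MYI")
echo "mytable.MYI is now $size bytes"
rm -rf "$DATADIR"
```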
This should be faster than ENABLE KEYS. I got this idea from an internal letter from MySQL.