MySQL – Inserting into a MySQL table with the ARCHIVE engine: “duplicate key” error

archive, mysql, storage-engine

I am trying to archive some files using the MySQL ARCHIVE engine. I am using this query to insert the file contents:

insert into test_table (id,arch) values (123,'FILE_CONTENT')

After inserting 2 records, I get a "duplicate key" error from MySQL for a key that doesn't exist in the table. I checked it about 5 times, but the record is not there: SELECT COUNT(*) for the supposedly duplicate id returns 0.

I ran the same code against an InnoDB table and it works fine. Can anyone tell me what the problem is with the ARCHIVE engine?

CREATE TABLE `test_table` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `arch` mediumtext,
  PRIMARY KEY (`id`)
) ENGINE=ARCHIVE AUTO_INCREMENT=100175977 DEFAULT CHARSET=utf8mb4;

Best Answer

This sounds very unusual for a table using the ARCHIVE Storage Engine. Why? A duplicate key error is not characteristic of the ARCHIVE Storage Engine, since

The engine does not support the creation of indexes
The engine supports INSERT and SELECT, but not UPDATE or DELETE

Surprisingly, there can be a key internally present. How?

According to the MySQL Documentation

The ARCHIVE engine supports the AUTO_INCREMENT column attribute. The AUTO_INCREMENT column can have either a unique or nonunique index. Attempting to create an index on any other column results in an error. The ARCHIVE engine also supports the AUTO_INCREMENT table option in CREATE TABLE and ALTER TABLE statements to specify the initial sequence value for a new table or reset the sequence value for an existing table, respectively.

Given this information, look back at the table and the query

insert into test_table (id,arch) values (123,'FILE_CONTENT')

If the id column has the AUTO_INCREMENT attribute, you should not normally supply an explicit value for id. With other storage engines, an explicit value that collides with an existing row produces the usual error 1062 (Duplicate Key).

SUGGESTIONS

Change the INSERT to a form that lets the AUTO_INCREMENT attribute assign id, either by passing 0 for id or by leaving the column out:

insert into test_table (id,arch) values (0,'FILE_CONTENT')

insert into test_table (arch) values ('FILE_CONTENT')

Give it a Try !!!

UPDATE 2013-08-06 16:57 EST

If you are planning to do queries from the archive table, you need to get away from the ARCHIVE Storage Engine. Why? Again, according to the MySQL Documentation

Retrieval: On retrieval, rows are uncompressed on demand; there is no row cache. A SELECT operation performs a complete table scan: When a SELECT occurs, it finds out how many rows are currently available and reads that number of rows. SELECT is performed as a consistent read. Note that lots of SELECT statements during insertion can deteriorate the compression, unless only bulk or delayed inserts are used

Note that every SELECT against an ARCHIVE table is a full table scan. If you look up id 123 in a table with 1,000,000 rows, you have to read all 1,000,000 rows every time.

SUGGESTION

Convert the table to MyISAM. Then you can have a proper index on id plus the ability to create other indexes on other columns as needed.

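-- Create an empty MyISAM copy (CREATE TABLE ... SELECT copies columns only, not indexes or AUTO_INCREMENT)
CREATE TABLE test_table_myisam ENGINE=MyISAM AS SELECT * FROM test_table WHERE 1=2;
-- Restore the AUTO_INCREMENT attribute and add the primary key in one step
ALTER TABLE test_table_myisam MODIFY id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, ADD PRIMARY KEY (id);
-- Copy the rows, skipping any that would collide on id
INSERT IGNORE INTO test_table_myisam SELECT * FROM test_table;
-- Swap the new table in under the original name
DROP TABLE test_table;
ALTER TABLE test_table_myisam RENAME test_table;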

I have very bad news for you.

You should not have deleted the ibdata1 file. Here is why:

ibdata1 contains four types of information:

table metadata
MVCC data
data pages (when innodb_file_per_table is disabled)
index pages (when innodb_file_per_table is disabled)

Each InnoDB table is assigned a numerical id through an auto-increment-style metadata feature. That internal tablespace id (ITSID) is embedded in the table's .ibd file. That number is checked against the list of ITSIDs maintained, guess where, ... in ibdata1.

I also have very good news for you along with some bad news.

It is possible to reconstruct ibdata1 so that it has the correct ITSIDs, but it takes work to do it. While I personally have not done the procedure alone, I assisted a client at my employer's web hosting to do this. We figured this out together, but since the client hosed ibdata1, I let him do most of the work (30 InnoDB tables).

Anyway, here is a past post I made on the DBA StackExchange. I answered another question whose root cause was the mixing up of ITSIDs.

To cut right to the chase, here is the article explaining what to do with reference to ITSID and how to massage ibdata1 into acknowledging the presence of the ITSID contained within the .ibd file.

I am sorry there is no quick-and-dirty method for recovering the .ibd file other than playing games with ITSIDs.

UPDATE 2011-10-17 06:19 EDT

Here is your original innodb configuration from your question:

innodb_file_per_table=1
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_buffer_pool_size=4G
innodb_data_file_path=ibdata1:10M:autoextend
innodb_buffer_pool_size = 384M
innodb_log_file_size=5M
innodb_lock_wait_timeout = 18000

Please notice that innodb_log_file_size is there twice. Look carefully...

innodb_file_per_table=1
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G <----
innodb_buffer_pool_size=4G
innodb_data_file_path=ibdata1:10M:autoextend
innodb_buffer_pool_size = 384M
innodb_log_file_size=5M <----
innodb_lock_wait_timeout = 18000

The last setting of innodb_log_file_size takes precedence. MySQL expected to start up with the log files being 5M. Your ib_logfile0 and ib_logfile1 were 1G when you tried to start up mysqld. It saw a size conflict and took the path of least resistance, which was to disable InnoDB. That's why InnoDB was missing from show engines;. Mystery solved !!!
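A quick way to spot this kind of repeated setting before restarting mysqld is to scan the option lines for names that occur more than once. The following is only a minimal sketch: the function name find_repeated_options is made up for illustration, it ignores [section] boundaries, and it relies on the last-one-wins behavior described above. Run against the lines from the question, it flags innodb_log_file_size and also innodb_buffer_pool_size, which is set twice as well (4G, then 384M).

```python
from collections import defaultdict

# The option lines quoted above, copied from the question.
CONFIG_TEXT = """\
innodb_file_per_table=1
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_buffer_pool_size=4G
innodb_data_file_path=ibdata1:10M:autoextend
innodb_buffer_pool_size = 384M
innodb_log_file_size=5M
innodb_lock_wait_timeout = 18000
"""


def find_repeated_options(text):
    """Return {option: [values in file order]} for options set more than once.

    Hypothetical helper: it does not track [section] headers, so it is only
    meaningful for a fragment taken from a single section.
    """
    seen = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and [section] headers.
        if not line or line[0] in "#;[":
            continue
        name, _, value = line.partition("=")
        # Dashes and underscores are interchangeable in option names.
        seen[name.strip().replace("-", "_")].append(value.strip())
    return {name: values for name, values in seen.items() if len(values) > 1}


repeated = find_repeated_options(CONFIG_TEXT)
for name, values in repeated.items():
    # The last value listed is the one that takes effect.
    print(f"{name}: {values} -> effective {values[-1]}")
```

If the effective innodb_log_file_size does not match the size of the existing ib_logfile0 and ib_logfile1, you are in the situation described above.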

UPDATE 2011-10-17 11:07 EDT

The error message was deceptive because innodb_log_file_size was smaller than the log files (ib_logfile0 and ib_logfile1), which were 1G at the time. What's interesting is this: corruption was reported because the files were expected to be 5M each and were actually bigger. If the situation were reversed and the InnoDB log files were smaller than the size declared in my.cnf, you should get something like this in the error log:

110216 9:48:41 InnoDB: Initializing buffer pool, size = 128.0M
110216 9:48:41 InnoDB: Completed initialization of buffer pool
InnoDB: Error: log file ./ib_logfile0 is of different size 0 5242880 bytes
InnoDB: than specified in the .cnf file 0 33554432 bytes!
110216 9:48:41 [ERROR] Plugin 'InnoDB' init function returned error.
110216 9:48:41 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.

In this example, the log files already existed at 5M and the setting for innodb_log_file_size was bigger (in this case, 32M).
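The byte counts in that error log line up with the K/M/G suffixes used in my.cnf, which can be checked with a few lines of arithmetic. This is a rough sketch, not anything MySQL ships: option_size_to_bytes and log_size_matches are hypothetical helper names, and only the K, M, and G suffixes are handled.

```python
# Multipliers for the size suffixes used in the option values above.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def option_size_to_bytes(value):
    """Convert a value such as '5M' or '1G' to a number of bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # a plain number is already in bytes


def log_size_matches(configured, actual_bytes):
    """True when the configured size equals the size of the file on disk."""
    return option_size_to_bytes(configured) == actual_bytes


print(option_size_to_bytes("5M"))   # 5242880, the on-disk size in the log excerpt
print(option_size_to_bytes("32M"))  # 33554432, the size the .cnf file asked for
# The case from the question: 5M configured, 1G files on disk.
print(log_size_matches("5M", option_size_to_bytes("1G")))  # False
```

To check a real server, the actual_bytes argument could come from os.path.getsize() on ib_logfile0 in the data directory.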

For this particular question, I blame MySQL (eh Oracle [still hate saying it]) for the inconsistent error message protocol.

MySQL – Are two indexes needed

An index can seek by a subset of characters, as long as you're searching from the left. E.g., "Inter%" can seek; "%net" cannot.
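To see why only the left-anchored pattern can seek, here is a small sketch that uses a sorted Python list as a stand-in for the index. The titles and the function names prefix_seek and suffix_scan are invented for illustration; a real B-tree index differs in detail, but the ordering argument is the same.

```python
from bisect import bisect_left

# A sorted list stands in for the ordered entries of an index.
titles = sorted(["Internet", "Interval", "Intranet", "Cabinet", "Planet", "Interpol"])


def prefix_seek(sorted_titles, prefix):
    """Like LIKE 'prefix%': jump to the first candidate, stop at the first miss."""
    start = bisect_left(sorted_titles, prefix)  # binary search, no full scan
    matches = []
    for title in sorted_titles[start:]:
        if not title.startswith(prefix):
            break  # entries are sorted, so nothing later can match
        matches.append(title)
    return matches


def suffix_scan(sorted_titles, suffix):
    """Like LIKE '%suffix': the sort order is no help, every entry is examined."""
    return [title for title in sorted_titles if title.endswith(suffix)]


print(prefix_seek(titles, "Inter"))  # ['Internet', 'Interpol', 'Interval']
print(suffix_scan(titles, "net"))    # ['Cabinet', 'Internet', 'Intranet', 'Planet']
```

The prefix lookup touches only the matching entries plus one, while the suffix lookup has to visit the whole list.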

However, the first character is not necessarily the character under which the article would be sorted. "The Internet" should go under "I", not "T". You probably need two fields, DisplayTitle and SortTitle; a single-character index on the latter may be worthwhile, but most likely a full-length index will be just fine.
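One possible way to fill a SortTitle field from DisplayTitle is sketched below. The article list and the sort_title function are assumptions made for the example (English articles only); real titles may need more careful rules.

```python
# Assumed list of leading articles to ignore when sorting (English only).
LEADING_ARTICLES = ("the ", "a ", "an ")


def sort_title(display_title):
    """Derive a sort key: lower-case the title and drop one leading article."""
    lowered = display_title.strip().lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):].lstrip()
    return lowered


display_titles = ["The Internet", "Apple", "A Zebra"]
print(sort_title("The Internet"))              # internet
print(sorted(display_titles, key=sort_title))  # ['Apple', 'The Internet', 'A Zebra']
```

With the derived value stored in its own column, "The Internet" files under "I" and a left-anchored search on that column can still use an index.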

Indexes are typically B-trees, and a seek will jump to the right location about equally quickly whether you have 10 or 100 entries per page. Scans are another matter, but I'd start with the simplest solution and add an extra index only if performance proves inadequate in practice.