This makes all the sense in the world to me.
InnoDB creates data pages and index pages that are 16K each. If rows of data are being inserted, updated, deleted, committed, and rolled back, you are going to have FRAGMENTATION !!!
There are two cases where you can have internal fragmentation:
- A single row could be written in multiple data pages because certain column values make a row too big to fit in the data page.
- Having a TEXT column with 32K of data in it.
In those two cases, a single row spanning multiple data pages would have to be chained like a linked list. The internally generated list of data pages would always have to be navigated when the row is read.
Giving credit where credit is due, PostgreSQL implemented a brilliant mechanism called TOAST (The Oversized-Attribute Storage Technique) to keep oversized column data outside of table rows and stem the tide of this kind of internal fragmentation.
If you have used mysqldump to make a file with CREATE TABLE statements followed by lots of INSERTs, then loading that dump into a new server gives you a fresh table with no unused space, along with contiguous data and index pages.
For my explanations, let's assume you have an InnoDB table in the CUSTODIA database called userinfo.
If you would like to defragment (shrink) a table, you have three (3) options:
Option 1
OPTIMIZE TABLE CUSTODIA.userinfo;
Option 2
ALTER TABLE CUSTODIA.userinfo ENGINE=InnoDB;
Option 3
CREATE TABLE CUSTODIA.userinfo2 LIKE CUSTODIA.userinfo;
INSERT INTO CUSTODIA.userinfo2 SELECT * FROM CUSTODIA.userinfo;
DROP TABLE CUSTODIA.userinfo;
ALTER TABLE CUSTODIA.userinfo2 RENAME CUSTODIA.userinfo;
CAVEAT : Option 3 is no good on a table with foreign-key constraints, since dropping and renaming breaks the references. Option 3 is perfect for MyISAM.
Now for your questions:
Question 1. Why is there this difference between original and restored database size?
As explained above: the original database carries fragmentation, while the restore loads fresh, contiguous pages.
Question 2. Is it safe to assume that restored database is OK, although this difference in size?
If you want to make absolutely sure that the data on both servers are identical, simply run this command on both DB servers:
CHECKSUM TABLE CUSTODIA.userinfo;
Hopefully, the checksum value is identical for the same table on both servers. If you have dozens, even hundreds, of tables, you may have to script it.
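If you do need to script it, one sketch (assuming all your tables live in the CUSTODIA schema) is to let information_schema generate the CHECKSUM TABLE statements for you, then run the generated statements on both servers and diff the results:

```sql
-- Sketch: emit one CHECKSUM TABLE statement per base table in CUSTODIA.
SELECT CONCAT('CHECKSUM TABLE ', table_schema, '.', table_name, ';') AS stmt
FROM information_schema.tables
WHERE table_schema = 'CUSTODIA'
  AND table_type = 'BASE TABLE';
```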
Question 3 : How does MySQL calculate data_length? Is it an estimate?
You are using the correct method in summing up the data_length and index_length. Based on my explanation of fragmentation, it is an estimate.
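You can see the figures InnoDB reports, including the estimated unused space, straight from information_schema (a sketch; data_free is InnoDB's estimate of free space inside the allocated pages):

```sql
SELECT table_name,
       data_length + index_length AS total_bytes,  -- the sum you are using
       data_free                                   -- estimated unused space
FROM information_schema.tables
WHERE table_schema = 'CUSTODIA'
  AND table_name = 'userinfo';
```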
Question 4. Can I safely reduce production's ibdata file size to 3.6GiB with no down-time?
GOOD NEWS !!! You absolutely can compress it !!! In fact, how would you like to compress it to a fraction of that number ??? Follow these two links, because I addressed this issue on StackOverflow and ServerFault.
https://stackoverflow.com/questions/3927690/howto-clean-a-mysql-innodb-storage-engine/4056261#4056261
https://serverfault.com/questions/230551/mysql-innodb-innodb-file-per-table-cons/231400#231400
BAD NEWS !!! Sorry, but you will have a 3-5 minute window of downtime for rebuilding ib_logfile0 and ib_logfile1 as well as shrinking ibdata1 once and for all. It's well worth it since it is a one-time operation.
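For reference, the procedure described in those links boils down to something like the following. This is a sketch, not a copy-paste script: paths, service names, and the exact my.cnf edit depend on your installation, and you must have the outage window and the disk space for the dump.

```shell
# 1. Dump everything while the server is still up.
mysqldump --all-databases --routines --triggers > /root/alldata.sql
# 2. Stop mysqld cleanly.
systemctl stop mysql
# 3. Add innodb_file_per_table under [mysqld] in my.cnf, then remove the
#    shared files so they are recreated small on startup.
rm -f /var/lib/mysql/ibdata1 /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1
# 4. Restart; ibdata1 and the redo logs come back at their configured sizes.
systemctl start mysql
# 5. Reload; every InnoDB table now lives in its own .ibd file.
mysql < /root/alldata.sql
```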
It doesn't really make sense that you're able to do the backup with the --tab option but not without it, and I think that should be investigated.
Concerns about integrity are generally concerns with the backup, and the backup is where you should focus, because --single-transaction issues a START TRANSACTION WITH CONSISTENT SNAPSHOT before the backup begins, ensuring that what gets written to the file represents the state of all the tables at one consistent point in time.
Without this, your "user" table, being alphabetically later than your "history" table, could (for example) contain a new user in the backup file without the history of that account's creation, since the backup of "history" finished before the backup of "user" started.
You say you want to restore within a transaction, but that's not possible (by default) and probably not really what you intend at any rate.
It's not possible, by default, because of the way mysqldump works. Examine a backup file:
DROP TABLE IF EXISTS `action`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `action` (
`id` int(11) NOT NULL ...
Both CREATE TABLE and DROP TABLE cause an implicit commit. This isn't the only concern, because non-conflicting writes to the tables could still be done by other threads.
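To see why restoring inside a transaction can't work, consider this sketch with a hypothetical table t1: the DDL in the dump commits your transaction out from under you.

```sql
START TRANSACTION;
INSERT INTO t1 VALUES (1);
DROP TABLE IF EXISTS t2;   -- implicit COMMIT: the INSERT above is now permanent
ROLLBACK;                  -- has nothing left to roll back
```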
There are two other ways to prevent changes on a MySQL server:
FLUSH TABLES WITH READ LOCK;
This attempts to close all tables and acquire a global lock that prevents any thread from writing. The problem here, again, is with the standard behavior of mysqldump and what it writes to the dump file for execution during the restore:
LOCK TABLES `action` WRITE;
/*!40000 ALTER TABLE `action` DISABLE KEYS */;
/*!40000 ALTER TABLE `action` ENABLE KEYS */;
UNLOCK TABLES;
These individual table locks are not compatible with the global lock:
mysql> flush tables with read lock;
Query OK, 0 rows affected (0.16 sec)
mysql> lock tables t1 write;
ERROR 1223 (HY000): Can't execute the query because you have a conflicting read lock
You can disable this behavior with --skip-add-locks, but then you still wouldn't be able to write data into the tables while holding the global read lock, because it blocks your own writes as well as those of other threads... so that one seems to be a non-starter.
Alternately, you could force the entire server into read_only mode.
FLUSH TABLES WITH READ LOCK; -- force a wait until all ongoing writes have finished
SET GLOBAL READ_ONLY = 1; -- set entire server to read-only except for users with `SUPER` privilege
UNLOCK TABLES; -- release the global read lock
-- restore the backup
SET GLOBAL READ_ONLY = 0;
If you're concerned with other threads modifying the tables during the restore, but you don't want the entire server to be read-only for the duration of the operation, I only see one really solid option, and that would be to restore all of the tables with an alternate name, and then, in a single SQL statement, rename them all to their correct names.
When multiple tables are renamed in a single statement, the entire statement is atomic.
RENAME TABLE t1 TO t1_old, t1_new TO t1;
This acquires a global mutex that prevents t1 from being accessed until all of the renames are complete... but it would require manipulating your dump file to give the tables alternate names, though it might also work if you restored to an alternate database name and renamed the tables in fully-qualified form.
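As a sketch of that dump-file manipulation (the table names and the _restore suffix here are hypothetical), a sed pass can rewrite the backtick-quoted identifiers so everything restores under alternate names:

```shell
# Create a tiny stand-in dump file for demonstration.
printf 'DROP TABLE IF EXISTS `action`;\nCREATE TABLE `action` (id INT);\n' > dump.sql

# Rewrite every backtick-quoted occurrence of the listed tables to *_restore.
sed -E 's/`(action|user|history)`/`\1_restore`/g' dump.sql > dump_restore.sql

grep 'CREATE TABLE' dump_restore.sql
```

After restoring dump_restore.sql, a single RENAME TABLE statement can swap all of the tables into place atomically.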
Another approach would be to drop and re-create all of the tables and lock all of them for WRITE before inserting the data, after removing any LOCK and UNLOCK statements from the dump file(s). The one additional step you'd need here: after acquiring the locks, you'd have to run through every table and delete any data that got inserted by other threads between the time you created the table and the time you acquired the locks, or the restore would fail. SET FOREIGN_KEY_CHECKS = 0 is a session-level setting that would allow your session to freely remove any stray data without regard to foreign key constraints.
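A sketch of that sequence, using hypothetical tables t1 and t2 (the real table list and INSERTs would come from your edited dump file):

```sql
SET FOREIGN_KEY_CHECKS = 0;      -- session-level: ignore FK order while purging
LOCK TABLES t1 WRITE, t2 WRITE;  -- must name every table being restored
DELETE FROM t1;                  -- purge any rows other threads slipped in
DELETE FROM t2;
-- replay the INSERT statements from the dump file here
UNLOCK TABLES;
SET FOREIGN_KEY_CHECKS = 1;
```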
The bottom line is that there is not a simple, straightforward way to keep all hands off of the data you are attempting to restore, and if faced with that requirement, I would be inclined to shut down the application server or otherwise prevent access to the server with a mechanism outside of mysql itself, such as IP-level filtering.
Best Answer
I would like to add that I was able to restore the backup of DB1 into DB2 on the same host. I took the mysqldump backup of DB1 with the SQLyog Ultimate tool. During the mysqldump backup, just exclude these statements from the dump file:
- the USE database statement
- the CREATE DATABASE statement
After excluding these statements from the dump file, I was able to restore the backup into DB2 (a database name different from the dumped one) on the same host.
Hope it will be helpful to other people.