By its nature, a DML statement in a Galera Cluster is expected to run a bit slower than on a normal MySQL node, because its response time includes not only the commit time on the local node but also sending the write-set (ws) to the group and receiving back the GTID of the ws.
So the total response time is: query time + 2 x round-trip to the group.
A simple insert will possibly run within 1 ms, and the round-trip to the group is possibly about 400 - 600 µs. So I am not surprised to see 5 minutes instead of 3...
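As a rough back-of-the-envelope sketch of the formula above (the function name and the sample numbers are illustrative assumptions, not measurements from your system):

```python
def galera_response_ms(query_ms: float, rtt_ms: float) -> float:
    """Response time of one autocommit DML statement in the cluster:
    local query time plus 2 x round-trip to the group."""
    return query_ms + 2 * rtt_ms


standalone_ms = 1.0                              # assumed: ~1 ms for a simple insert
clustered_ms = galera_response_ms(1.0, 0.5)      # assumed: ~500 us round-trip

print(f"standalone: {standalone_ms} ms, Galera: {clustered_ms} ms, "
      f"slowdown: {clustered_ms / standalone_ms:.1f}x")
```

With these assumed numbers each statement takes about twice as long in the cluster, so going from 3 to 5 minutes is within the range I would expect.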
Now, what can we improve?
a) Reduce query time:
Write performance is influenced by innodb_buffer_pool_size (big enough to cache all data in memory), innodb_log_file_size (the bigger the better) and innodb_flush_log_at_trx_commit (0 or 2 are faster than 1). With your settings above you have not followed the Codership recommendation (innodb_flush_log_at_trx_commit=0).
b) Reduce the number of round trips: When you batch your DML statements into transactions you get larger but fewer write-sets, which should cause fewer round trips. So the time spent in the network should become smaller.
c) Make your network faster: Aim for the lowest possible network latency: a dedicated network, 1 or 10 Gbit, and as little hardware in between as possible (firewalls, switches, routers, etc.).
d) Parallelize your inserts?
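To get a feeling for how much point b) can help, here is a minimal estimate under the same assumptions as above (1 ms per insert, 500 µs round-trip, one write-set per commit). The function name, the row count and the batch sizes are made up for illustration:

```python
import math


def load_time_s(rows: int, batch_size: int,
                query_ms: float = 1.0, rtt_ms: float = 0.5) -> float:
    """Estimated total time in seconds for `rows` inserts committed in
    batches of `batch_size`. Every insert pays the query time, but the
    2 x round-trip to the group is paid only once per commit (write-set)."""
    commits = math.ceil(rows / batch_size)
    return (rows * query_ms + commits * 2 * rtt_ms) / 1000.0


# 180,000 rows is an assumed workload (about 3 minutes at 1 ms per insert).
for batch in (1, 10, 100, 1000):
    print(f"batch size {batch:>4}: {load_time_s(180_000, batch):.1f} s")
```

This model only counts round trips. It ignores that very large write-sets have their own cost in Galera, so moderate batch sizes are probably the safer choice.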
Regards,
Oli
PS: There are still some places free in our Galera Cluster trainings in 2 weeks!
The binlog doesn't contain different structures when the storage engine is XtraDB vs InnoDB.
In the sense of interoperability, XtraDB is not an "extension" of InnoDB -- it's a fully-compatible replacement, that handles some internal operations differently but is still very much "InnoDB" wherever it is exposed outside the server core.
For what it's worth, I've taken the "datadir" -- including ibdata1 and the ibd files -- from a stopped MySQL server, put it on a MariaDB server (which also uses XtraDB instead of InnoDB) and had zero compatibility issues.
The only caveat that comes to mind is the same as for any MySQL async replication setup: the slave must be running the same or a newer version of MySQL than its master (which, by the way, can be any one of the PXC machines).
The reason for this "slave version the same or newer" rule is that the binlog format is forward-extensible only: when new capabilities are added to the binlog format, a newer master can break an older slave, since the slave may not always understand what it is reading from the binlog, and replication will hit a hard stop (not a soft fail). A newer slave should never fail to understand what an older master writes to its binlog. Officially, replication is supported across major releases (e.g., 5.1 to 5.5), but in practice 5.1 to 5.6 also seems completely compatible.
The corresponding minor version or newer of MySQL Server, MariaDB, or -- seemingly the most foolproof -- Percona Server should work fine as an async slave.
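The "same or newer" rule can be sketched as a small pre-flight check. This is only an illustration: the helper names are hypothetical, and it assumes both version strings (for example from SELECT VERSION() on each server) follow the same MySQL-style numbering. MariaDB 10.x uses its own numbering, so do not compare it blindly against MySQL or Percona Server versions.

```python
def parse_version(version: str) -> tuple:
    """Turn a string such as '5.6.21-70.1-log' into (5, 6, 21).
    Everything after the first '-' (vendor/build suffix) is ignored."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def slave_version_ok(master_version: str, slave_version: str) -> bool:
    """True if the slave runs the same or a newer version than its master."""
    return parse_version(slave_version) >= parse_version(master_version)


print(slave_version_ok("5.5.40", "5.6.21-70.1-log"))  # newer slave
print(slave_version_ok("5.6.21", "5.5.40"))           # older slave
```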
If you start out the slave with identical data, and you replicate every table and schema, missing transactions should never happen. Don't even try to filter what replicates and replicate only a subset of your schemata/tables unless and until you really understand the issues this can cause. Replicating everything is the default behavior.
Yes... it is very much worth it to set up async slaves for reporting. The cluster members should already have the config they need for this to work (other than the user account the slave will use to connect).
Best Answer
You appear to be missing the distinction between "Percona XtraDB" and "Percona XtraDB Cluster," also known as PXC. XtraDB is Percona's compatible drop-in replacement rewrite of the InnoDB storage engine, which is included in Percona Server and MariaDB.
Percona XtraDB Cluster and MariaDB Galera Cluster both use XtraDB as their storage engine, but the important difference from standard MySQL replication is that they both use the Galera Replication Provider, which provides true synchronous replication among all the nodes. You can also use Galera with MySQL. Oracle may not mention that, since it wasn't their idea.
MySQL's built-in asynchronous replication can be configured for circular replication (and is included in Percona and MariaDB) but it has no mechanism for handling conflicts. If queries make conflicting changes on different servers, your data will be inconsistent and replication will stop. Galera resolves this by requiring all nodes to concur on each commit. Its replication mechanism is fundamentally different from what is built in.
There are no issues with circular replication that are specific to MySQL 5.6 and 5.7 as your question implies. These issues apply to all versions of MySQL, Percona, and MariaDB when standard replication is used in a circular configuration, if you allow writes to be done to more than one of the masters.
See http://galeracluster.com