MySQL – Galera Takes Too Much Time for Pre-commit Stage

galera, MySQL, percona, performance

Our system has 3 Percona XtraDB Cluster nodes on the same network (we are using the Azure cloud). Our database has around 100 tables totaling ~120 GB; the biggest table contains about 100M rows. Ping time between the nodes seems acceptable to me (~0.5 ms or less).
I ran a profiling test on each of my servers:

set profiling=1;
insert into links(id) values(1);
show profile for query 1;

But the Galera pre-commit stage takes too much time (2.17 s) even for this simple query. The result is below:

+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000023 |
| checking permissions           | 0.000004 |
| Opening tables                 | 0.000009 |
| init                           | 0.000007 |
| System lock                    | 0.000528 |
| updating                       | 0.002088 |
| end                            | 0.000004 |
| Waiting for query cache lock   | 0.000002 |
| end                            | 0.000006 |
| query end                      | 0.000005 |
| wsrep in pre-commit stage      | 2.172553 |
| ha_commit_one_phase(538401194) | 0.000006 |
| innobase_commit_low():trx_comm | 0.000029 |
| ha_commit_one_phase(538401194) | 0.005416 |
| wsrep in pre-commit stage      | 0.000024 |
| closing tables                 | 0.000007 |
| freeing items                  | 0.000147 |
| logging slow query             | 0.000003 |
| logging slow query             | 0.000047 |
| cleaning up                    | 0.000007 |
+--------------------------------+----------+

The error log also contains many warnings about failing to report the last committed transaction:

2017-03-30 06:12:18 14342 [Warning] WSREP: Failed to report last committed 539015569, -4 (Interrupted system call)
2017-03-30 06:12:47 14342 [Warning] WSREP: Failed to report last committed 539018167, -4 (Interrupted system call)

Does anyone know what causes this, or how to reduce the time spent in the pre-commit stage?

Thank you very much.


Update 1: here is the wsrep status. wsrep_local_cert_failures, wsrep_local_send_queue_avg, and wsrep_local_recv_queue_avg seem to be too high.
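
The counters below come from the standard wsrep status query (the exact command is not shown in the original post; this is simply the usual way to collect them):

SHOW GLOBAL STATUS LIKE 'wsrep%';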

+------------------------------+----------------------------------------------------+
| Variable_name                | Value                                              |
+------------------------------+----------------------------------------------------+
| wsrep_local_state_uuid       | 8fadd81c-b664-11e6-9509-56a24df2cfb6               |
| wsrep_protocol_version       | 7                                                  |
| wsrep_last_committed         | 571341783                                          |
| wsrep_replicated             | 297503                                             |
| wsrep_replicated_bytes       | 218414523                                          |
| wsrep_repl_keys              | 2214789                                            |
| wsrep_repl_keys_bytes        | 24565584                                           |
| wsrep_repl_data_bytes        | 169717325                                          |
| wsrep_repl_other_bytes       | 0                                                  |
| wsrep_received               | 191252                                             |
| wsrep_received_bytes         | 112705971                                          |
| wsrep_local_commits          | 297345                                             |
| wsrep_local_cert_failures    | 53                                                 |
| wsrep_local_replays          | 18                                                 |
| wsrep_local_send_queue       | 96                                                 |
| wsrep_local_send_queue_max   | 155                                                |
| wsrep_local_send_queue_min   | 0                                                  |
| wsrep_local_send_queue_avg   | 15.290253                                          |
| wsrep_local_recv_queue       | 0                                                  |
| wsrep_local_recv_queue_max   | 10660                                              |
| wsrep_local_recv_queue_min   | 0                                                  |
| wsrep_local_recv_queue_avg   | 2860.841210                                        |
| wsrep_local_cached_downto    | 571192180                                          |
| wsrep_flow_control_paused_ns | 1903747659010                                      |
| wsrep_flow_control_paused    | 0.309630                                           |
| wsrep_flow_control_sent      | 1332                                               |
| wsrep_flow_control_recv      | 12696                                              |
| wsrep_cert_deps_distance     | 165.892497                                         |
| wsrep_apply_oooe             | 0.970206                                           |
| wsrep_apply_oool             | 0.000442                                           |
| wsrep_apply_window           | 16.287397                                          |
| wsrep_commit_oooe            | 0.000000                                           |
| wsrep_commit_oool            | 0.000442                                           |
| wsrep_commit_window          | 15.281639                                          |
| wsrep_local_state            | 4                                                  |
| wsrep_local_state_comment    | Synced                                             |
| wsrep_cert_index_size        | 1937                                               |
| wsrep_cert_bucket_count      | 7528                                               |
| wsrep_gcache_pool_size       | 134219009                                          |
| wsrep_causal_reads           | 0                                                  |
| wsrep_cert_interval          | 40.522988                                          |
| wsrep_incoming_addresses     | 10.0.0.13:3306,10.0.0.79:3306,10.0.0.52:3306       |
| wsrep_desync_count           | 0                                                  |
| wsrep_evs_delayed            |                                                    |
| wsrep_evs_evict_list         |                                                    |
| wsrep_evs_repl_latency       | 0.000288255/0.000769232/0.00570055/0.000398325/764 |
| wsrep_evs_state              | OPERATIONAL                                        |
| wsrep_gcomm_uuid             | 88432ffa-1911-11e7-b1d5-da9dc1aa2008               |
| wsrep_cluster_conf_id        | 38                                                 |
| wsrep_cluster_size           | 3                                                  |
| wsrep_cluster_state_uuid     | 8fadd81c-b664-11e6-9509-56a24df2cfb6               |
| wsrep_cluster_status         | Primary                                            |
| wsrep_connected              | ON                                                 |
| wsrep_local_bf_aborts        | 2                                                  |
| wsrep_local_index            | 1                                                  |
| wsrep_provider_name          | Galera                                             |
| wsrep_provider_vendor        | Codership Oy <info@codership.com>                  |
| wsrep_provider_version       | 3.20(r7e383f7)                                     |
| wsrep_ready                  | ON                                                 |
+------------------------------+----------------------------------------------------+

Update 2: adding my.cnf

[mysqld]
skip-name-resolve
bind-address = 0.0.0.0
datadir=/mnt/mysql/data-main
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib/libgalera_smm.so

#wsrep_cluster_address=gcomm://10.0.0.13,10.0.0.52

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=10.0.0.6

# SST method
wsrep_sst_method=xtrabackup-v2

# Cluster name
wsrep_cluster_name=percona_xtradb

# Authentication for SST method
#wsrep_sst_auth="sstuser:rewjsf&*HKJnzPYuWbue"


# SAFETY #
max-allowed-packet             = 16M
max-connect-errors             = 1000000

# CACHES AND LIMITS #
tmp-table-size                 = 512M
max-heap-table-size            = 512M
query-cache-type               = 1
query-cache-size               = 512M
query_cache_limit          = 64M
max-connections                = 5000
thread-cache-size              = 150
open-files-limit               = 65535
table-definition-cache         = 4096
table-open-cache               = 1000
wait_timeout               = 600
connect_timeout = 60
interactive_timeout            = 3600
net_read_timeout=300
innodb_lock_wait_timeout = 120
tmpdir=/mnt/mysql/tmp
wsrep_retry_autocommit=2
wsrep_slave_threads=4

# INNODB #
innodb-flush-method            = O_DIRECT
innodb-log-files-in-group      = 2
#innodb-log-file-size           = 128M
innodb-log-file-size           = 1G
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table          = 1
innodb-buffer-pool-size        = 38G


long_query_time=1
slow_query_log=1
slow_query_log_file="/mnt/mysql/log/log-slow-queries.log"
log-queries-not-using-indexes=1

Best Answer

About your error log, from the Percona blog:

Reporting the last committed transaction is just a part of the certification index purge process. In case it fails for some reason (it occasionally does), the cert index purge may be a little delayed. But it does not mean the transaction was not applied successfully. This is a warning after all.

So that warning does not really matter.

But about your primary question:

Do you route all write queries to a single node?

I recommend doing so if you have a heavy write load, to avoid deadlocks and waiting, because Galera requires all transactions to commit in order.
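
For example (a sketch only, not taken from your setup), you could put HAProxy in front of the cluster and mark all but one node as a backup, so every write lands on the same node. The node addresses below are the ones from your wsrep_incoming_addresses; the listen port, timeouts, and health checks are my assumptions:

# haproxy.cfg (sketch): route all application traffic to a single Galera node
listen galera_single_writer
    bind 0.0.0.0:3307
    mode tcp
    timeout connect 5s
    timeout client  8h
    timeout server  8h
    server node1 10.0.0.13:3306 check
    server node2 10.0.0.79:3306 check backup   # used only if node1 goes down
    server node3 10.0.0.52:3306 check backup

With a single active writer, conflicting transactions are serialized locally by InnoDB instead of failing certification across nodes, which avoids the local certification failures and replays visible in your wsrep status.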

Update 1

From the FromDual blog:

If you heavily rely on Query Cache you should first fix the queries and redesign your application. There is experimental Query Cache support since Galera v3.1. But the Galera documentation clearly states: Do not use query cache.

So maybe this is what causes the problem. Please disable the query cache on all of your servers and check again.
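
A minimal sketch of how to do that (run on each node; the my.cnf change keeps it disabled after a restart):

-- existing connections keep their session value until they reconnect
SET GLOBAL query_cache_type = 0;
SET GLOBAL query_cache_size = 0;

And in my.cnf, replacing the current query-cache settings:

query-cache-type               = 0
query-cache-size               = 0

Setting query_cache_size to 0 also frees the 512M currently allocated to the cache, and after a restart with query_cache_type=0 MySQL does not take the query cache mutex at all.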