MySQL – MariaDB Galera Cluster – nodes fail to start

galera mariadb mysql-cluster replication

This is a new installation (fresh VMs and a fresh MariaDB install). For some reason, the nodes are having trouble joining the master server. The master server pauses at "Preparing binlog files for transfer", while starting the service on the joining nodes ends with "Starting MariaDB database server mysqld [fail]".

Log from node-1 (hostname: vvalu_china_1)

azureuser@cv-server1:~$ sudo service mysql start
 * Starting MariaDB database server mysqld                                     [fail]
azureuser@cv-server1:~$ tail -f /var/log/mysql/mysql.log
141031  10:12:10 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
141031  10:12:10 [Note] WSREP: REPL Protocols: 5 (3, 1)
141031  10:12:10 [Note] WSREP: Service thread queue flushed.
141031  10:12:10 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
141031  10:12:10 [Note] WSREP: Service thread queue flushed.
141031  10:12:10 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (3c2b4e20-60a3-11e4-a3f2-4252be4415e1) does not match group state UUID (20156cae-60a2-11e4-b7d8-0abd6e366161): 1 (Operation not permitted)
         at galera/src/replicator_str.cpp:prepare_for_IST():447. IST will be unavailable.
141031  10:12:10 [Note] WSREP: Member 1.0 (vvalu_china_1) requested state transfer from '*any*'. Selected 0.0 (vvalu_master_local)(SYNCED) as donor.
141031  10:12:10 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
141031  10:12:10 [Note] WSREP: Requesting state transfer: success, donor: 0
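
The warning in the node log above says that the node's local state UUID differs from the group state UUID, so incremental state transfer (IST) is unavailable and the node requests a full state transfer from the donor. As a minimal sketch that is not part of the original post, the two UUIDs can be pulled out of such a line for comparison across nodes. The function name find_uuid_mismatch is hypothetical, and the pattern assumes the message wording shown in this log.

```python
import re

# A UUID as it appears in the Galera log: 8-4-4-4-12 lowercase hex digits.
UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"

# Assumption: the warning is worded exactly as in the node log above.
MISMATCH = re.compile(
    rf"Local state UUID \(({UUID})\) does not match group state UUID \(({UUID})\)"
)


def find_uuid_mismatch(log_text):
    """Return (local_uuid, group_uuid) from the first mismatch warning, or None."""
    match = MISMATCH.search(log_text)
    return (match.group(1), match.group(2)) if match else None


# The warning line copied from the node log.
sample = (
    "141031  10:12:10 [Warning] WSREP: Failed to prepare for incremental state transfer: "
    "Local state UUID (3c2b4e20-60a3-11e4-a3f2-4252be4415e1) does not match "
    "group state UUID (20156cae-60a2-11e4-b7d8-0abd6e366161): 1 (Operation not permitted)"
)

local_uuid, group_uuid = find_uuid_mismatch(sample)
print("local:", local_uuid)
print("group:", group_uuid)
```

A mismatch like this on a fresh node is not by itself the failure. It only explains why the log continues with a state transfer request instead of an incremental catch-up.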

Log from primary (hostname: vvalu_master_local)

141031 10:11:50 [Note] Added new Master_info '' to hash table
141031 10:11:50 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.14-MariaDB-1~trusty-wsrep-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution, wsrep_25.10.r4144
141031 10:11:50 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
141031 10:11:50 [Note] WSREP: REPL Protocols: 5 (3, 1)
141031 10:11:50 [Note] WSREP: Service thread queue flushed.
141031 10:11:50 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
141031 10:11:50 [Note] WSREP: Service thread queue flushed.
141031 10:11:50 [Note] WSREP: Synchronized with group, ready for connections
141031 10:11:50 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
141031 10:12:09 [Note] WSREP: declaring 516e8fc1-60a3-11e4-8fc0-3bde09ed89e0 stable
141031 10:12:09 [Note] WSREP: Node 467a1d82-60a3-11e4-8419-420e1bb83a75 state prim
141031 10:12:09 [Note] WSREP: view(view_id(PRIM,467a1d82-60a3-11e4-8419-420e1bb83a75,2) memb {
        467a1d82-60a3-11e4-8419-420e1bb83a75,0
        516e8fc1-60a3-11e4-8fc0-3bde09ed89e0,0
} joined {
} left {
} partitioned {
})
141031 10:12:09 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
141031 10:12:09 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 51d731ef-60a3-11e4-8ecd-5a0f37b3d4ad
141031 10:12:09 [Note] WSREP: STATE EXCHANGE: sent state msg: 51d731ef-60a3-11e4-8ecd-5a0f37b3d4ad
141031 10:12:09 [Note] WSREP: STATE EXCHANGE: got state msg: 51d731ef-60a3-11e4-8ecd-5a0f37b3d4ad from 0 (vvalu_master_local)
141031 10:12:10 [Note] WSREP: STATE EXCHANGE: got state msg: 51d731ef-60a3-11e4-8ecd-5a0f37b3d4ad from 1 (vvalu_china_1)
141031 10:12:10 [Note] WSREP: Quorum results:
        version    = 3,
        component  = PRIMARY,
        conf_id    = 1,
        members    = 1/2 (joined/total),
        act_id     = 0,
        last_appl. = 0,
        protocols  = 0/5/3 (gcs/repl/appl),
        group UUID = 20156cae-60a2-11e4-b7d8-0abd6e366161
141031 10:12:10 [Note] WSREP: Flow-control interval: [23, 23]
141031 10:12:10 [Note] WSREP: New cluster view: global state: 20156cae-60a2-11e4-b7d8-0abd6e366161:0, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
141031 10:12:10 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
141031 10:12:10 [Note] WSREP: REPL Protocols: 5 (3, 1)
141031 10:12:10 [Note] WSREP: Service thread queue flushed.
141031 10:12:10 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
141031 10:12:10 [Note] WSREP: Service thread queue flushed.
141031 10:12:10 [Note] WSREP: Member 1.0 (vvalu_china_1) requested state transfer from '*any*'. Selected 0.0 (vvalu_master_local)(SYNCED) as donor.
141031 10:12:10 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 0)
141031 10:12:10 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
141031 10:12:10 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'donor' --address '30.30.30.30:4444/rsync_sst' --auth '(null)' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf'  --binlog '/var/log/mysql/mariadb-bin' --gtid '20156cae-60a2-11e4-b7d8-0abd6e366161:0''
141031 10:12:10 [Note] WSREP: sst_donor_thread signaled with 0
141031 10:12:10 [Note] WSREP: Flushing tables for SST...
141031 10:12:10 [Note] WSREP: Provider paused at 20156cae-60a2-11e4-b7d8-0abd6e366161:0 (5)
141031 10:12:10 [Note] WSREP: Tables flushed.
WSREP_SST: [INFO] Preparing binlog files for transfer: (20141031 10:12:10.533)
mariadb-bin.000024

After about 10 minutes, the master server times out. The remaining log is here: http://pastebin.com/jayQmN43
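
The donor log stops at the rsync transfer to 30.30.30.30:4444, so one thing that may be worth ruling out is whether the Galera ports are reachable between the machines. The following is a rough sketch under that assumption, not a confirmed diagnosis. The port list uses the Galera defaults (3306 for clients, 4567 for group communication, 4568 for IST, 4444 for SST), and the helper names port_open and check_node are hypothetical.

```python
import socket

# Default ports used by a Galera node. Port 4444 also appears in the
# donor log's rsync address (30.30.30.30:4444).
GALERA_PORTS = (3306, 4444, 4567, 4568)


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host, ports=GALERA_PORTS, timeout=3.0):
    """Map each port to True (reachable) or False (closed or filtered)."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example, run from the donor against the joiner address seen in the log:
#     print(check_node("30.30.30.30"))
```

Note that the joiner only listens on port 4444 while it is waiting for a state transfer, so a closed 4444 outside that window does not necessarily point to a firewall problem.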

And finally, here is the config file from the primary server:

[mysqld]
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0

#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="vvalu_replicator"
wsrep_cluster_address="gcomm://10.10.10.10,20.20.20.20,30.30.30.30"
wsrep_sst_method=rsync

wsrep_node_address="10.10.10.10"
wsrep_node_name="vvalu_master_local"

Note: the OS is Ubuntu 14.04 x64, running the MariaDB 10.0 series.

Best Answer

I think you should switch back to MariaDB 5.5 for Galera support.