Mysql – WSREP: Writeset deserialization failed: Unsupported RecordSet version: 2: 71 (Protocol error)

galeramariadbMySQLwsrep

We are running a Mariadb galera cluster with 3 data nodes. Recently one of them crashed and I had to do a SST (State Snapshot Transfer). Nothing I haven't done or seen before. However, after the SST was completed the process crashed on the following WSREP error:

[ERROR] WSREP: Writeset deserialization failed: Unsupported RecordSet version: 2: 71 (Protocol error)
          at galerautils/src/gu_rset.cpp:header_version():272
          at galera/src/trx_handle.cpp:unserialize():268

complete logs:

mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] InnoDB: 128 rollback segment(s) are active.
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] InnoDB: Waiting for purge to start
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.39-83.1 started; log sequence number 335729536918
mysqld[1624]: 2018-10-24 15:03:41 139764732757760 [Note] InnoDB: Dumping buffer pool(s) not yet started
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] Plugin 'FEEDBACK' is disabled.
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] Starting crash recovery...
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] Crash recovery finished.
mysqld[1624]: 2018-10-24 15:03:41 139769088514304 [Note] Server socket created on IP: '0.0.0.0'.
mysqld[1624]: 2018-10-24 15:03:42 139769088514304 [Note] WSREP: Signalling provider to continue.
mysqld[1624]: 2018-10-24 15:03:42 139769088514304 [Note] WSREP: SST received: 1b859078-cacc-11e8-8e3e-4381b13e7545:4661474
mysqld[1624]: 2018-10-24 15:03:42 139769088514304 [Note] Reading of all Master_info entries succeded
mysqld[1624]: 2018-10-24 15:03:42 139769088514304 [Note] Added new Master_info '' to hash table
mysqld[1624]: 2018-10-24 15:03:42 139769088514304 [Note] /usr/sbin/mysqld: ready for connections.
mysqld[1624]: Version: '10.1.36-MariaDB-1~xenial'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
mysqld[1624]: 2018-10-24 15:03:42 139769088199424 [ERROR] WSREP: Writeset deserialization failed: Unsupported RecordSet version: 2: 71 (Protocol error)
mysqld[1624]:          at galerautils/src/gu_rset.cpp:header_version():272
mysqld[1624]:          at galera/src/trx_handle.cpp:unserialize():268
mysqld[1624]: WS flags:      0
mysqld[1624]: Trx proto:     3
mysqld[1624]: Trx source:    00000000-0000-0000-0000-000000000000
mysqld[1624]: Trx conn_id:   18446744073709551615
mysqld[1624]: Trx trx_id:    18446744073709551615
mysqld[1624]: Trx last_seen: -1
mysqld[1624]: 2018-10-24 15:03:42 139769088199424 [ERROR] WSREP: Unsupported RecordSet version: 2: 71 (Protocol error)
mysqld[1624]:          at galerautils/src/gu_rset.cpp:header_version():272
mysqld[1624]:          at galera/src/trx_handle.cpp:unserialize():268
mysqld[1624]: 2018-10-24 15:03:42 139769088199424 [Note] WSREP: applier thread exiting (code:7)
mysqld[1624]: 2018-10-24 15:03:42 139769088199424 [ERROR] WSREP: node consistency compromised, aborting
mysqld[1624]: 2018-10-24 15:03:42 139769088199424 [Note] WSREP: starting shutdown
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] /usr/sbin/mysqld: Normal shutdown
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Stop replication
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closing send monitor...
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closed send monitor.
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: terminating thread
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: joining thread
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: closing backend
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: 3.0 (dbserver08): State transfer from 2.0 (DB) complete.
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Shifting JOINER -> JOINED (TO: 4666631)
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: view(view_id(NON_PRIM,2b30f5eb,167) memb {
mysqld[1624]:         7bffa222,0
mysqld[1624]: } joined {
mysqld[1624]: } left {
mysqld[1624]: } partitioned {
mysqld[1624]:         2b30f5eb,0
mysqld[1624]:         30d0c86b,0
mysqld[1624]:         5a7b95e7,0
mysqld[1624]:         b0f6da74,0
mysqld[1624]:         f7e81556,0
mysqld[1624]: })
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: view((empty))
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: gcomm: closed
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Flow-control interval: [16, 16]
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Trying to continue unpaused monitor
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Received NON-PRIMARY.
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Shifting JOINED -> OPEN (TO: 4666631)
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Received self-leave message.
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Flow-control interval: [0, 0]
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Trying to continue unpaused monitor
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Received SELF-LEAVE. Closing connection.
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 4666631)
mysqld[1624]: 2018-10-24 15:03:42 139768797067008 [Note] WSREP: RECV thread exiting 0: Success
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: recv_thread() joined.
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closing replication queue.
mysqld[1624]: 2018-10-24 15:03:42 139769010707200 [Note] WSREP: Closing slave action queue.
mysqld[1624]: 2018-10-24 15:03:44 139769088502528 [Note] WSREP: rollbacker thread exiting
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] Event Scheduler: Purging the queue. 0 events
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: dtor state: JOINING
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: apply mon: entered 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: apply mon: entered 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: mon: entered 1 oooe fraction 0 oool fraction 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: cert index usage at exit 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: cert trx map usage at exit 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: deps set usage at exit 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: avg deps dist 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: avg cert interval 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: cert index size 0
mysqld[1624]: 2018-10-24 15:03:44 139768863397632 [Note] WSREP: Service thread queue flushed.
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: wsdb trx map usage 0 conn query map usage 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: MemPool(LocalTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Note] WSREP: MemPool(SlaveTrxHandle): hit ratio: 0, misses: 1, in use: 1, in pool: 0
mysqld[1624]: 2018-10-24 15:03:44 139769010707200 [Warning] WSREP: Waiting for 5168 items to be fetched.

What does this WSREP error means?

I tried googling this specific error message but nothing came up.

I also checked mariadb versions differences between the nodes as I recently had to reinstall a node. But I couldn't spot any difference between them.

Best Answer

You have different versions of the Galera plugin, not the MariaDB server. I had the same issue during deployment, one node was updated and the other two not.

Please check on all nodes:

(none)> show global variables like 'wsrep_patch_version';
+---------------------+-------------+
| Variable_name       | Value       |
+---------------------+-------------+
| wsrep_patch_version | wsrep_25.23 |
+---------------------+-------------+
1 row in set
Time: 0.589s
(none)> select VERSION();
+-----------------+
| VERSION()       |
+-----------------+
| 10.1.34-MariaDB |
+-----------------+
1 row in set
Time: 0.008s

Depending on your automation and Linux distribution you need to get exact matching package versions of wsrep plugin and MariaDB server.