My mysql instance (ver 5.6.33) on AWS ( r3.4xlarge CPU Cores : 16 , RAM : 120 GB
, filesystem : XFS ) is crashing with signal 11 .
The instance is EBS optimised with EBS for storage.
At around 9:04 IST, the apport began writing the crash report. It did not finish till 9:29 where we killed apport ( kill -9 46697) and mysql respawned on its own.
PFA the top when we were about to kill apport.
PFA cpu and available memory graphs, also active number of connections.
Additional Info
#uname -a
Linux 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
Crash dump in /var/crash/
ProblemType: Crash
Architecture: amd64
Date: Mon Nov 21 09:04:26 2016
DistroRelease: Ubuntu 14.04
ExecutablePath: /usr/sbin/mysqld
ExecutableTimestamp: 1474318064
ProcCmdline: /usr/sbin/mysqld
ProcCwd: /var/lib/mysql
ProcEnviron:
PATH=(custom, no user)
TERM=linux
ProcMaps:
.... ---
ProcStatus:
Name: mysqld
State: D (disk sleep)
Tgid: 1591
Ngid: 0
Pid: 1591
PPid: 1
TracerPid: 0
Uid: 109 109 109 109
Gid: 116 116 116 116
FDSize: 16384
Groups: 116
VmPeak: 104777096 kB
VmSize: 104214224 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 98602088 kB
VmRSS: 98469508 kB
VmData: 104160476 kB
VmStk: 136 kB
VmExe: 14068 kB
VmLib: 4564 kB
VmPTE: 197320 kB
VmSwap: 0 kB
Threads: 3363
SigQ: 7/983541
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000087007
SigIgn: 0000000000001000
SigCgt: 00000001800062e9
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list: 0-127
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 4249548
nonvoluntary_ctxt_switches: 168757
Signal: 11
Uname: Linux 3.13.0-92-generic x86_64
Mysql ERROR log didn't have any crash report ( NO LOGS on 9:04 )
2016-11-21 09:29:51 47911 [Note] Failed to execute mysql_file_stat on file '/var/log/mysql/mysql-bin.006775'
2016-11-21 09:29:51 47911 [Note] Plugin 'FEDERATED' is disabled.
2016-11-21 09:29:51 47911 [ERROR] Function 'innodb' already exists
2016-11-21 09:29:51 47911 [Warning] Couldn't load plugin named 'innodb' with soname 'ha_innodb.so'.
2016-11-21 09:29:51 47911 [ERROR] Function 'federated' already exists
2016-11-21 09:29:51 47911 [Warning] Couldn't load plugin named 'federated' with soname 'ha_federated.so'.
2016-11-21 09:29:51 47911 [ERROR] Function 'blackhole' already exists
2016-11-21 09:29:51 47911 [Warning] Couldn't load plugin named 'blackhole' with soname 'ha_blackhole.so'.
2016-11-21 09:29:51 47911 [ERROR] Function 'archive' already exists
2016-11-21 09:29:51 47911 [Warning] Couldn't load plugin named 'archive' with soname 'ha_archive.so'.
2016-11-21 09:29:51 47911 [Note] InnoDB: Using atomics to ref count buffer pool pages
2016-11-21 09:29:51 47911 [Note] InnoDB: The InnoDB memory heap is disabled
2016-11-21 09:29:51 47911 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2016-11-21 09:29:51 47911 [Note] InnoDB: Memory barrier is not used
2016-11-21 09:29:51 47911 [Note] InnoDB: Compressed tables use zlib 1.2.8
2016-11-21 09:29:51 47911 [Note] InnoDB: Using Linux native AIO
2016-11-21 09:29:51 47911 [Note] InnoDB: Using CPU crc32 instructions
2016-11-21 09:29:51 47911 [Note] InnoDB: Initializing buffer pool, size = 60.0G
2016-11-21 09:29:53 47911 [Note] InnoDB: Completed initialization of buffer pool
2016-11-21 09:29:54 47911 [Note] InnoDB: Highest supported file format is Barracuda.
2016-11-21 09:29:54 47911 [Note] InnoDB: Log scan progressed past the checkpoint lsn 22930589928913
2016-11-21 09:29:54 47911 [Note] InnoDB: Database was not shutdown normally!
2016-11-21 09:29:54 47911 [Note] InnoDB: Starting crash recovery.
2016-11-21 09:29:54 47911 [Note] InnoDB: Reading tablespace information from the .ibd files...
2016-11-21 09:29:56 47911 [Note] InnoDB: Restoring possible half-written data pages
2016-11-21 09:29:56 47911 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 22930595171328
InnoDB: Doing recovery: scanned up to log sequence number 22930600414208
InnoDB: Doing recovery: scanned up to log sequence number 22930605657088
InnoDB: Doing recovery: scanned up to log sequence number 22930610899968
InnoDB: Doing recovery: scanned up to log sequence number 22930616142848
InnoDB: Doing recovery: scanned up to log sequence number 22930621385728
InnoDB: Doing recovery: scanned up to log sequence number 22930626628608
InnoDB: Doing recovery: scanned up to log sequence number 22930631871488
InnoDB: Doing recovery: scanned up to log sequence number 22930637114368
InnoDB: Doing recovery: scanned up to log sequence number 22930642357248
InnoDB: Doing recovery: scanned up to log sequence number 22930647600128
InnoDB: Doing recovery: scanned up to log sequence number 22930652843008
InnoDB: Doing recovery: scanned up to log sequence number 22930658085888
InnoDB: Doing recovery: scanned up to log sequence number 22930663328768
InnoDB: Doing recovery: scanned up to log sequence number 22930668571648
InnoDB: Doing recovery: scanned up to log sequence number 22930673814528
InnoDB: Doing recovery: scanned up to log sequence number 22930679057408
InnoDB: Doing recovery: scanned up to log sequence number 22930684300288
InnoDB: Doing recovery: scanned up to log sequence number 22930689543168
InnoDB: Doing recovery: scanned up to log sequence number 22930694786048
InnoDB: Doing recovery: scanned up to log sequence number 22930700028928
InnoDB: Doing recovery: scanned up to log sequence number 22930705271808
InnoDB: Doing recovery: scanned up to log sequence number 22930710514688
InnoDB: Doing recovery: scanned up to log sequence number 22930715757568
InnoDB: Doing recovery: scanned up to log sequence number 22930721000448
InnoDB: Doing recovery: scanned up to log sequence number 22930726243328
InnoDB: Doing recovery: scanned up to log sequence number 22930731486208
InnoDB: Doing recovery: scanned up to log sequence number 22930736729088
InnoDB: Doing recovery: scanned up to log sequence number 22930741971968
InnoDB: Doing recovery: scanned up to log sequence number 22930747214848
InnoDB: Doing recovery: scanned up to log sequence number 22930752457728
InnoDB: Doing recovery: scanned up to log sequence number 22930757700608
InnoDB: Doing recovery: scanned up to log sequence number 22930762943488
InnoDB: Doing recovery: scanned up to log sequence number 22930763892518
InnoDB: Transaction 48125747425 was in the XA prepared state.
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 0 row operations to undo
InnoDB: Trx id counter is 48125747712
2016-11-21 09:30:02 47911 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 2
SYSLOG at that time
Nov 21 09:05:01 localhost CRON[46747]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Nov 21 09:15:01 localhost CRON[47316]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Nov 21 09:17:01 localhost CRON[47334]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Nov 21 09:19:37 localhost dhclient: DHCPREQUEST of 10.0.42.189 on eth0 to 10.0.40.1 port 67 (xid=0x2441a7cd)
Nov 21 09:19:37 localhost dhclient: DHCPACK of 10.0.42.189 from 10.0.40.1
Nov 21 09:19:37 localhost dhclient: bound to 10.0.42.189 -- renewal in 1592 seconds.
Nov 21 09:25:01 localhost CRON[47648]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Nov 21 09:29:50 localhost kernel: [2888896.849170] init: mysql main process (1591) killed by SEGV signal
Nov 21 09:29:50 localhost kernel: [2888896.849184] init: mysql main process ended, respawning
Nov 21 09:29:50 localhost kernel: [2888896.886594] audit_printk_skb: 18 callbacks suppressed
Nov 21 09:29:50 localhost kernel: [2888896.886598] type=1400 audit(1479700790.336:18): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/mysqld" pid=47900 comm="apparmor_parser"
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48231]: Upgrading MySQL tables if necessary.
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48235]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48235]: Looking for 'mysql' as: /usr/bin/mysql
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48235]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48235]: Error: Failed while fetching Server version! Could be due to unauthorized access.
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48235]: FATAL ERROR: Upgrade failed
Nov 21 09:30:12 localhost /etc/mysql/debian-start[48245]: Checking for insecure root accounts.
my.cnf file
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
server-id = 42189
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
max-connect-errors = 100000
key_buffer_size = 30M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
myisam-recover-options = BACKUP
query_cache_type = 2
query_cache_limit = 2M
query_cache_size = 64M
log_error = /var/log/mysql/error.log
slow_query_log_file = '/var/log/mysql/slow.log'
slow_query_log = 1
log_output = 'FILE'
long_query_time = 2
expire_logs_days = 7
max_binlog_size = 100M
binlog_format = mixed
log-bin = /var/log/mysql/mysql-bin.log
log_slave_updates = 1
innodb_print_all_deadlocks=1
sql_mode = NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
innodb_buffer_pool_size = 60G
innodb_buffer_pool_instances = 20
innodb_log_file_size = 500M
innodb_log_buffer_size = 50M
innodb_flush_log_at_trx_commit = 2
innodb_buffer_pool_dump_at_shutdown = 1
innodb_buffer_pool_load_at_startup = 1
relay_log_recovery=1
relay_log_info_repository=TABLE
max_connections = 8000
max_heap_table_size = 64M
tmp_table_size = 64M
sort_buffer_size = 256K
join_buffer_size = 512K
table_open_cache = 4000
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[isamchk]
key_buffer_size = 30M
[mysqld_safe]
syslog
Update :
See comments by Michael ( https://dba.stackexchange.com/users/11651/michael-sqlbot ) in the question. We shut down apport and although we did not find a solution of SIGSEGV , apport did not make things worse.
Best Answer
A "load average" of 3000 says says
max_connections
is set unreasonably high, andSo...
my.cnf
; there may be other things that are set unreasonably.long_query_time=1
, turn on the slowlog, wait a day, runpt-query-digest
on the slowlog, and let's see the 'worst' queries, together withSHOW CREATE TABLE
. It may be something as simple as adding selected composite index(es) to table(s).innodb_buffer_pool_size
? It sounds like that could be increased, but it may not need to be.