MySQL 5.7.20 crashes after jemalloc installation

Recently I ran into a memory leak on one of my MySQL instances (5.7.20): even though the buffer pool was sized at 50% of RAM, mysqld memory utilization was constantly pegged at 90%.
I found a similar bug, https://bugs.mysql.com/bug.php?id=83047, and in my case bulk load is also the predominant workload.
So I installed jemalloc and changed /etc/sysconfig/mysql so that mysqld uses jemalloc instead of the default malloc().
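
One way to confirm that the allocator swap actually took effect is to look at the libraries mapped into the running mysqld process. The following is a minimal sketch, assuming a Linux host with /proc mounted; the helper names and the library name patterns are illustrative assumptions, not part of MySQL or jemalloc.

```python
import re
from pathlib import Path

# Assumed library name fragments; adjust if your packages name them differently.
ALLOCATOR_PATTERNS = {
    "jemalloc": re.compile(r"libjemalloc[^/]*\.so"),
    "tcmalloc": re.compile(r"libtcmalloc[^/]*\.so"),
}


def allocators_in_maps(maps_text):
    """Return sorted names of known allocators found in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5]
        for name, pattern in ALLOCATOR_PATTERNS.items():
            if pattern.search(path):
                found.add(name)
    return sorted(found)


def allocators_for_pid(pid):
    """Read the live mappings of a process (hypothetical convenience wrapper)."""
    return allocators_in_maps(Path(f"/proc/{pid}/maps").read_text())
```

Calling `allocators_for_pid(<mysqld pid>)` as root or as the mysql user should list `jemalloc` if the preload worked; an empty list suggests mysqld is still on the default allocator.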

The memory leak is fixed now. But since this change MySQL crashes often, and I cannot tell from the error log what exactly is causing the crash.

01:08:08 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=18
max_threads=151
thread_count=16
connection_count=16
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338785 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f6f6bc16000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f86e055ce30 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xef8feb]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x7b0191]
/lib64/libpthread.so.0(+0xf5e0)[0x7f86fee6b5e0]
/usr/sbin/mysqld(_Z32innobase_parse_hint_from_commentP3THDP12dict_table_tPK11TABLE_SHARE+0x2d0)[0xf2ed50]
/usr/sbin/mysqld(_ZN19create_table_info_t24create_table_update_dictEv+0x119)[0xf41c09]
/usr/sbin/mysqld(_ZN11ha_innobase6createEPKcP5TABLEP24st_ha_create_information+0x127)[0xf436b7]
/usr/sbin/mysqld(_ZN11ha_innopart20create_new_partitionEP5TABLEP24st_ha_create_informationPKcjP17partition_element+0xcd)[0xf53aad]
/usr/sbin/mysqld(_ZN16Partition_helper17change_partitionsEP24st_ha_create_informationPKcPyS4_+0x489)[0xc255d9]
/usr/sbin/mysqld[0xcccdee]
/usr/sbin/mysqld(_Z26fast_alter_partition_tableP3THDP5TABLEP10Alter_infoP24st_ha_create_informationP10TABLE_LISTPcPKcP14partition_info+0x52c)[0xcd78cc]
/usr/sbin/mysqld(_Z17mysql_alter_tableP3THDPKcS2_P24st_ha_create_informationP10TABLE_LISTP10Alter_info+0xd43)[0xd309e3]
/usr/sbin/mysqld(_ZN19Sql_cmd_alter_table7executeEP3THD+0x4f8)[0xe2e648]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x5d0)[0xcc35d0]
/usr/sbin/mysqld(_ZN18Prepared_statement7executeEP6Stringb+0x357)[0xcf1397]
/usr/sbin/mysqld(_ZN18Prepared_statement12execute_loopEP6StringbPhS2_+0xda)[0xcf43ca]
/usr/sbin/mysqld(_Z22mysql_sql_stmt_executeP3THD+0xfc)[0xcf48ac]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x198d)[0xcc498d]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
/usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
/usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
/usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
/usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
/usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
/usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
/usr/sbin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3b5)[0xcc99a5]
/usr/sbin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0xa8a)[0xcca4aa]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x19f)[0xccbeef]
/usr/sbin/mysqld(handle_connection+0x288)[0xd8b668]
/usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0x126f4a4]
/lib64/libpthread.so.0(+0x7e25)[0x7f86fee63e25]
/lib64/libc.so.6(clone+0x6d)[0x7f86fd92034d]
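
A backtrace like the one above can be easier to reason about once it is broken into frames. The sketch below is only an illustration, assuming the `binary(symbol+offset)[address]` layout seen in this log; `parse_frames` and `count_symbol` are hypothetical helper names. It extracts each frame and counts how often a symbol fragment occurs, which is one way to see how deeply the stored procedure calls are nested.

```python
import re

# Matches frames such as:
#   /usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xef8feb]
#   /lib64/libpthread.so.0(+0xf5e0)[0x7f86fee6b5e0]   (empty symbol)
#   /usr/sbin/mysqld[0xcccdee]                        (no symbol part)
FRAME_RE = re.compile(
    r"^(?P<binary>\S+?)"
    r"(?:\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(text):
    """Return a list of dicts, one per stack frame found in the log text."""
    frames = []
    # Splitting on whitespace also handles two frames pasted onto one line.
    for token in text.split():
        match = FRAME_RE.match(token)
        if match:
            frames.append(match.groupdict())
    return frames


def count_symbol(frames, fragment):
    """Count frames whose (mangled) symbol contains the given fragment."""
    return sum(1 for f in frames if f["symbol"] and fragment in f["symbol"])
```

The mangled names can then be passed through a demangler such as `c++filt` if it is installed. In this trace the topmost MySQL frames are in the InnoDB ADD PARTITION path, reached through several nested `execute_procedure` frames.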

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f69a47be040): ALTER TABLE HN_QOS_DATA_0666 ADD PARTITION( partition p737487 values less than ( '2019-03-04' ))
Connection ID (thread ID): 51548
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html
contains information that should help you find out what is causing the
crash.

After that, crash recovery starts:

2019-02-02T01:08:26.053673Z 0 [Warning] Could not increase number of max_open_files to more than 10000 (request: 10161)
2019-02-02T01:08:26.054509Z 0 [Warning] Changed limits: table_open_cache: 4919 (requested 5000)
2019-02-02T01:08:26.252510Z 0 [Warning] The syntax '--log_warnings/-W' is deprecated and will be removed in a future release. Please use '--log_error_verbosity' instead.
2019-02-02T01:08:26.252626Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-02-02T01:08:26.252704Z 0 [Warning] Insecure configuration for --secure-file-priv: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99

I have PMM monitoring on this instance, and I can clearly see a spike in swapping and I/O, as well as spikes in some other charts. But I am not able to conclude what is causing the crash: a particular query, memory pressure, or something else.
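
The swapping seen in the monitoring charts can be cross-checked on the host itself by sampling the kernel's memory counters alongside mysqld's own usage. This is a rough sketch, assuming a Linux kernel that exposes `MemAvailable` in /proc/meminfo and `VmRSS`/`VmSwap` in /proc/<pid>/status; the function names are made up for illustration.

```python
from pathlib import Path


def parse_kb_fields(text):
    """Parse 'Name:   12345 kB' lines into a {name: kilobytes} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only fields reported in kB; skip counters such as HugePages_Total.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_snapshot(pid):
    """Collect system-wide and per-process memory figures for one process."""
    system = parse_kb_fields(Path("/proc/meminfo").read_text())
    process = parse_kb_fields(Path(f"/proc/{pid}/status").read_text())
    return {
        "mem_available_kb": system.get("MemAvailable"),
        "swap_free_kb": system.get("SwapFree"),
        "mysqld_rss_kb": process.get("VmRSS"),
        "mysqld_swap_kb": process.get("VmSwap"),
    }
```

Logging `memory_snapshot(<mysqld pid>)` every few seconds from a cron job or a small loop would show whether available memory and free swap were close to exhaustion just before the crash timestamp in the error log.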

Even during the memory leak, MySQL never crashed on this server; it only began crashing after I started using jemalloc.

1) What should I look at (in MySQL and in PMM) to find the exact cause of the crash?
2) Can using the jemalloc library cause MySQL to crash?
3) How can I rule out memory pressure as the cause of the crash?
4) Is it better to use tcmalloc instead of jemalloc?

Thanks in advance.

Best Answer

Thanks for your time. It looks like jemalloc and vm.swappiness=1 do not go well together, I guess. I switched to the tcmalloc library and raised vm.swappiness to 30, and MySQL is not crashing now (earlier it was crashing every 3 days).
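
To verify that the kernel is really running with the intended swappiness after such a change, the live value can be read back from /proc. A minimal sketch, assuming Linux; the helper names are illustrative, and the path parameter exists only so the function can be pointed at a test file.

```python
from pathlib import Path

SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an integer."""
    return int(Path(path).read_text().strip())


def swappiness_matches(expected, path=SWAPPINESS_PATH):
    """True if the running kernel value equals the expected setting."""
    return read_swappiness(path) == expected
```

For the setup described here, `swappiness_matches(30)` should return True; if it does not, the setting was probably not applied at runtime or not persisted across a reboot.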