MySQL 5.7.21 not releasing memory after writing partition to disk

Tags: amazon-ec2, memory, mysql-5.7, performance, performance-testing

I've been testing extracting data with SELECT * from single partitions of a partitioned table in MySQL 5.7.21 and 5.7.22, and I'm out of ideas and hitting my head against a wall. The goal is to write the data to S3 for other apps to consume.
I appreciate any feedback, and please let me know if you need more information.

TL;DR: The memory on the server never recovers after an extract (SELECT * written to an OS file), and the server eventually hangs once enough extracts have run.

Notes:

  1. Using innodb_file_per_table and no limits set for mysql in /etc/limits.conf (/proc/pid/limits listed below)
  2. This is a dedicated server spun up just for this extraction testing.
  3. The server is an EC2 r4.2xlarge (61 GB, 4 vCPU), memory optimized.
  4. Initially the my.cnf settings were the same as those used in production, where we don't have this load pattern (it is new).

In my searching I've found some contradictory suggestions on how to tune for stalling. I don't think this is stalling behavior, though, because the memory is never reclaimed.

Using grep Mem /proc/meminfo, top in batch mode, iostat, and netstat -i to look at the system holistically, I ruled out disk saturation, network traffic saturation, and the size of the files being written to disk.
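
A minimal sketch of the kind of loop used to watch those counters while an extract runs (the interval and log file name are arbitrary):

    # Snapshot memory, CPU, disk, and network counters every 30s while an extract runs
    # (interval and log file name are arbitrary)
    while true; do
        date                      >> monitor.log
        grep Mem /proc/meminfo    >> monitor.log
        top -b -n 1 | head -n 20  >> monitor.log
        iostat -x 1 1             >> monitor.log
        netstat -i                >> monitor.log
        sleep 30
    done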

I narrowed it down to just memory being the issue.

I can consistently

  1. Stop mysql and restart it; memory is fully recovered.
  2. Extract data to an OS file from a large or small partition, or from several in series (these tests were run with no binary logging and no compression on the output); a sketch of one extract is below this list.
  3. After enough partitions have been written to disk, the OS prints "cannot fork" to stdout and the system hangs.
  4. Restarting the server frees up memory, and the extract process can be resumed from where it left off.
  5. CPU can spike to 100% on the larger extractions (anything above 10 GB), but the system does not stall on CPU.
  6. FLUSH TABLES after an extract has no effect, and parameter tweaking has no effect (I don't have the right combination / don't know which combination comes into play).
  7. uptime shows the load never goes above 2 for the 1, 5, and 15 minute intervals.
  8. The status command reports that Open tables never gets above 500, and the open files limit is 5000.
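
A minimal sketch of what one extract in step 2 looks like, assuming the result set is written to an OS file through the mysql client (the output path and file name are placeholders; the schema, table, and partition names are the anonymized ones from the digest report below):

    # One extract: dump a single partition to a tab-separated OS file.
    # Output path/file name are placeholders; schema/table/partition names
    # are the anonymized ones from the performance_schema digest below.
    mysql --batch \
        -e "SELECT * FROM \`schema\`.\`partitioned_table\` PARTITION (\`date_partition\`)" \
        > /data/extracts/date_partition.tsv

    # Step 6: flushing afterwards makes no difference to the memory behavior
    mysql -e "FLUSH TABLES"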

The Performance Schema report from MySQL Workbench states the query is in the top 5% by runtime, and that it is doing a full table scan (FTS) on that partition.

    Query:                  SELECT * FROM `schema`.`partitioned_table` PARTITION (`date_partition`)
    Full Table Scan:        *
    Executed (#):           1
    Errors (#):             0
    Warnings (#):           0
    Total Time:             267744528.92
    Maximum Time:           267744528.92
    Avg Time:               267744528.92
    Rows Sent (#):          7446849
    Avg. Rows Sent (#):     7446849.0
    Rows Scanned (#):       7446849
    Avg. Rows Scanned (#):  7446849.0
    Digest:                 fcb788d6ea76c986d767b282efa1ca11
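
The same numbers can be pulled straight from performance_schema with something along these lines (the digest value is the one reported above; the timer columns are in picoseconds in 5.7):

    -- Rough equivalent of the Workbench report, read directly from performance_schema
    SELECT DIGEST_TEXT,
           COUNT_STAR        AS executed,
           SUM_ERRORS        AS errors,
           SUM_WARNINGS      AS warnings,
           SUM_TIMER_WAIT    AS total_time,
           MAX_TIMER_WAIT    AS max_time,
           AVG_TIMER_WAIT    AS avg_time,
           SUM_ROWS_SENT     AS rows_sent,
           SUM_ROWS_EXAMINED AS rows_scanned,
           SUM_NO_INDEX_USED AS full_table_scans
    FROM performance_schema.events_statements_summary_by_digest
    WHERE DIGEST = 'fcb788d6ea76c986d767b282efa1ca11';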

I looked at MySQL not releasing memory and at some reported MySQL bugs.

Ulimits of the mysqld process, from /proc/(pid_of_mysqld)/limits:

cat /proc/30613/limits

   Limit                     Soft Limit           Hard Limit           Units
   Max cpu time              unlimited            unlimited            seconds
   Max file size             unlimited            unlimited            bytes
   Max data size             unlimited            unlimited            bytes
   Max stack size            8388608              unlimited            bytes
   Max core file size        0                    unlimited            bytes
   Max resident set          unlimited            unlimited            bytes
   Max processes             245523               245523               processes
   Max open files            5000                 5000                 files
   Max locked memory         65536                65536                bytes
   Max address space         unlimited            unlimited            bytes
   Max file locks            unlimited            unlimited            locks
   Max pending signals       245523               245523               signals
   Max msgqueue size         819200               819200               bytes
   Max nice priority         0                    0
   Max realtime priority     0                    0
   Max realtime timeout      unlimited            unlimited            us

Ulimits of the mysqld_safe process:

cat /proc/30023/limits

    Limit                     Soft Limit           Hard Limit           Units
    Max cpu time              unlimited            unlimited            seconds
    Max file size             unlimited            unlimited            bytes
    Max data size             unlimited            unlimited            bytes
    Max stack size            8388608              unlimited            bytes
    Max core file size        0                    unlimited            bytes
    Max resident set          unlimited            unlimited            bytes
    Max processes             245523               245523               processes
    Max open files            1024                 4096                 files
    Max locked memory         65536                65536                bytes
    Max address space         unlimited            unlimited            bytes
    Max file locks            unlimited            unlimited            locks
    Max pending signals       245523               245523               signals
    Max msgqueue size         819200               819200               bytes
    Max nice priority         0                    0
    Max realtime priority     0                    0
    Max realtime timeout      unlimited            unlimited            us

Comparing /proc/meminfo at its lowest point (just before the system is allowed to hang) against a clean state. Below is the diff of the values at their lowest, not the entire /proc/meminfo listing:

grep -Fxvf lowest_meminfo clean_meminfo

    MemFree:        27793236 kB
    MemAvailable:   54872272 kB
    Buffers:           71980 kB
    Cached:         27404904 kB
    Active:         11688912 kB
    Inactive:       22872372 kB
    Active(anon):    7084432 kB
    Active(file):    4604480 kB
    Inactive(file): 22872320 kB
    Dirty:                24 kB
    AnonPages:       7084488 kB
    Mapped:            41360 kB
    Slab:             335776 kB
    SReclaimable:     309948 kB
    SUnreclaim:        25828 kB
    KernelStack:        3792 kB
    PageTables:        21028 kB
    Committed_AS:   55038256 kB
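
The two snapshot files in that diff were captured simply by dumping /proc/meminfo at each point (the file names are my own):

    # Capture the two /proc/meminfo snapshots compared above
    cat /proc/meminfo > clean_meminfo     # right after a restart, before any extracts
    cat /proc/meminfo > lowest_meminfo    # at the lowest observed point, before the hang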

The error log on this system and on other production systems has noted the following during extracts, mysqldump runs, and LOAD DATA INFILE / SELECT ... INTO OUTFILE operations, which is what led me to articles about stalling.

   [Note] InnoDB: page_cleaner: 1000ms intended loop took 11681ms. The settings might not be optimal.
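
Those articles mostly revolve around the page cleaner and I/O capacity settings; purely as a sketch of how to inspect (not change) the values they discuss:

    -- Settings the "stalling" articles typically discuss; shown here only to read current values
    SHOW GLOBAL VARIABLES WHERE Variable_name IN
        ('innodb_buffer_pool_size', 'innodb_io_capacity', 'innodb_io_capacity_max',
         'innodb_lru_scan_depth', 'innodb_page_cleaners');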

Best Answer

What happened in this case was that the /etc/my.cnf file did not have the quick option in the [client] section.
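
Adding it is a one-line change to the option file:

    # /etc/my.cnf -- relevant fragment; `quick` tells the mysql client not to cache
    # the whole result set in memory, but to print each row as it is received
    [client]
    quick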

Using meminfo, iostat, netstat, top, uptime, and a while loop to look at the timestamps of the files being written to disk, I noticed that the output file would only be written out at the very end of the extract process.
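
The while loop was nothing more than watching the size and modification time of the output file (the path is a placeholder):

    # Watch the output file: without quick it stays empty until the very end of the extract
    while true; do
        ls -l --full-time /data/extracts/date_partition.tsv
        sleep 10
    done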

The ext4 filesystem was not a factor either, as iostat never showed it being pushed hard.

Memory still dips during an extract, but the data being read into memory to write to disk is now buffered (chunked) rather than held as one complete result set, and the result is that the system can handle multiple gigabyte-sized output files.
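
The same behavior is available without touching /etc/my.cnf by passing the option per invocation; a sketch of the extract command with it (output path is a placeholder):

    # --quick makes the client stream rows to the file as they arrive instead of
    # buffering the entire result set in client memory first (output path is a placeholder)
    mysql --quick --batch \
        -e "SELECT * FROM \`schema\`.\`partitioned_table\` PARTITION (\`date_partition\`)" \
        > /data/extracts/date_partition.tsv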