CentOS – Pulling all of a process’s swapped memory out of swap


How would one go about quickly pulling all of a process's swapped memory out of swap without writing to disk?

The context of this issue is incidental, since the systemic problem that prompts the question is being handled by other parties. For now, though, I frequently have to free up swap space on an OpenVZ node while load and IO wait are extremely high.

The swap is often primarily consumed by a small handful of MySQL and clamd processes running on individual containers. Restarting these services frees the swap and solves the problem on the node, but is undesirable for obvious reasons.

I'm looking for a way to quickly free up the swap from those processes while the node is overloaded and need something faster than my current method:

unswap(){ [[ $1 && $(ls /proc/$1/maps) ]]  && ((gcore -o /tmp/deleteme $1 &>/dev/null; rm -fv /tmp/deleteme.$1)&) 2>/dev/null  || echo "must provide valid pid";};unswap

This core dump forces all of the process's memory to be accessed and thus does the job of pulling it out of swap, but I've yet to find a way to avoid writing the dump to a file. It also seems like the process would be faster if I could isolate the address ranges that are currently swapped and dump just that portion to /dev/null, but I've yet to find a way to do that.

This is a huge node, so the usual swapoff/swapon method is prohibitively time-consuming, and again, the node's configuration is not under my control, so fixing the root cause is not part of this question. However, any insight into how I could free up a significant portion of swap quickly without killing or restarting anything would be appreciated.

Environment: CentOS 6.7/OpenVZ

Update for anyone that may stumble on this later:

Using Jlong's input, I created the following function:

unswap(){ (awk -F'[ \t-]+' '/^[a-f0-9]*-[a-f0-9]* /{recent="0x"$1" 0x"$2}/Swap:/&&$2>0{print recent}' /proc/$1/smaps | while read astart aend; do gdb --batch --pid $1 -ex "dump memory /dev/null $astart $aend" &>/dev/null; done&)2>/dev/null;};

It's a bit slow, but otherwise it does exactly what was requested here. The speed could probably be improved by targeting only the largest swapped address ranges and skipping the iterations for the trivially small ones, but the premise is sound.

Working example:

#Find the process with the highest swap use
[~]# grep VmSwap /proc/*/status 2>/dev/null | sort -nk2 | tail -n1 | while read line; do fp=$(echo $line | cut -d: -f1); echo $line" "$(stat --format="%U" $fp)" "$(grep -oP "(?<=Name:\s).*" $fp); done | column -t
/proc/6225/status:VmSwap:   230700  kB  root  mysqld

#Dump the swapped address ranges and observe the swap use of the proc over time
[~]# unswap(){ (awk -F'[ \t-]+' '/^[a-f0-9]*-[a-f0-9]* /{recent="0x"$1" 0x"$2}/Swap:/&&$2>0{print recent}' /proc/$1/smaps | while read astart aend; do gdb --batch --pid $1 -ex "dump memory /dev/null $astart $aend" &>/dev/null; done&)2>/dev/null;}; unswap 6225; while true; do grep VmSwap /proc/6225/status; sleep 1; done
VmSwap:   230700 kB
VmSwap:   230700 kB
VmSwap:   230676 kB
VmSwap:   229824 kB
VmSwap:   227564 kB
... 36 lines omitted for brevity ... 
VmSwap:     9564 kB
VmSwap:     3212 kB
VmSwap:     1876 kB
VmSwap:       44 kB
VmSwap:        0 kB

Final solution for bulk-dumping just the large chunks of swapped memory:

unswap(){ (awk -F'[ \t-]+' '/^[a-f0-9]*-[a-f0-9]* /{recent="0x"$1" 0x"$2}/Swap:/&&$2>1000{print recent}' /proc/$1/smaps | while read astart aend; do gdb --batch --pid $1 -ex "dump memory /dev/null $astart $aend" &>/dev/null; done&)2>/dev/null;}; grep VmSwap /proc/*/status 2>/dev/null | sort -nk2 | tail -n20 | cut -d/ -f3 | while read line; do unswap $line; done;echo "Dumps Free(m)"; rcount=10; while [[ $rcount -gt 0 ]]; do rcount=$(ps fauxww | grep "dump memory" | grep -v grep | wc -l); echo "$rcount        $(free -m | awk '/Swap/{print $4}')"; sleep 1; done 
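Instead of a fixed 1000 kB cutoff, the ranges could also be ranked by their Swap value so that only the few largest are dumped. The following is a minimal sketch of that idea, not a tested replacement for the function above: largest_swapped_ranges is a hypothetical helper name, and it assumes the smaps layout in which each mapping header line is followed by a "Swap:" line in kB.

```shell
# largest_swapped_ranges SMAPS_FILE [COUNT]
# Hypothetical helper: prints "0xSTART 0xEND" for the COUNT mappings
# (default 10) with the highest non-zero Swap: value, largest first.
largest_swapped_ranges() {
    awk '
        # Mapping header, e.g. "02205000-05222000 rw-p 00000000 00:00 0"
        /^[0-9a-f]+-[0-9a-f]+ / {
            split($1, a, "-")
            range = "0x" a[1] " 0x" a[2]
            next
        }
        # Emit the swap size first so that sort can order on it
        $1 == "Swap:" && $2 > 0 { print $2, range }
    ' "$1" | sort -rn | head -n "${2:-10}" | awk '{ print $2, $3 }'
}
```

The output has the same "start end" shape that the while read astart aend loop consumes, so something like largest_swapped_ranges /proc/$1/smaps 5 could stand in for the awk stage of unswap.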

I have yet to determine if this method poses any risk to the health of the process or system, especially when looped over multiple processes concurrently. If anyone has insight into any potential effect this may have on the processes or system, please feel free to comment.

Best Answer

You can achieve the same result by using GDB's 'dump memory' command and having it write to /dev/null.

You just need to find the regions in /proc/$PID/smaps that need to be unswapped. For example, from /proc/$PID/smaps:

02205000-05222000 rw-p 00000000 00:00 0 
Size:              49268 kB
Rss:               15792 kB
Pss:                9854 kB
Shared_Clean:          0 kB
Shared_Dirty:      11876 kB
Private_Clean:         0 kB
Private_Dirty:      3916 kB
Referenced:          564 kB
Anonymous:         15792 kB
AnonHugePages:         0 kB
Swap:              33276 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
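A minimal sketch for picking such regions out of smaps automatically, assuming the layout shown above (a header line with the address range, followed by a "Swap:" line in kB). The helper name list_swapped_ranges and its optional minimum-size argument are made up for illustration.

```shell
# list_swapped_ranges SMAPS_FILE [MIN_KB]
# Hypothetical helper: prints "0xSTART 0xEND SWAP_KB" for every mapping
# whose Swap: value is greater than MIN_KB (default 0).
list_swapped_ranges() {
    awk -v min="${2:-0}" '
        # Mapping header, e.g. "02205000-05222000 rw-p 00000000 00:00 0"
        /^[0-9a-f]+-[0-9a-f]+ / {
            split($1, addr, "-")
            range = "0x" addr[1] " 0x" addr[2]
            next
        }
        # Per-mapping swap counter, e.g. "Swap:   33276 kB"
        $1 == "Swap:" && $2 + 0 > min + 0 { print range, $2 }
    ' "$1"
}
```

For the region above, list_swapped_ranges /proc/$PID/smaps would print 0x02205000 0x05222000 33276, which gives the two addresses that 'dump memory' needs.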

Then run gdb in --batch mode to execute the command non-interactively, so you can use it in your function:

[root@nunya ~]# swapon -s ; gdb --batch --pid 33795 -ex "dump memory /dev/null 0x02205000 0x05222000" ;swapon -s
Filename                Type        Size    Used    Priority
/dev/sda2                               partition   7811068 7808096 -1

[Thread debugging using libthread_db enabled]

Filename                Type        Size    Used    Priority
/dev/sda2                               partition   7811068 7796012 -1
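When several regions need to be read, one possible variation is to attach gdb once and feed it all of the 'dump memory' commands from a command file via -x, rather than starting gdb once per range. This is an untested sketch under that assumption; gdb_dump_commands is a hypothetical helper name. One trade-off to be aware of: gdb stops processing a command file at the first command that fails, so an unreadable range would skip the ones after it.

```shell
# gdb_dump_commands SMAPS_FILE [MIN_KB]
# Hypothetical helper: prints one gdb "dump memory" command per mapping
# whose Swap: value is greater than MIN_KB (default 0).
gdb_dump_commands() {
    awk -v min="${2:-0}" '
        # Remember the address range of the mapping currently being read
        /^[0-9a-f]+-[0-9a-f]+ / {
            split($1, a, "-")
            first = a[1]
            last = a[2]
            next
        }
        $1 == "Swap:" && $2 + 0 > min + 0 {
            print "dump memory /dev/null 0x" first " 0x" last
        }
    ' "$1"
}

# Intended use (not run here; needs gdb and permission to attach to the PID):
#   cmds=$(mktemp)
#   gdb_dump_commands "/proc/$pid/smaps" 1000 > "$cmds"
#   gdb --batch --pid "$pid" -x "$cmds" >/dev/null 2>&1
#   rm -f "$cmds"
```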