Copying large file causes excessive swap

Tags: cp, kvm, memory, swap

On a mid-range CentOS 6.4 (64-bit) server with 32 GB RAM and 3 TB of free disk space, operating as a KVM hypervisor, I start copying a 200 GB file to a destination on the same local filesystem. The file is a KVM virtual disk image (belonging to a shut-down VM). Twelve other VMs are up and running normally on the same machine.

I start with plenty of headroom:

[root@myserver]$ free
             total       used       free     shared    buffers     cached
Mem:      32847956   16722708   16125248          0      63756     407740
-/+ buffers/cache:   16251212   16596744
Swap:     16383992          0   16383992

But as the copy progresses, memory usage grows steadily until it hits swap. Of course, this slows everything down; the copy finally finishes after ~30 minutes.
At the end, my memory looks like this:

[root@myserver]$ free
             total       used       free     shared    buffers     cached
Mem:      32847956   32643564     204392          0      24392   23213400
-/+ buffers/cache:    9405772   23442184
Swap:     16383992   12057880    4326112
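
For reference, the swap traffic is easy to watch live with vmstat while the copy runs (a generic invocation, not my actual output):

[root@myserver]$ vmstat 5   # watch the si (swapped in) and so (swapped out) columns, in kB/s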

Looking at which processes are now using swap, I see it is several of the qemu-kvm instances. So the server's performance is suffering, as many if not all of the VMs are now swapping.
I can't find a way to get swap back to zero (its otherwise normal condition) without rebooting this production server.
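
For reference, one quick way to see which processes hold swap, using the VmSwap field that RHEL 6-era kernels expose in /proc/<pid>/status:

[root@myserver]$ grep VmSwap /proc/[0-9]*/status | sort -k2 -rn | head   # largest swap users first, sizes in kB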

What can cause this?
How can a simple cp process eat that much memory, and how can this be avoided?
Any comments?

Thanks

Best Answer

Getting swap back to 0 is not a useful goal.

There is nothing ipso facto wrong with having things in swap. It is quite possible for a program to load resources it doesn't actually use, and for the kernel to notice this and swap them out, freeing the memory for programs that can actually make use of it right now. This situation comes up a lot in today's bloatware world, where programs depend so heavily on huge libraries others provide, even though they need only a tiny fraction of the library's full capability.
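
How eagerly the kernel makes that trade-off is tunable via vm.swappiness (0-100, default 60 on RHEL/CentOS 6). If you really would rather the kernel drop page cache than swap out idle process memory, a sketch:

[root@myserver]$ sysctl vm.swappiness
vm.swappiness = 60
[root@myserver]$ sysctl -w vm.swappiness=10   # bias reclaim toward dropping cache instead of swapping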

The only hard numbers you have provided (200 GB in 30 minutes) also look quite good to me. That's about 114 MByte/sec, which is an impressive copy rate considering that you're copying a file within a single physical volume. It wasn't that long ago that 100 MByte/sec on purely sequential reads was pretty impressive. You're managing better than that with interleaved reads and writes!
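
For the record, the arithmetic behind that figure (200 GiB over 1,800 seconds):

[root@myserver]$ awk 'BEGIN { print 200 * 1024 / (30 * 60), "MByte/sec" }'
113.778 MByte/sec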

Bottom line, I think you're barking up the wrong tree.
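
That said, if you ever want a one-off bulk copy to stay out of the page cache entirely, direct I/O is one option. A sketch with GNU dd (the paths are placeholders, and bs is worth tuning for your storage):

# O_DIRECT on both the read and write side keeps the copy out of the page cache
[root@myserver]$ dd if=/vmstore/bigvm.img of=/vmstore/bigvm-copy.img bs=1M iflag=direct oflag=direct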
