Multiple Swap Files – Purpose and Benefits


During installation, most (if not all) Linux distros partition the hard drive to include a swap partition by default.

The order in which swap areas are used can be controlled with swapon -p priority.

According to the man pages, the priority is:

PRIORITY
Each swap area has a priority, either high or low. The default priority is 
low. Within the low-priority areas, newer areas are even lower priority 
than older areas.
All priorities set with swapflags are high-priority, higher than default. 
They may have any non-negative value chosen by the caller. Higher numbers 
mean higher priority.

Swap pages are allocated from areas in priority order, highest priority 
first. For areas with different priorities, a higher-priority area is 
exhausted before using a lower-priority area. If two or more areas have the 
same priority, and it is the highest priority available, pages are 
allocated on a round-robin basis between them.

As of Linux 1.3.6, the kernel usually follows these rules, but there are 
exceptions.
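
As a rough illustration of the rules quoted above, the following sketch lists swap areas in the order the kernel would prefer them. It assumes the five-column /proc/swaps layout (the same one swapon -s prints); swap_order is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# swap_order: print "PRIORITY FILENAME" for each swap area, highest priority first.
# Reads /proc/swaps-formatted text from the file named in $1 (default: /proc/swaps).
# Hypothetical helper, for illustration only.
swap_order() {
    # Skip the header line, emit the last column (priority) and the first
    # column (filename), then sort numerically in descending order.
    awk 'NR > 1 && NF >= 5 { print $NF, $1 }' "${1:-/proc/swaps}" | sort -k1,1nr
}
```

A priority is assigned when the area is enabled, for example with swapon -p 10 /dev/sdb1, or with the pri=10 option in the area's /etc/fstab entry; /dev/sdb1 is only a placeholder here.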

Why would you ever need more than one swap file?
Is it common practice for system administrators to configure more than one swap?

Best Answer

There are oh so many reasons to have multiple swap areas (they don't need to be files), even if you only have a single spindle.

20-20 hindsight: You deployed a machine with a single swap area, then eventually realised it's not enough. You can't redeploy the machine at will, but you can make another swap area (probably a file) until redoing the partition layout becomes an option.

Resizing or moving swap areas: You can't resize swap areas (as mentioned by Evan Teitelman). And you can't just swapoff, make a new swap area and then swapon again unless you have enough RAM: swapoff wants to move all the swapped out pages to RAM before letting go of the swap area. So you make a temporary swap area, swapoff the original, wait till all the pages have moved from the old swap area to the temporary one, resize the original swap partition, mkswap it, then swapon the resized one and swapoff the temporary one. The swapped pages are copied from the temporary swap area to the resized one, and you're done. If you're moving swap areas, you don't even need a temporary area. mkswap the new one, swapon it, then swapoff the old one and everything's moved.
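
The resize and move procedures described above could be sketched roughly as follows. This is a sketch under assumptions, not a tested recipe: the function names and the run wrapper are made up for illustration, every device or file name is supplied by the caller, and by default the commands are only printed (set DRY_RUN=0 and run as root to actually execute them). The temporary area must be big enough to hold whatever is currently swapped out.

```shell
#!/bin/sh
# run: print a command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

# Phase 1: create a temporary swap file and vacate the original swap area.
# usage: swap_evacuate OLD_SWAP TMP_SWAP TMP_SIZE_MB
swap_evacuate() {
    old=$1 tmp=$2 size_mb=$3
    run dd if=/dev/zero of="$tmp" bs=1M count="$size_mb" || return 1
    run chmod 600 "$tmp" || return 1
    run mkswap "$tmp" || return 1
    run swapon "$tmp" || return 1
    # swapoff returns only once every page has left the old area.
    run swapoff "$old"
}

# Phase 2: run this after resizing the partition with your partitioning tool.
# usage: swap_restore OLD_SWAP TMP_SWAP
swap_restore() {
    old=$1 tmp=$2
    run mkswap "$old" || return 1
    run swapon "$old" || return 1
    run swapoff "$tmp" || return 1
    run rm -f "$tmp"
}

# Moving instead of resizing: no temporary area is needed.
# usage: swap_move OLD_SWAP NEW_SWAP
swap_move() {
    old=$1 new=$2
    run mkswap "$new" || return 1
    run swapon "$new" || return 1
    run swapoff "$old"
}
```

For example, swap_evacuate /dev/sdX2 /swap.tmp 4096 prints the five commands of the first phase; /dev/sdX2 and /swap.tmp are placeholders.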

Crazy fast swapping: modern disks employ zone bit recording. The first zone of the disk is the fastest. You may want to measure the disk, and create a partition covering exactly the first, fastest zone of the drive. This may be smaller than your intended swap size. So you add multiple partitions on several disks, using the same technique.

Crazy fast swapping, the sequel: alternatively, once you know where your disks' fastest zones are, you can make high priority swap areas in the first zone, lower priority swap areas in the second zone, etc. This way your swapping system automatically knows to load balance across all fast disk zones, prefer the faster zones, and use the slower zones as an overflow area when the need arises.

Symmetric load balancing: on a nicely built system with many spindles (like a server), I like to have multiple swap partitions occupying the beginning of every disk (to take advantage of zone bit recording). They all have identical priorities, so the kernel will load-balance the swap. One spindle may give you 100 MB/s, but swap across all spindles could give you a multiple of that. (naïvely speaking)

Bottleneck-aware load balancing: in practice, however, there are other bottlenecks in place. For instance, a 16-disk server may have four 6 Gbps SATA ports, each with a four-port multiplier and four disks sharing the bandwidth. If you know about this, you can organise your swap spaces so that the first disk on each of ports 1–4 has the highest priority, the second disk on each port has the second highest priority, etc. This will load balance swapping but not overwhelm the port multipliers.

Swapping across devices with different performance: (as mentioned by Luke) if your system isn't a brand new server, and it's grown organically over the years, it may have block devices that are significantly faster than others. You'll want to swap to the fastest device first, then to the next fastest, etc.
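
The priority-based layouts above (fast zones first, identical priorities across spindles, port-aware tiers, fastest device first) all come down to giving each tier of swap areas its own priority. A minimal sketch, assuming you have already decided which devices belong to which tier: tiered_swapon is a hypothetical helper that only prints the swapon commands, and the step of 10 between tiers is arbitrary.

```shell
#!/bin/sh
# tiered_swapon: each argument is one tier, given as a space-separated list
# of swap devices. The first tier gets the highest priority; devices inside
# a tier share a priority, so the kernel round-robins between them.
# Hypothetical helper: it prints the commands instead of running them.
tiered_swapon() {
    prio=$(( $# * 10 ))
    for tier in "$@"; do
        for dev in $tier; do        # unquoted on purpose: split the tier into devices
            echo "swapon -p $prio $dev"
        done
        prio=$(( prio - 10 ))
    done
}
```

Calling tiered_swapon "/dev/sda1 /dev/sde1" "/dev/sdb1 /dev/sdf1" would put the first pair at priority 20 and the second pair at priority 10; the device names are placeholders.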

Size considerations: (courtesy of David Kohen) maybe putting all your swap on one drive leaves a few gigs free on the drive (this sounds like a 2001 scenario, but there are plenty of old or embedded devices where this could be an issue). Split it across all drives, and on top of all the other benefits above, you get better disk space usage per drive. It's one thing to lose a couple of gigs per spindle, and another to lose 300 gigs from one disk.

Emergencies: you have exactly 96 hours to submit your PhD thesis, and your last experiment (the one that's likely to get you that Nobel prize as well as funky mixed-case letters after your name) is sucking memory at impressive rates. You're almost out of swap. You create a swap file with a priority less than the priority of your main swap device — the kernel will use it as overflow swap space. You could even install swapd to do this for you automatically, so you'll also have plenty of swap space for those huge emacs and LaTeX runs.
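
For the overflow case, the only subtle part is picking a priority below that of every existing area. The sketch below assumes the swap file has already been created and formatted with mkswap, and that /proc/swaps has its usual five columns; overflow_swapon_cmd is a made-up name, and it only prints the command it would suggest. It relies on the man page rule quoted earlier: an area enabled without -p gets a default priority lower than any explicitly set one, and lower than older default-priority areas.

```shell
#!/bin/sh
# overflow_swapon_cmd FILE [SWAPS]: print a swapon command that gives FILE a
# lower priority than every area listed in SWAPS (default: /proc/swaps).
# Hypothetical helper, for illustration only.
overflow_swapon_cmd() {
    file=$1
    # Find the lowest priority currently in use (empty if there is no swap).
    min=$(awk 'NR > 1 && NF >= 5 {
                   if (!seen || $NF < min) { min = $NF + 0; seen = 1 }
               }
               END { if (seen) print min }' "${2:-/proc/swaps}")
    if [ -n "$min" ] && [ "$min" -gt 0 ]; then
        # Existing areas all have explicit priorities: go one below the lowest.
        echo "swapon -p $((min - 1)) $file"
    else
        # A default (negative) priority is already lower than the existing
        # areas, so no -p is needed.
        echo "swapon $file"
    fi
}
```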

Swapping across different media: Linux can't swap to character devices, but there are lots of different media, physical and virtual: SSDs (note: you probably don't want to swap on SSDs), dozens of shockingly different types of spinning hard disks, floppies (yes, you can swap on a floppy — you can always shoot yourself in the foot with Unix), DRBD volumes, iSCSI, LVM volumes, LUKS encrypted partitions, etc (including surreal, mind-boggling layered combinations of these — swap on LUKS on LVM on a parallel port ZIP drive over iSCSI over IEEE802.3ad aggregated Ethernet? No problem, you filthy pervert). These are niche scenarios, and are meant to support niche requirements.

Related Solutions

Linux – How to tweak the kernel for total swap out

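You can set the value of /proc/sys/vm/swappiness to control how readily the kernel swaps out process memory rather than dropping cached file data. Lower values make the kernel less willing to swap; a value of 0 tells it to avoid swapping for as long as it can, but it does not disable swap altogether.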

This can be done at runtime using either of:

echo 0 > /proc/sys/vm/swappiness
sysctl -w vm.swappiness=0

To keep the setting across reboots, store vm.swappiness = 0 in /etc/sysctl.conf.

Generally, using just a little swap is not a bad thing. Free memory can be used for caching data read from disk, and the system can plan ahead for a sudden need of lots of memory by an application.

When too many programs are swapped out, however, every program switch triggers a lot of disk activity, which really slows everything down. Before something can be used, it needs to be loaded back into memory.

Disk reads are horribly slow compared to memory access, as it takes significantly longer for the data to arrive. The system has to schedule the read among the other read/write requests, the drive has to seek to the right cylinder, and only then does it start slowly delivering data.

Hence, I think your logic is flawed. Generally, you want to keep programs running in memory, while still keeping enough room for sudden growth. Do not use the swap too often to "write things to disk", because it is neither a backup nor a performance improvement.

Older computers contained less memory and suffered from swapping problems as a result. When many programs were open at once, the system would slow down and you could hear the disk reading from and writing to the swap file.

CentOS – Pull All Process’s Swapped Memory Out of Swap

You can achieve the same result by using GDB's 'dump memory' command and have it write to /dev/null.

You just need to find the regions in /proc/$PID/smaps that need to be unswapped. Example from /proc/$PID/smaps:

02205000-05222000 rw-p 00000000 00:00 0 
Size:              49268 kB
Rss:               15792 kB
Pss:                9854 kB
Shared_Clean:          0 kB
Shared_Dirty:      11876 kB
Private_Clean:         0 kB
Private_Dirty:      3916 kB
Referenced:          564 kB
Anonymous:         15792 kB
AnonHugePages:         0 kB
Swap:              33276 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB

and then use --batch mode to execute the gdb command so you can use it in your function:

[root@nunya ~]# swapon -s ; gdb --batch --pid 33795 -ex "dump memory /dev/null 0x02205000 0x05222000" ;swapon -s
Filename                Type        Size    Used    Priority
/dev/sda2                               partition   7811068 7808096 -1

[Thread debugging using libthread_db enabled]

Filename                Type        Size    Used    Priority
/dev/sda2                               partition   7811068 7796012 -1
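
Finding the swapped regions by hand gets tedious, so the lookup can be scripted. The sketch below assumes the smaps layout shown above (a start-end header line followed by a Swap: line for each mapping); swapped_regions and gdb_unswap_cmd are made-up helper names, and the second one only prints the gdb command line instead of running it.

```shell
#!/bin/sh
# swapped_regions SMAPS: print "START END" (hex, without 0x) for every
# mapping in an smaps-formatted file whose Swap: value is above 0 kB.
swapped_regions() {
    awk '
        # A mapping header starts with "start-end" in hex.
        /^[0-9a-f]+-[0-9a-f]+ / { split($1, r, "-"); start = r[1]; end = r[2] }
        # The Swap: line belongs to the most recent header.
        $1 == "Swap:" && $2 > 0 { print start, end }
    ' "$1"
}

# gdb_unswap_cmd PID [SMAPS]: print one gdb command line that reads every
# swapped region of PID into /dev/null (SMAPS defaults to /proc/PID/smaps).
gdb_unswap_cmd() {
    pid=$1
    smaps=${2:-/proc/$pid/smaps}
    args=$(swapped_regions "$smaps" | while read -r start end; do
        printf ' -ex "dump memory /dev/null 0x%s 0x%s"' "$start" "$end"
    done)
    if [ -n "$args" ]; then
        echo "gdb --batch --pid $pid$args"
    fi
}
```

Run as root, gdb_unswap_cmd 33795 would print a command of the same shape as the one above, with one -ex clause per swapped region; review it before executing it.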
