Linux – What page replacement algorithms are used in Linux kernel for OS file cache

linuxmemory

Linux uses the unused portions of memory for file caching, and it cleans up the space when needed.

My question is about how it picks a victim page for replacement?
There are various algorithms (LRU, FIFO, LFU and random replacement)

I'd like to know
1) What page replacement algorithms are used in Linux kernel for OS file cache?

2) If possible, I'd like to know how it have evolved over time in Linux kernel. I assume that its algorithm and implementation might change overtime considering "reasonable" changes in trends. How can I find those? Do I need to read kernel source codes?

Best Answer

Linux memory management ("MM") does seem to be a bit arcane and challenging to track down.

Linux literature makes heavy mention of LRU (Least Recently Used), in the context of memory management. I haven't noticed any of the other terms being mentioned.

I found an interesting introduction (first four paragraphs) in this article on the incomparable LWN.net. It explains how basic LRU can be implemented in practice for virtual memory. Read it.

True LFU (Least Frequently Used) replacement is not considered practical for virtual memory. The kernel can't count every single read of a page, when mmap() is used to access file cache pages - e.g. this is how most programs are loaded in memory. The performance overhead would be far too high.


To go beyond that simple concept, there is a design outline around Linux version 2.6.28-32 here:

http://linux-mm.org/PageReplacementDesign

It suggests Clock-PRO is used for file pages. There is an original paper available on it. There is an old description of Clock-PRO on LWN.net, again including some practical implementation details. Apparently Clock-PRO "attempts to move beyond the LRU approach", a variant of which is "used in most systems". It seems to put more weight on frequency.

Linux-mm has another design document for implementing Clock-PRO in Linux. This one doesn't talk about it being merged in; it was written a few months before the LWN article about it. http://linux-mm.org/ClockProApproximation

More recent descriptions are that Linux merely "uses some ideas" from Clock-PRO, and is actually a "mash of a number of different algorithms with a number of modifications for catching corner cases and various optimisations".

The above quote was answered a question by Adrian McMenamin. McMenamin went on to complete an MSc project in 2011, testing modifications to Linux page replacement based on the "working set model". It includes a brief description of Linux page replacement. The "variant of LRU" is named as "the 2Q [two-queue] approach for database management", a number of references are provided, and there is a diagram illustrating movement between the two queues and other state transitions. He also describes Linux as using a partial implementation of CLOCK-PRO.

I expect the LRU concept, as opposed to the other possibilities you mention, was established from the start. And the most significant change was the introduction of Clock-PRO based features, i.e. putting some more weight on frequency.

In 2013 Linux gained "thrash detection-based file cache sizing". This is probably also relevant to the question.