Linux – Why does high disk I/O reduce system responsiveness/performance

Tags: io, kernel, linux, performance

I never quite understood why high disk I/O slows the system down so much. It seems strange to me: I would expect the slow-down to affect only the processes that depend on data from the hard/optical drive, yet it affects even code and data already loaded into RAM. I'm referring here to iowait.

Why does the processor wait instead of doing other work? Can anyone explain this limitation, and why it hasn't been solved in the Linux kernel? Is there a kernel out there that doesn't have this problem?

[note] There has been some progress in this area: recent kernels (2.6.37 in my case) are much more responsive.

Best Answer

Operating systems use virtual memory so that processes can address more memory than the machine has physical RAM. When the kernel decides it has a better use for a physical memory page, the page's contents may be "paged out" to disk. When a virtual memory page that has been paged out is accessed, it generates a page fault and is moved back from disk to RAM.
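You can actually watch this from userspace. Below is a minimal sketch (not part of the original answer, Linux-specific) that uses mincore(2) to ask the kernel which pages of a mapping are currently resident in RAM; any page reported as non-resident would trigger a page fault on access:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t npages = 16;
    size_t len = npages * page;

    /* Anonymous mapping: the pages exist virtually, but no physical
     * RAM is committed until they are touched. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    buf[0] = 1;              /* touch only the first page */

    unsigned char vec[16];   /* one status byte per page */
    if (mincore(buf, len, vec) != 0) { perror("mincore"); return 1; }

    for (size_t i = 0; i < npages; i++)
        printf("page %2zu: %s\n", i,
               (vec[i] & 1) ? "resident in RAM"
                            : "not resident (access would fault)");

    munmap(buf, len);
    return 0;
}
```

Running it shows only the touched page as resident; the same mechanism is what reports a paged-out page as non-resident.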

Page faults are a disaster for performance because disk latency is measured in milliseconds, while RAM latency is measured in nanoseconds. (1 millisecond = a million nanoseconds!)
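To make the gap concrete, here is a rough sketch that contrasts a RAM access with a disk read that bypasses the page cache via O_DIRECT. It assumes Linux and a file path you supply (at least 4 KiB, on a filesystem that supports O_DIRECT); the RAM figure is dominated by timer overhead, but it still sits orders of magnitude below the disk read:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static long long ns_between(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000000LL + (b.tv_nsec - a.tv_nsec);
}

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    struct timespec t0, t1;

    /* RAM: read from a buffer that is already resident. */
    volatile char mem[4096];
    mem[0] = 1;                         /* fault the page in first */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    char c = mem[1234];
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("RAM access: ~%lld ns\n", ns_between(t0, t1));

    /* Disk: O_DIRECT skips the page cache, so the block really comes
     * from the device. It requires an aligned buffer and size. */
    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0) return 1;
    int fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (read(fd, buf, 4096) < 0) { perror("read"); return 1; }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("Disk read:  ~%lld ns\n", ns_between(t0, t1));

    (void)c;
    close(fd);
    free(buf);
    return 0;
}
```

On a rotating disk the second number is typically in the millions of nanoseconds, which is exactly why a page fault that goes to disk stalls a process for what is, in CPU terms, an eternity.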

Memory is not only used by user processes, but also by the kernel for things like file system caching. During file system activity, the kernel will cache recently used data. The assumption is that there is a good chance that the same data will be used again shortly, so caching should improve I/O performance.
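That cache is visible in /proc/meminfo: the "Cached" field grows as files are read and shrinks under memory pressure. A small sketch (standard Linux field names, nothing hypothetical) that prints the relevant lines:

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[256];
    while (fgets(line, sizeof line, f)) {
        /* Show free memory alongside the kernel's file caches. */
        if (strncmp(line, "MemFree:", 8) == 0 ||
            strncmp(line, "Buffers:", 8) == 0 ||
            strncmp(line, "Cached:", 7) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}
```

Run it before and after copying a large file and you can watch "MemFree" fall as "Cached" rises.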

Physical memory being used for the file system cache cannot be used for processes, so during heavy file system activity more process memory will be paged out and page faults will increase. On top of that, less disk I/O bandwidth is available for moving memory pages to and from the disk. As a result, processes may stall.
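One mitigation for latency-sensitive processes is to pin their memory so it can never be paged out. A hedged sketch using mlockall(2); note it needs CAP_IPC_LOCK or a sufficiently large RLIMIT_MEMLOCK, and it trades flexibility for predictability:

```c
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Lock all current and future pages of this process into RAM,
     * so I/O-induced memory pressure cannot page them out. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");   /* typically EPERM without privileges */
        return 1;
    }
    printf("process memory locked: no page-outs, no disk-backed faults\n");

    /* ... latency-critical work would run here ... */

    munlockall();
    return 0;
}
```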
