Linux – Is “writeback throttling” a solution to the “USB-stick stall problem”?

Tags: cache, linux

The pernicious USB-stick stall problem – LWN.net, 2013.

Plug a slow storage device (a USB stick, say, or a media player) into a Linux machine and write a lot of data to it. The entire system proceeds to just hang, possibly for minutes.

The article predicted a simple change to the kernel defaults. On 64-bit x86, the writeback cache was allowed to grow to 20% of system RAM by default. Linus suggested effectively limiting it to ~180MB on all platforms, mimicking a limitation of the 32-bit x86 code. However, current Linux (v4.18) does not include the suggested change. (Compare Linus's patch to the current function in v4.18.)
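For reference, these limits are exposed today as the vm.dirty_* sysctls. A minimal sketch of how one could emulate Linus's suggested cap on a stock kernel (the 180MB figure is his; the 45MB background threshold is my own illustrative choice):

    # Inspect the percent-of-RAM defaults (typically 20% / 10% on 64-bit x86):
    sysctl vm.dirty_ratio vm.dirty_background_ratio

    # Emulate Linus's suggested cap with absolute byte limits instead.
    # Setting a *_bytes sysctl automatically zeroes its *_ratio counterpart.
    sudo sysctl -w vm.dirty_bytes=$((180 * 1024 * 1024))
    sudo sysctl -w vm.dirty_background_bytes=$((45 * 1024 * 1024))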

The 2013 LWN article says the problem is "a storage equivalent to the bufferbloat problem". There is now a 2016 LWN article on a new Linux feature called writeback throttling (wbt / CONFIG_WBT / wbt_lat_usec). The article describes writeback throttling as a way to mitigate "a bufferbloat problem that mirrors the issues that have been seen in the networking stack". This sounds very similar :-).
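For concreteness, WBT is controlled per device through sysfs. A quick sketch of checking and tuning it (the device name and the 75ms target are illustrative):

    # 0 means writeback throttling is currently disabled for this device.
    cat /sys/block/sda/queue/wbt_lat_usec

    # Target a 75ms completion latency for background writeback;
    # writing -1 restores the kernel's default for this device.
    echo 75000 | sudo tee /sys/block/sda/queue/wbt_lat_usec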

Has the specific USB-stick stall problem now been solved?

Toward less-annoying background writeback – LWN.net, 2016

It's an experience many of us have had: write a bunch of data to a relatively slow block device, then try to get some other work done. In many cases, the system will slow to a crawl or even appear to freeze for a while; things do not recover until the bulk of the data has been written to the device. On a system with a lot of memory and a slow I/O device, getting things back to a workable state can take a long time, sometimes measured in minutes. Linux users are understandably unimpressed by this behavior pattern, but it has been stubbornly present for a long time. Now, perhaps, a new patch set will improve the situation.


Inspired by this question: System lags when doing large R/W operations on external disks

Best Answer

The problem is that the "USB-stick stall" article provides no evidence for its claim. There have been genuine "USB-stick stall" problems, and there continue to be some similar reports. However, the thread discussed by the LWN article is not one of them! Therefore we cannot cite the article as an example. Additionally, any explanations it gives must be flawed, or at least incomplete.

Why were "USB-stick stall" problems reported in 2013? Why wasn't this problem solved by the existing "No-I/O dirty throttling" code?

To summarize the linked answer:

The problem reported to linux-kernel did not involve the entire system hanging while cached writes were flushed to a USB stick. The initial report by Artem simply complained that Linux allowed a very large amount of cached writes to build up for a slow device, which could take "dozens of minutes" to finish writing out.

As you say, Linus's suggested "fix" has not been applied. Current kernel versions (v4.20 and below) still allow systems with large amounts of RAM to build up large amounts of cached writes in the page cache, which can take a long time to write out.

The kernel already had some code designed to avoid "USB-stick stalls": the "No-I/O dirty throttling" code, which was also described on LWN, in 2011. It throttles write() calls to control both the size of the overall writeback cache and the proportion of the writeback cache used for the specific backing device. This is a complex engineered system, which has been tweaked over time; I am sure it has some limitations, but so far I have not been able to quantify any. There have also been various bugfixes outside the dirty throttling code, for issues that prevented it from working.
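The per-device proportion mentioned above can even be tuned from userspace, through the backing_dev_info (bdi) sysfs interface. A sketch, assuming the slow device is /dev/sdb with device numbers 8:16 (both illustrative):

    # Per-device writeback limits live under /sys/class/bdi/<major:minor>/.
    ls -l /dev/sdb    # note the major:minor pair, e.g. "8, 16"

    # Cap this device at 1% of the global dirty limit (1% is illustrative).
    echo 1 | sudo tee /sys/class/bdi/8:16/max_ratio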

WBT limits the number of submitted IO requests for each individual device. It does not limit the writeback cache, i.e. the dirty page cache.
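One way to see the distinction is to watch the dirty-page counters while copying a large file; a sketch:

    # Watch the cached-write backlog during a large copy. WBT does not cap
    # these numbers; it only limits the IO already submitted to the device.
    watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'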

Artem posted a follow-up report that writing 10GB to a server's internal disk caused the system to hang, or at least to suffer extremely long delays in responding. That is consistent with the problem that WBT aims to address.


Sidenotes kept from previous versions of this answer:

The scenario described for WBT is writing a large batch of data to your main disk while, at the same time, wanting to keep using that disk interactively, to load programs and so on.

In contrast, when people talk about a "USB-stick stall" problem, they mean writing a large batch of data to a different disk (an external USB stick, say) and then suffering surprising delays in programs that have nothing to do with that disk. Example:

"Even things as simple as moving windows around could stutter... It wasn't CPU load, because ssh sessions to remote machines were perfectly responsive; instead it seemed that anything that might vaguely come near doing filesystem IO was extensively delayed."

The 2013 mailing-list thread about the USB-stick problem mentioned per-device limits on the dirty page cache as a possibility for future work.

WBT does not work with the CFQ or BFQ IO schedulers. Debian and Fedora use CFQ by default, so WBT will not help for USB sticks (or spinning hard drives) unless you have some special configuration.
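To check which scheduler a device is using, and switch it so WBT can take effect, something like this should work (device name illustrative):

    # The active scheduler is the name shown in brackets:
    cat /sys/block/sdb/queue/scheduler    # e.g. "noop [cfq] deadline"

    # Switch to deadline, which does not conflict with WBT:
    echo deadline | sudo tee /sys/block/sdb/queue/scheduler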

Traditionally, CFQ has been the scheduler chosen to work well with spinning hard drives. I'm not entirely sure where this leaves WBT. Maybe the main advantage of WBT is for SSDs, which are faster than spinning hard drives but too slow to treat like RAM?

Or maybe it's an argument for using the deadline scheduler instead and forgoing the CFQ features. Ubuntu switched to deadline in version 14.04, but switched back to CFQ in version 17.04 (zesty). (I think CentOS 7.0 is too old to have WBT, but it claims to use CFQ for SATA drives and deadline for all other drives. CentOS 7.0 also supports NVMe drives, but shows only "none" as their scheduler.)