Kernel – Why is kworker Consuming Many Resources on Linux 3.0.0-12-server?

cpukernelloadperformance

Last Friday I upgraded my Ubuntu server to 11.10, which now runs with a 3.0.0-12-server kernel. Since then the overall performance has dropped dramatically. Before the upgrade the system load was about 0.3, but currently it is at 22-30 on an 8 core CPU system with 16GB of RAM (10GB free, no swap used).

I was going to blame the BTRFS file system driver and the underlaying MD array, because [md1_raid1] and [btrfs-transacti] consumed a lot of resources. But all the [kworker/*:*] consume a lot more.

sar has outputted something similar to this constantly since Friday:

11:25:01        CPU     %user     %nice   %system   %iowait    %steal     %idle 
11:35:01        all      1,55      0,00     70,98      8,99      0,00     18,48 
11:45:01        all      1,51      0,00     68,29     10,67      0,00     19,53 
11:55:01        all      1,40      0,00     65,52     13,53      0,00     19,55 
12:05:01        all      0,95      0,00     66,23     10,73      0,00     22,10 

And iostat confirms a very poor write rate:

sda             129,26      3059,12       614,31  258226022   51855269          
sdb              98,78        24,28      3495,05    2049471  295023077          
md1             191,96       202,63       611,95   17104003   51656068          
md0               0,01         0,02         0,00       1980        109          

The question is: How can I track down why the kworker threads consume so many resources (and which one)? Or better: Is this a known issue with the 3.0 kernel, and can I tweak it with kernel parameters?

Edit:

I updated the Kernel to the brand new version 3.1 as recommended by the BTRFS developers. But unfortunately this didn't change anything.

Best Answer

I found this thread on lkml that answers your question a little. (It seems even Linus himself was puzzled as to how to find out the origin of those threads.)

Basically, there are two ways of doing this:

$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
(wait a few secs)

For this you will need ftrace to be compiled in your kernel, and to enable it with:

mount -t debugfs nodev /sys/kernel/debug

More information on the function tracer facilities of Linux is available in the ftrace.txt documentation.

This will output what threads are all doing, and is useful for tracing multiple small jobs.

cat /proc/THE_OFFENDING_KWORKER/stack

This will output the stack of a single thread doing a lot of work. It may allow you to find out what caused this specific thread to hog the CPU (for example). THE_OFFENDING_KWORKER is the pid of the kworker in the process list.

Related Question