Linux – Dynamically disabling cores in a power-efficient way

Tags: linux, multi-core, power-management

I am looking for a mechanism to dynamically disable cores in Linux in order to minimize power consumption.

Unfortunately, disabling cores using the following simple approach actually increases power, based on readings from a Watts Up? Pro measuring total system power:

echo 0 > /sys/devices/system/cpu/cpu7/online

My experience seems to be confirmed by others (although this bug has been marked "CLOSED PATCH_ALREADY_AVAILABLE"):
https://bugzilla.kernel.org/show_bug.cgi?id=5471

Since the machine is unloaded, I want all but one of the cores (or perhaps two "cores", since the CPU is hyper-threaded) to be in the deepest possible sleep state. This does not seem to be happening on its own, based on the output of acpitool:

Processor ID           : 7
Bus mastering control  : no
Power management       : yes
Throttling control     : no
Limit interface        : no
Active C-state         : C0
C-states (incl. C0)    : 3
Usage of state C1      : 899 (99.3 %)
Usage of state C2      : 6 (0.7 %)

BTW, one point of confusion for me is that acpitool and /proc/acpi seem to disagree about the available C-states, or perhaps they use different naming schemes.

$ cat /proc/acpi/processor/CPU7/power 
active state:            C0
max_cstate:              C8
maximum allowed latency: 2000000000 usec
states:
    C1:                  type[C1] promotion[--] demotion[--] latency[001] usage[00000000] duration[00000000000000000000]
    C2:                  type[C2] promotion[--] demotion[--] latency[017] usage[00001248] duration[00000000001877531423]
    C3:                  type[C3] promotion[--] demotion[--] latency[017] usage[00000006] duration[00000000000012580727]

This seems to indicate that there are 4 C-states (C0-C3), but acpitool only reports 3 C-states.
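One way to cross-check the two tools is to read the idle states that the kernel's cpuidle layer exposes in sysfs, under /sys/devices/system/cpu/cpuX/cpuidle/stateY/. The following is only a rough sketch: it assumes that directory exists on the kernel in question and that each state has the name, latency, usage and time attributes. The helper names (cpuidle_attr_path, read_cpuidle_attr, print_cpuidle_states) are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs path for one attribute of one idle state,
 * e.g. /sys/devices/system/cpu/cpu7/cpuidle/state2/latency.
 * Returns 0 on success, -1 if the path did not fit in buf. */
int cpuidle_attr_path(char *buf, size_t size, int cpu, int state, const char *attr)
{
    int n = snprintf(buf, size, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/%s",
                     cpu, state, attr);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Read the first line of an attribute into out (newline stripped).
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int read_cpuidle_attr(int cpu, int state, const char *attr, char *out, size_t size)
{
    char path[128];
    FILE *f;

    if (cpuidle_attr_path(path, sizeof(path), cpu, state, attr) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(out, (int)size, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    out[strcspn(out, "\n")] = '\0';
    return 0;
}

/* Print name, exit latency (us), entry count and residency (us) for
 * every idle state of one CPU; returns the number of states found. */
int print_cpuidle_states(int cpu)
{
    char name[64], latency[32], usage[32], time_us[32];
    int state = 0;

    while (read_cpuidle_attr(cpu, state, "name", name, sizeof(name)) == 0 &&
           read_cpuidle_attr(cpu, state, "latency", latency, sizeof(latency)) == 0 &&
           read_cpuidle_attr(cpu, state, "usage", usage, sizeof(usage)) == 0 &&
           read_cpuidle_attr(cpu, state, "time", time_us, sizeof(time_us)) == 0) {
        printf("state%d %-8s latency=%s us usage=%s time=%s us\n",
               state, name, latency, usage, time_us);
        state++;
    }
    return state;
}
```

Calling print_cpuidle_states(7) from a small main() would list whatever states the idle driver registered for CPU 7. Whether those names line up with acpitool's numbering is not something this sketch can guarantee.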


Really this boils down to two questions:

  1. Is there a (safe) way to force individual cores into a specific sleep state (C-state), and force them to remain there until I explicitly wake them up?
  2. Alternatively, how can I improve the ability of the OS to automatically put cores into deeper sleep states more consistently?

Note that the latency of waking up from deeper sleep states is not a concern. FWIW, I am running Ubuntu 10.04.3 (kernel 2.6.32-38) on an Intel i7 920.

Best Answer

The methods to limit C-states above will all be permanent (until the system is rebooted). If you want a system to have extremely low latency during certain hours but save more power at other times, there is a way to control dynamically which C-states are used.

To dynamically control C-states, open the file /dev/cpu_dma_latency and write the maximum allowable latency (in microseconds) to it. As long as the file is kept open, this prevents C-states with transition latencies higher than the specified value from being used. Writing a maximum allowable latency of 0 will keep the processors in C0 (like using the kernel parameter "idle=poll"), and writing a low value (usually 5 or lower) should force the processors to C1 when idle. The exact value needed to restrict processors to the C1 state depends on various factors such as which idle driver you are using, which CPUs you are using, and possibly the ACPI tables in your system.

Higher values can also be written to allow deeper C-states while still excluding those whose latency exceeds the value written. Choose the value against the latency values in /sys/devices/system/cpu/cpuX/cpuidle/stateY/latency (where X is the CPU number and Y is the idle state): CPU idle states that have a greater latency than the value written to /dev/cpu_dma_latency should not be used.
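To make the selection rule concrete, here is a small sketch of it as a plain function. It is a simplified model of the rule as stated here (a state is usable only if its latency does not exceed the value written), not the kernel's actual governor code, which may still pick a shallower state for other reasons. The function name deepest_allowed_state is hypothetical.

```c
#include <stddef.h>

/* Simplified model of the latency rule: a state is usable only if its
 * exit latency does not exceed the requested budget.
 * latencies_us[i] is the exit latency of idle state i, ordered from
 * shallowest to deepest. Returns the index of the deepest usable
 * state, or -1 if no state fits the budget. */
int deepest_allowed_state(const unsigned int *latencies_us, size_t count,
                          unsigned int budget_us)
{
    int deepest = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (latencies_us[i] <= budget_us)
            deepest = (int)i;
    }
    return deepest;
}
```

With the latencies from the /proc/acpi output in the question (1, 17 and 17 us for C1 to C3), a budget of 5 would leave only the first state usable, a budget of 17 or more would allow all three, and a budget of 0 would allow none of them, which matches the description of the CPU staying in C0.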

One simple way to do this is to compile a small program that writes to this file and keeps it open until the program is killed. An example of such a program is below; it can be compiled by pasting the code into a file called setcpulatency.c and running "make setcpulatency". So, to minimize latency during certain hours, say from 8AM until 5PM, a cron job could be set up to run at 8AM. This cron job could run setcpulatency in the background with an argument of 0, with a cron table entry like this:

00 08 * * * /path/to/setcpulatency 0 &

Then, at 5PM, another cron job could kill any program that’s holding /dev/cpu_dma_latency open:

00 17 * * * kill -9 `lsof -t /dev/cpu_dma_latency`

Of course, this is just an example to show how C-states can be dynamically controlled. The crond service is often disabled in low-latency environments, but these steps could be taken manually or run by other means.

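#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
   int32_t l;
   int fd;

   if (argc != 2) {
      fprintf(stderr, "Usage: %s <latency in us>\n", argv[0]);
      return 2;
   }

   l = atoi(argv[1]);
   printf("setting latency to %d us\n", l);

   /* The latency limit applies only while this descriptor stays open. */
   fd = open("/dev/cpu_dma_latency", O_WRONLY);

   if (fd < 0) {
      perror("open /dev/cpu_dma_latency");
      return 1;
   }

   /* The limit is written as a raw 32-bit integer, in microseconds. */
   if (write(fd, &l, sizeof(l)) != sizeof(l)) {
      perror("write to /dev/cpu_dma_latency");
      return 1;
   }

   /* Sleep until killed; exiting would close fd and drop the limit. */
   while (1)
      pause();
}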