Because a tar archive stores files sequentially in its output, there is no way to parallelize the process unless you make more than one archive.
Note that the bottleneck of the operation would likely be the hard drive. For that reason, even if you did split the task into two or more processes, it would not go faster unless they operated on different drives.
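If you do want to try the "more than one archive" route, here is a minimal sketch using Python's standard tarfile module. The function names and the job list are hypothetical, made up for illustration. Threads are used because the work is mostly waiting on the disk; as said above, don't expect a speedup unless the sources sit on different drives.

```python
import tarfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def make_archive(archive_path, source_dir):
    """Write one uncompressed tar archive containing source_dir."""
    with tarfile.open(archive_path, "w") as tar:
        # Store entries relative to the directory name, not the full path.
        tar.add(source_dir, arcname=Path(source_dir).name)
    return archive_path


def make_archives_in_parallel(jobs, max_workers=2):
    """jobs is a list of (archive_path, source_dir) pairs, one archive each.

    Each archive is still written sequentially; only separate archives
    are built concurrently.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(make_archive, dst, src) for dst, src in jobs]
        # Collect results in submission order; re-raises any worker error.
        return [f.result() for f in futures]
```

You would call it with something like `make_archives_in_parallel([("a.tar", "/mnt/disk1/a"), ("b.tar", "/mnt/disk2/b")])`, where the paths are placeholders for directories on different physical drives.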
This answer is mostly speculative, because I know nothing about power management on Intel processors and I haven't looked at the Linux code, but I think it's plausible.
I think derobert's explanation about power management is what's going on. Power management is a compromise between power consumption and performance. When the processor isn't being used at 100% of its peak performance, it's beneficial to reduce its frequency, which makes it slower but cooler.
Linux varies the CPU frequency over time. How it does so is controlled by policies called governors. The general idea is that when the system hasn't used the CPU at its full performance for a while, it reduces the CPU frequency. Conversely, if the CPU is continuously busy for a while, the kernel increases the frequency.
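To see which governor is active and what frequency a core is currently running at, you can read the cpufreq files under sysfs. This is only a sketch: the function name `read_cpufreq` is mine, and it assumes the usual `/sys/devices/system/cpu/cpuN/cpufreq/` layout, which may be absent (for instance in many virtual machines and containers).

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu=0, root=CPUFREQ_ROOT):
    """Return the governor and frequencies (in kHz) for one CPU.

    Returns None when the cpufreq files are missing or unreadable.
    """
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    try:
        return {
            "governor": (base / "scaling_governor").read_text().strip(),
            "current_khz": int((base / "scaling_cur_freq").read_text()),
            "max_khz": int((base / "scaling_max_freq").read_text()),
        }
    except (OSError, ValueError):
        return None  # no cpufreq driver exposed for this CPU
```

Calling `read_cpufreq(0)` a few times while the machine is idle and again under load should show `current_khz` moving, if the governor in use scales the frequency at all.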
Seeing intel_idle scheduled means that the core isn't executing code but is in fact in a “suspended” mode where it consumes little power. This brings bigger power savings than merely reducing the frequency, but at a higher cost: although the CPU wakes back up when an interrupt occurs, this takes some time (tens of microseconds? more?).
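Rather than guessing the wake-up cost, you can ask the kernel: the cpuidle sysfs directory lists each idle state with its exit latency and the total time spent in it, both in microseconds. The sketch below assumes the usual `/sys/devices/system/cpu/cpuN/cpuidle/stateM/` layout; the function name `read_idle_states` is hypothetical.

```python
from pathlib import Path


def read_idle_states(cpu=0, root="/sys/devices/system/cpu"):
    """List (name, exit_latency_us, total_time_us) for each idle state."""
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    if not base.is_dir():
        return []  # cpuidle not exposed on this system
    # Keep only stateN directories and sort them numerically (state2 < state10).
    state_dirs = [p for p in base.glob("state*") if p.name[5:].isdigit()]
    state_dirs.sort(key=lambda p: int(p.name[5:]))
    states = []
    for state_dir in state_dirs:
        try:
            name = (state_dir / "name").read_text().strip()
            latency_us = int((state_dir / "latency").read_text())
            time_us = int((state_dir / "time").read_text())
        except (OSError, ValueError):
            continue  # skip states with missing or malformed files
        states.append((name, latency_us, time_us))
    return states
```

Deeper states generally report a larger exit latency, which is the cost mentioned above. The exact state names and numbers depend on the processor and the idle driver.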
It's perfectly normal to see intel_idle when you aren't fully utilizing all your cores. This saves a lot of power (both for the processor itself and for cooling devices) compared with having the CPU at full speed all the time. The only reason you might have to disable this mechanism is if you need very low latency. If you run a CPU-intensive application, you'll see less and less intel_idle. Making use of the CPU idle mode doesn't cost performance except during the transition period where the kernel hasn't fully decided yet that the system needs a lot of CPU power.
If you fully saturate your cores, you'll reach 0% intel_idle. Note that it might be difficult to saturate all the cores (though a specially designed benchmark can do it): when all the code and data being executed doesn't fit in the CPU cache, the limiting factor is the RAM access speed rather than the CPU. “All the code and data” includes everything that's running on the machine, including your user interface; in practice, saturating all the cores is rare.
Best Answer
You see a high system load because tar spends a lot of time waiting for I/O. You see a low CPU usage because tar uses very little CPU time: it's mostly just copying some bytes when the disk delivers them. Linux includes time waiting for I/O in the load average (unlike many other Unix variants), but not in a process's CPU time. (Source: https://linuxtechsupport.blogspot.com/2008/10/what-exactly-is-load-average.html via Wikipedia)
There's nothing to be worried about. You asked the computer to do an I/O-bound operation and it's busy doing some I/O. Business as expected.
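If you want to check this on your own machine, one way is to count processes by state. On Linux, processes in uninterruptible sleep (state D, typically waiting on disk I/O) count toward the load average even though they use no CPU. The sketch below reads `/proc/<pid>/stat`; the function names are made up for this example.

```python
from collections import Counter
from pathlib import Path


def parse_stat_state(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so locate the last ')' and take the field right after it.
    return stat_text[stat_text.rindex(")") + 2]


def count_process_states(proc_root="/proc"):
    """Count processes by state letter, e.g. R (running), D (uninterruptible)."""
    counts = Counter()
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            counts[parse_stat_state((entry / "stat").read_text())] += 1
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
    return counts
```

While tar is running, `count_process_states()` will often show tar in state D, and `os.getloadavg()` will be elevated, while top still reports little CPU usage for it.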