Yesterday I was trying to compile the ROOT package from source. Since I was compiling it on a 6-core monster machine, I decided to build on multiple cores with make -j 6. The compile went smoothly and really fast at first, but at some point make hung, using 100% CPU on just one core.
I did some googling and found this post on the ROOT message boards. Since I built this computer myself, I was worried that I hadn't properly applied the heatsink and the CPU was overheating or something. Unfortunately, I don't have a fridge here at work that I can stick it in. 😉
I installed the lm-sensors package and ran make -j 6 again, this time monitoring the CPU temperature. Although it got high (close to 60 °C), it never went past the high or critical temperature.
I tried running make -j 4, but again make hung sometime during the compile, this time at a different spot.
In the end, I compiled by just running make, and it worked fine. My question is: why was it hanging? Since it stopped at two different spots, I would guess it was due to some sort of race condition, but I would think make should be clever enough to get everything in the right order, since it offers the -j option.
Best Answer
I don't have an answer to this precise issue, but I can try to give you a hint of what may be happening: Missing dependencies in Makefiles.
Example:
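A minimal sketch of a Makefile with this kind of problem. The file names follow the ones used in this answer; COMPILE and LINK are hypothetical stand-ins (plain cat here, so the sketch can actually be run once a.source and b.source exist), not a real toolchain:

```makefile
# Hypothetical stand-ins for a real compiler and linker:
# cat simply concatenates its input files.
COMPILE = cat
LINK = cat

# The final target needs both intermediate files.
target: a.bytecode b.bytecode
	$(LINK) a.bytecode b.bytecode > target

a.bytecode: a.source
	$(COMPILE) a.source > a.bytecode

# Missing prerequisite: the recipe reads a.bytecode,
# but the rule only declares b.source.
b.bytecode: b.source
	$(COMPILE) b.source a.bytecode > b.bytecode
```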
If you call make target, everything will compile correctly. Compilation of a.source is performed (arbitrarily, but deterministically) first. Then compilation of b.source is performed.
But if you call make -j2 target, both compile commands will be run in parallel, and you'll notice that your Makefile's dependencies are broken. The second compile assumes a.bytecode is already compiled, but a.bytecode does not appear in its dependencies, so an error is likely to happen. The correct dependency line for b.bytecode should also list a.bytecode as a prerequisite.
To come back to your problem: if you are not lucky, it's possible for a command to hang in a 100% CPU loop because of a missing dependency. That's probably what is happening here: the missing dependency couldn't be revealed by a sequential build, but it has been revealed by your parallel build.
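For the toy example in this answer, a repaired rule might look like the sketch below. COMPILE is again a hypothetical stand-in (plain cat), not a real compiler; the only point is that a.bytecode now appears as a prerequisite:

```makefile
# Hypothetical stand-in for a real compiler.
COMPILE = cat

# a.bytecode is now declared, so make -j2 waits for it
# before starting this recipe.
b.bytecode: b.source a.bytecode
	$(COMPILE) b.source a.bytecode > b.bytecode
```

With the prerequisite declared, make knows it has to finish building a.bytecode before it starts this recipe, regardless of how many jobs -j allows.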