Windows – Unexpected shutdown with heavy single threaded process and single core affinity

python shutdown threads windows

I'm not sure what is causing the problem, but it has happened three times in two weeks under similar conditions.

I checked with the laptop's support desk and they had me run several tests to see whether my machine was overheating, but there were no signs of that. So, here is the problem:

I sometimes run heavy CPU-bound programs in Python and, when I'm not using multiprocessing, I usually set the process affinity to one of my 4 "cores" (Core i5: 2 physical cores, 4 threads via SMT) and set the priority to "High".
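For readers who want to reproduce this setup from inside the script rather than through Task Manager, here is a minimal sketch. It assumes the process pins itself through the Win32 calls `SetProcessAffinityMask` and `SetPriorityClass` via `ctypes`; the helper names `affinity_mask` and `pin_current_process` are made up for this illustration and are not part of my original code. On non-Windows systems the sketch only computes the mask and changes nothing.

```python
import sys

# Priority-class values as defined by the Win32 API.
NORMAL_PRIORITY_CLASS = 0x00000020
HIGH_PRIORITY_CLASS = 0x00000080
REALTIME_PRIORITY_CLASS = 0x00000100


def affinity_mask(cores):
    """Build a bitmask with one bit set per logical core index."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


def pin_current_process(cores, priority_class=HIGH_PRIORITY_CLASS):
    """Pin this process to the given logical cores and set its priority.

    Hypothetical helper: applies the settings only on Windows and
    returns the affinity mask it computed.
    """
    mask = affinity_mask(cores)
    if sys.platform != "win32":
        return mask  # nothing is applied on other platforms

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    # The mask parameter is a DWORD_PTR, i.e. pointer-sized.
    kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = wintypes.BOOL
    kernel32.SetPriorityClass.argtypes = [wintypes.HANDLE, wintypes.DWORD]
    kernel32.SetPriorityClass.restype = wintypes.BOOL

    handle = kernel32.GetCurrentProcess()
    if not kernel32.SetProcessAffinityMask(handle, mask):
        raise ctypes.WinError(ctypes.get_last_error())
    if not kernel32.SetPriorityClass(handle, priority_class):
        raise ctypes.WinError(ctypes.get_last_error())
    return mask
```

Calling `pin_current_process([0])` at the top of the script would correspond to ticking only the first logical core in Task Manager and choosing "High". Passing `REALTIME_PRIORITY_CLASS` instead would match the "Real Time" setting, although Windows may silently fall back to High if the process lacks the required privilege.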

The first two times it happened, the computer had been running the heavy task for more than 24 hours when it unexpectedly turned off. I was browsing or doing other stuff and everything was gone. The machine was somewhat hotter than usual, but not really hot.

When I turned it on again, it behaved as if nothing had happened… Just "Windows is starting" or something like that. Not a single message about the failure!

The third time it happened, the process had been set to "Real Time" after running for 5 minutes on Normal priority. The computer turned off about 5 seconds after the task finished running (it was a quick one).

It wasn't even hot!!

When I turned it on again and tried to reproduce the error, I couldn't. So maybe the laptop needs to stay on for at least a day or two before the problem shows up…

Now some funny things:

  • It always happens when I'm also using the laptop for other tasks.
  • Programs like Blender, rendering 3D images for 2 days in a row using all 4 threads, don't crash it. I also tried HandBrake converting videos on High priority and using all threads.

Some thoughts:

  • I live in a very hot place (30°C to 40°C now that it's summer), but it also happened with the air conditioning running.
  • When running on a single thread, Intel Turbo Boost raises the active core's frequency from 2.67 GHz to 2.93 GHz.

So, what should I do? Are there other tests I could run to see if there's a problem with the CPU? Should I rule out overheating even if I don't feel the laptop getting much hotter?

Best Answer

Your problems are most probably caused by dried-out thermal paste on your CPU and/or GPU. This happens to any laptop after a couple of years, especially when it runs heavy tasks for extended periods of time, as in your case.

I have experienced exactly the same problem once every couple of years on my Sony Vaio. Each time, it was solved with a couple of drops of thermal paste and a can of compressed air.

Buy new thermal paste, a can of compressed air for blowing the dust off all components (especially the ventilation ports) and a bottle of petrol for cleaning the dried-out remains of the old thermal paste off the CPU/GPU. You can google for details, and there's a good chance you will find a complete disassembly video for your laptop model on YouTube. It shouldn't cost more than $20, and you will end up with a laptop that runs much cooler, quieter and faster, without crashes.
