Linux – Computer hard freezes randomly – possible hardware issue

freeze hardware-failure linux windows

Symptoms:

  • When the computer is running, it randomly freezes and requires a hard reset. This happens both under heavy load (games, compiling) and while just browsing the web.
  • The freezing happens on both of my operating systems: Win 8.1 x64 and Arch Linux x64.
  • One particular freeze on Win 8.1 happened while I was talking to a friend on TeamSpeak. After about 1 minute I was able to continue speaking to my friend, even though the screen was still frozen and nothing else seemed to work. Speaking through TeamSpeak was literally the only thing I could do; Alt-F4 wouldn't close TeamSpeak either. The screen stayed frozen the whole time.
  • After rebooting, sometimes nothing happens (the PC is on, though). I sometimes need to turn the PC off and on multiple times to get anything to display on the screen (the screen stays off otherwise, as if no signal were reaching it).

What I have tried:

  • Touching everything inside the case with the PC on. Tried touching RAM, GPU, SATA cables, etc. Nothing happened.
  • Checking temperatures during stress/non-stress periods. They seem fine.
  • Checking event logs/kernel logs. Nothing there.

Any idea what the problem could be? I suspect it is the GPU, because of the "TeamSpeak freeze". But honestly, I'm just shooting in the dark.

Info:

lspci -v, lshw

Best Answer

Since you have this problem on both operating systems, I'd recommend checking all the hardware step by step, especially the motherboard. I would check it in the following order:

  1. South bridge temperature. You can just touch it carefully. If it is hot enough to burn you, or close to it, that's a bad sign. (Thermal sensors can lie; your burnt finger won't.)
  2. Memtest, at least 4 passes. I have had cases where the first pass was OK and the second or third showed errors.
  3. MHDD, just in case there are bad blocks; these can freeze any system if they sit near core system files. That is unlikely to be your case, though.

Steps 2 and 3 should preferably be run from bootable media rather than from within the OS.
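As a complement to the finger test in step 1, on the Linux side you could also log whatever the kernel's hwmon sensors report, so that the last readings before a freeze are still on disk after the hard reset. The following is only a minimal sketch: the function names `read_hwmon_temps` and `log_temps` and the file name `temps.log` are made up for illustration, it assumes your board exposes sensors under `/sys/class/hwmon` (values there are in millidegrees Celsius), and it inherits whatever inaccuracy the sensors themselves have.

```python
import glob
import os
import time


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {label: degrees Celsius} for every readable hwmon temperature input."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(root, "hwmon*", "temp*_input"))):
        chip_dir = os.path.dirname(path)
        try:
            with open(os.path.join(chip_dir, "name")) as f:
                chip = f.read().strip()
        except OSError:
            chip = "unknown"
        try:
            with open(path) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # sensor not readable right now; skip it
        sensor = os.path.basename(path)[: -len("_input")]
        label = f"{os.path.basename(chip_dir)}:{chip}/{sensor}"
        temps[label] = millideg / 1000.0
    return temps


def log_temps(logfile, interval=5.0, samples=None, root="/sys/class/hwmon"):
    """Append one timestamped line per sample; samples=None means run until stopped."""
    count = 0
    with open(logfile, "a") as out:
        while samples is None or count < samples:
            temps = read_hwmon_temps(root)
            line = " ".join(f"{k}={v:.1f}" for k, v in sorted(temps.items()))
            out.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {line}\n")
            # Force the line to disk so it survives a hard freeze.
            out.flush()
            os.fsync(out.fileno())
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Calling `log_temps("temps.log")` in a terminal and leaving it running until the next freeze would leave the final readings in `temps.log`; a sharp rise just before the log stops would point toward overheating, while flat readings would make it less likely.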

If nothing reveals a faulty component, try swapping components one at a time (if you have enough spares to test with; buying parts just to test isn't a good idea, as it would likely cost as much as a whole new PC). AIDA64 also has a stress-test component; I have used it to find some issues and can recommend it too.

I would add that I had a similar problem long ago. After a long investigation I found the cause was a faulty Ethernet network card driver. In another case I had a laptop that beeped on boot to signal a RAM fault, and it was solved by replacing the HDD. Keep in mind that sometimes the problem hides in a totally different component than the one you suspect.
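Since a driver can turn out to be the culprit, it may be worth taking a second look at the kernel log from the boot that froze. The sketch below is one possible way to do that on Arch, assuming systemd with a persistent journal; `journalctl -k -b -1` prints the kernel messages of the previous boot. The function names and the pattern list are illustrative assumptions, not an exhaustive set of hardware-related messages, and after a hard freeze the very last messages may never have reached the disk.

```python
import re
import subprocess

# Illustrative patterns only; extend or trim them for your own hardware.
SUSPICIOUS = re.compile(
    r"machine check|hardware error|gpu hang|i/o error"
    r"|soft lockup|hard lockup|blocked for more than",
    re.IGNORECASE,
)


def suspicious_lines(log_text):
    """Return the log lines that match any of the patterns above."""
    return [line for line in log_text.splitlines() if SUSPICIOUS.search(line)]


def previous_boot_kernel_log():
    """Kernel messages from the previous boot (needs a persistent journal)."""
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

After rebooting from a freeze, `for line in suspicious_lines(previous_boot_kernel_log()): print(line)` would list any matches. An empty result does not clear the hardware; it only means nothing matching these patterns was written before the machine locked up.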

P.S. I'm sorry for the muddled answer; your question is too generic to give exact instructions.
