Linux VM Freeze – Fix Frequent Crashes on Proxmox VE

Tags: crash, docker, freeze, linux, proxmox

I have Proxmox VE installed on a small form factor PC (NUC-like), with the following specifications:

  • CPU: Intel Pentium Silver N6005
  • RAM: 16GiB (2 x Lexar 8GiB) DDR4-3200MHz
  • Boot drive: SAMSUNG SSD 830 256GB
  • Additional drive: SAMSUNG 980 500GB NVMe SSD

I am using the system as a home lab, where I have installed the following:

  • 2 Linux Containers (LXC) hosting Pi-hole and PiVPN
  • 1 virtual machine hosting pfSense
  • 4 virtual machines (2 Ubuntu 22.04 and 2 Debian 11) running various applications in Docker, for example Traefik, a Duck DNS client, Jellyfin, Home Assistant, and a couple of personal websites

The Linux Containers and the pfSense VM are rock-solid; they do not crash or freeze. The other 4 virtual machines are not dependable: they freeze and hang at seemingly random times, and I have to restart them manually using the Proxmox VE "Hard Reset" button.

This happens anywhere from a couple of times a day to a couple of times a week, and I have not been able to find a pattern. The only thing these machines have in common is Docker. However, I have previously run Docker on a virtual machine (Ubuntu 20.04 on Hyper-V) without any trouble.

To clarify:

  • When I say "freeze", I mean I cannot log into the machine using SSH or the machine's Console in Proxmox VE. The Console is not responsive and I cannot type or interact with it in any way.
  • When I say "crash", I mean I see a wall of text when I open the machine's Console in Proxmox VE. The text looks like gibberish to me, and I cannot type or interact with the Console in any way.

I do not know enough about troubleshooting Linux to know what is going on. Hence, I am here asking for any help that can point me in the right direction to figuring out what is happening.

Best Answer

I have exactly the same problem. I used to run my home lab on a Ryzen 5600X and never had any crashes. Six months ago, however, I moved everything to a NUC 11 Essential with a Celeron N4505, and the VMs running Docker started to crash very often.

After searching online, I found only one other post (link here) describing the same problem, and it seems that the kernel has an issue with these CPUs (the poster there has the same processor as you). The suggestion was that a newer kernel should work better with them. I installed the edge kernel 6.1 and then 6.2 on Proxmox, and the crashes dropped from every day to once every week or two.

Two weeks ago, I updated Proxmox to version 8, and things looked much better until this morning, when the VM running Firezone and Docker crashed again. I have not been able to find a real solution, and I think I will move back to a Ryzen CPU.
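Since the kernel version seems to matter here, it may help to confirm which kernel the Proxmox host (or a guest) is actually running after an upgrade. The following is only a minimal Python sketch: the helper names `parse_kernel_version` and `is_at_least` are made up for illustration, and the 6.2 threshold is an assumption taken from the newer of the two edge kernels mentioned above, not a confirmed fix.

```python
import platform
import re


def parse_kernel_version(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.2.16-3-pve'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(release: str, minimum: tuple[int, int] = (6, 2)) -> bool:
    """Return True if the release is at or above the given (major, minor).

    The default of (6, 2) is an assumption based on the kernels tried above.
    """
    return parse_kernel_version(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the release string of the running kernel.
    release = platform.release()
    try:
        status = "at or above 6.2" if is_at_least(release) else "older than 6.2"
    except ValueError:
        status = "in a format this sketch does not recognize"
    print(f"Running kernel {release} is {status}")
```

Running it on the host before and after a kernel change shows whether the new kernel was really booted, which is easy to miss if the machine was not restarted.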
