Login Loop – After Upgrading to 4.4.0-116 Kernel on Ubuntu 16.04

Tags: 16.04, graphics, kernel, nvidia, upgrade

I can't log in to the desktop environment after apt upgrade && reboot: after I enter the password, the screen flickers to black and then returns to the login screen. Logging in on a text console (Ctrl+Alt+F1) works fine.

/var/log/Xorg.0.log says:

(EE) NVIDIA: Failed to initialize the NVIDIA kernel module. Please see the
(EE) NVIDIA: system's kernel log for additional error messages and
(EE) NVIDIA: consult the NVIDIA README for details.
(EE) No devices detected.

dmesg says:

nvidia: version magic '4.4.0-116-generic SMP mod_unload modversions ' should be '4.4.0-116-generic SMP mod_unload modversions retpoline '
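
The difference between the two strings in that message can be found mechanically by comparing them token by token. A minimal sketch, assuming a POSIX shell; vermagic_missing is a hypothetical helper name, not part of any existing tool:

```shell
# vermagic_missing HAVE WANT
# Print every space-separated token of WANT that does not occur in HAVE.
vermagic_missing() {
    have=" $1 "
    # $2 is left unquoted on purpose so the shell splits it into tokens.
    for token in $2; do
        case "$have" in
            *" $token "*) ;;                 # token present in the module's vermagic
            *) printf '%s\n' "$token" ;;     # token the kernel expects but the module lacks
        esac
    done
}

# Usage with the two strings from the dmesg line above:
# vermagic_missing '4.4.0-116-generic SMP mod_unload modversions ' \
#                  '4.4.0-116-generic SMP mod_unload modversions retpoline '
```

For the strings quoted above, the only missing token is retpoline.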

Trying to load the nvidia driver manually fails:

$ sudo modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': Exec format error

Related: VirtualBox not starting after kernel upgrade

Best Answer

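The cause is a gcc version that doesn't support retpoline (see: What is a retpoline and how does it work?). The 4.4.0-116 kernel expects modules to carry the retpoline tag in their vermagic, so a module that DKMS built with such a compiler is rejected at load time. See the Ubuntu bug: 4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and 16.04).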

In my case, purging ppa:ubuntu-toolchain-r/test to restore the default gcc version and then rebuilding the nvidia module with DKMS (by reinstalling the 4.4.0-116 kernel) fixed the problem. See the instructions posted by @cjjefcoat on the bug tracker.

Restore the default gcc by purging the version from ppa:ubuntu-toolchain-r/test:

$ sudo apt-get install ppa-purge
$ sudo ppa-purge ppa:ubuntu-toolchain-r/test

The gcc version (on Ubuntu 16.04) with retpoline support:

$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
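
The version string alone does not prove retpoline support, so it can help to ask the compiler directly. A hedged sketch: it assumes the kernel build relies on the gcc options -mindirect-branch=thunk-extern and -mindirect-branch-register for retpoline (the options used by upstream x86 kernels; this post does not confirm them for the 4.4.0-116 build). gcc_supports_retpoline is a hypothetical helper name:

```shell
# gcc_supports_retpoline [CC]
# Succeed if the compiler CC (default: gcc) accepts the retpoline options.
gcc_supports_retpoline() {
    cc=${1:-gcc}
    out=$(mktemp) || return 1
    # Compile an empty C file; an unsupported option makes the compiler fail.
    if "$cc" -Werror -mindirect-branch=thunk-extern -mindirect-branch-register \
            -x c -c /dev/null -o "$out" >/dev/null 2>&1; then
        rc=0
    else
        rc=1
    fi
    rm -f "$out"
    return "$rc"
}

# Usage:
# if gcc_supports_retpoline; then echo "retpoline: supported"; else echo "retpoline: NOT supported"; fi
```

A missing compiler is also reported as "not supported", since the function only looks at the exit status.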

Reinstall the kernel:

$ sudo apt-get purge linux-headers-4.4.0-116 linux-headers-4.4.0-116-generic linux-image-4.4.0-116-generic linux-image-extra-4.4.0-116-generic linux-signed-image-4.4.0-116-generic
$ sudo apt-get install linux-generic linux-signed-generic

Check the nvidia module:

$ modinfo nvidia_xxx -k 4.4.0-116-generic | grep vermagic
vermagic:       4.4.0-116-generic SMP mod_unload modversions retpoline 

Replace _xxx with your driver version (press TAB after typing modinfo nvidia to complete it). The output should contain retpoline.

After that, the reboot completed successfully.


If you already have a compatible gcc version, you can rebuild the nvidia module with the dkms command, without reinstalling the kernel:

# dkms remove nvidia-xxx/yyy.zzz -k 4.4.0-116-generic
# dkms install nvidia-xxx/yyy.zzz -k 4.4.0-116-generic
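
The nvidia-xxx/yyy.zzz placeholder can be filled in from the output of dkms status. A minimal sketch, assuming the comma-separated format "name, version, kernel, arch: state" printed by the dkms shipped with Ubuntu 16.04 (other dkms versions may format this differently); dkms_nvidia_specs is a hypothetical helper name:

```shell
# dkms_nvidia_specs
# Read `dkms status` lines on stdin and print "name/version" for nvidia entries.
dkms_nvidia_specs() {
    # Fields are separated by ", "; field 1 is the module name, field 2 the version.
    awk -F', ' '$1 ~ /^nvidia/ { print $1 "/" $2 }'
}

# Usage:
# dkms status | dkms_nvidia_specs
```

The printed name/version pair is what dkms remove and dkms install expect as their module argument.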

I decided to reinstall the kernel instead, so that every module DKMS had built with the wrong gcc version was rebuilt.
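
To see which modules are affected, the vermagic of every module file under a directory can be checked. A sketch under stated assumptions: DKMS-built modules are assumed to live in /lib/modules/4.4.0-116-generic/updates/dkms (the usual location on Ubuntu, not confirmed by this post), and list_non_retpoline_modules is a hypothetical helper name:

```shell
# list_non_retpoline_modules DIR
# Print each .ko file under DIR whose vermagic lacks the "retpoline" token.
# Files that modinfo cannot read are printed as well, because their vermagic is empty.
list_non_retpoline_modules() {
    find "$1" -type f -name '*.ko' 2>/dev/null | while IFS= read -r ko; do
        magic=$(modinfo -F vermagic "$ko" 2>/dev/null)
        case " $magic " in
            *" retpoline "*) ;;              # built with a retpoline-capable gcc
            *) printf '%s\n' "$ko" ;;        # needs a rebuild
        esac
    done
}

# Usage:
# list_non_retpoline_modules /lib/modules/4.4.0-116-generic/updates/dkms
```

Empty output means every module found under that directory carries the retpoline tag.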
