Ubuntu – Kernel 4.18.0.11.12 trouble: Video black screen

drivers grub2 kernel nvidia video

I've had a comedy of video errors that all trace back to a kernel upgrade in Ubuntu 18.10 cosmic. The kernel 4.18.0.11.12 causes trouble, well, in every conceivable way and I can't quite understand why it works for anybody.

If you get a "black screen of death" at various phases of startup, my suggestion is to resist advice to fiddle a lot with the video or display manager configuration. Instead, boot an older kernel first and see whether the problems disappear. The hardest part for most users will be figuring out how to make Ubuntu show a GRUB menu so you can choose a kernel (they've made that tricky, but there are instructions: https://wiki.ubuntu.com/RecoveryMode).
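Before rebooting, it helps to know which kernels are actually installed, so you know what to look for in the GRUB menu. The following is a minimal sketch, assuming Ubuntu's usual layout where each kernel image is stored as /boot/vmlinuz-<version>. The function name list_kernels is only an illustrative helper, not a standard tool.

```shell
# List the kernel versions found in a boot directory, oldest first.
# Assumption: kernel images are named vmlinuz-<version> (Ubuntu's convention).
list_kernels() {
    boot_dir="${1:-/boot}"
    for image in "$boot_dir"/vmlinuz-*; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$image" ] || continue
        basename "$image" | sed 's/^vmlinuz-//'
    done | sort -V
}
```

Running `list_kernels` with no argument reads /boot. Compare the result with `uname -r` to see which kernel is running now. On Ubuntu the older entries normally appear under "Advanced options for Ubuntu" in the GRUB menu.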

The symptoms of the problem were:

  1. Black screen of death with message about rejection of PKCS keys
  2. Unable to start display manager (gdm3 or lightdm)
  3. External monitors not recognized (by desktop programs or xrandr)
  4. Black screen of death on resume from suspend.

This is a Dell Precision 5510 laptop with Nvidia and Intel graphics:

    *-display
         description: 3D controller
         product: GM107GLM [Quadro M1000M]
         vendor: NVIDIA Corporation
         physical id: 0
         bus info: pci@0000:01:00.0
         version: a2
         width: 64 bits
         clock: 33MHz
         capabilities: bus_master cap_list rom
         configuration: driver=nouveau latency=0
         resources: irq:125 memory:dc000000-dcffffff memory:b0000000-bfffffff memory:c0000000-c1ffffff ioport:e000

    *-display
         description: VGA compatible controller
         product: HD Graphics 530
         vendor: Intel Corporation
         physical id: 2
         bus info: pci@0000:00:02.0
         version: 06
         width: 64 bits
         clock: 33MHz
         capabilities: vga_controller bus_master cap_list rom
         configuration: driver=i915 latency=0
         resources: irq:126 memory:db000000-dbffffff memory:70000000-7fffffff ioport:f000(size=64) memory:c0000-dffff
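The listing above has the shape of `lshw -C display` output (an assumption; the post does not name the tool). For this kind of troubleshooting the interesting part is the `driver=` entry on each configuration line, because it shows whether nouveau, nvidia or i915 is actually bound to each GPU. A small sketch that pulls out just that pairing; gpu_drivers is a made-up helper name:

```shell
# Read lshw-style text on stdin and print "<product>: <driver>" per display.
# Assumption: the input uses "product:" and "configuration: driver=..." lines
# as in the listing above.
gpu_drivers() {
    awk '
        /^[ \t]*product:/ {
            sub(/^[ \t]*product:[ \t]*/, "")
            product = $0
        }
        /configuration:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^driver=/) {
                    sub(/^driver=/, "", $i)
                    print product ": " $i
                }
        }
    '
}
```

A typical call would be `sudo lshw -C display | gpu_drivers`. Re-running it after each driver change is a quick way to confirm which driver the kernel really loaded.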

I did not realize that the kernel update was at the heart of my trouble. I chased a lot of symptoms that pointed first at the display manager (gdm3 or lightdm), then at the Nvidia drivers, then at modesetting. In the end I concluded that this particular kernel causes problems that I'm not able to solve, so I'm back to running 4.18.0-10.

The first symptom I saw was a failure to start. After the grub phase, I had black screen freeze-up with the message

PKCS#7 signature not signed with a trusted key

The system did not respond to Ctrl-Alt-F2 or similar, so no virtual terminal (VT) was available. Comments in this forum pointed the finger at the Nvidia drivers. In retrospect, this may be a gdm3 flaw rather than an Nvidia one (see "Ubuntu 18.04 Boot hangs at PKCS#7 signature not signed with a trusted key" and "Ubutnu 18.04 – after upgrade – Display/PKCS#7 signature error"). I never solved it.

I used the recovery login to remove the Nvidia drivers and move xorg.conf out of the way, intending to run with the Intel driver instead.

After that, with gdm3 as the display manager, I got a black screen with just a single "_" showing at the top left. One suggestion was that gdm3 was trying to launch a Wayland session. I tried the fix to disable Wayland (see "gdm3 does not start in ubuntu 18.04"), but it did not help. Those posts suggest there is a way to make gdm3 work, but more posts recommend using lightdm instead.

So I changed the display manager to lightdm. Even after that, I had the black screen of death after GRUB, and I found advice about adding nomodeset or restrictions on nouveau to the GRUB kernel command line. After a lot of fiddling, the system would reach the login window, but two problems remained: the video would not resume after suspend (although I could still log in over SSH), and external monitors were not detected (probably because of all the steps I'd taken to disable modesetting).

Not realizing that the modesetting changes were probably why the external monitors were ignored, I re-installed the Nvidia drivers (hoping the PKCS key issue would somehow resolve itself). That resulted in a black screen of death at startup, but Ctrl-Alt-F2 did give me a VT, so I could look at dmesg.

After a lot of restarts, I finally decided to try the older kernel, 4.18.0-10, and with it lightdm would start and suspend would work. To get external monitors working, I had to remove all of the nomodeset-style settings I'd put into the GRUB config (and re-run update-grub). Also, the Nvidia file /lib/modprobe.d/nvidia-kms.conf had to be put back to:

# This file was generated by nvidia-prime
# Set value to 0 to disable modesetting
options nvidia-drm modeset=1

and the initramfs had to be regenerated (update-initramfs -u).
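Leftover modesetting overrides are easy to forget once several "fixes" have been tried. As a rough sketch, this lists any such tokens still present on the kernel command line settings in the GRUB defaults file. The helper name modeset_overrides is hypothetical, and the pattern only covers `nomodeset` and `<driver>.modeset=0`, which is an assumption about what was added.

```shell
# Print modesetting overrides found in the GRUB_CMDLINE_* settings of a
# GRUB defaults file (default: /etc/default/grub). Commented-out lines
# are ignored because they do not start with GRUB_CMDLINE_LINUX.
modeset_overrides() {
    grub_file="${1:-/etc/default/grub}"
    grep -E '^GRUB_CMDLINE_LINUX(_DEFAULT)?=' "$grub_file" |
        grep -oE '(nomodeset|[a-z0-9_-]+\.modeset=0)' |
        sort -u
}
```

If this prints anything you no longer want, edit /etc/default/grub, then run `sudo update-grub` and `sudo update-initramfs -u` and reboot.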

I have found many sites and posts about video problems and they may offer helpful advice, but none of them helped me with kernel 4.18.0.11.12. I spent a couple of days chasing those fixes, but wish I had just tested the older kernel first.

My suggestion is that if a kernel update arrives and you start to have black screens or other video problems, hold off on troubleshooting them in the usual way. Run the older kernel first to see if it works, and if it does, use it and be happy until the new kernel's problems get worked out by the experts.

The only "video fix" that I've learned is absolutely, for sure, valuable is to remove 2 lines from /etc/environment. I don't know for sure how these lines got in there, but it happened in a previous version of Ubuntu (either through a package like gnome-wobbly-windows or through my manual effort to fix screen tearing in 2017). The last 2 lines in /etc/environment, the ones about CLUTTER, need to be commented out (or deleted entirely):

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
## CLUTTER_PAINT=disable-clipped-redraws:disable-culling
## CLUTTER_VBLANK=True

If you don't make that change, then video inside Ubuntu Gnome desktop with nvidia drivers is fragmented and choppy. Those 2 lines were, so far as I can tell, inserted by some special-effects packages in older Ubuntu (maybe 17.10) and they were causing lots of trouble for me only with Gnome, but not with XFCE4. Because the problem appeared only in Gnome, I knew it was not an Nvidia problem. (Graphics problem on Ubuntu 18.04 – blurred text + screen flickering)
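To check whether a system still has live CLUTTER settings, something like the following sketch can be used. It only reports assignments that are not commented out; the helper name active_clutter_vars is illustrative.

```shell
# Print any CLUTTER_* assignments that are still active in an environment
# file (default: /etc/environment). Lines starting with "#" do not match.
# Exit status is non-zero when nothing is found, as with plain grep.
active_clutter_vars() {
    env_file="${1:-/etc/environment}"
    grep -E '^[[:space:]]*CLUTTER_[A-Z_]+=' "$env_file"
}
```

No output means the lines are already disabled or absent. Changes to /etc/environment take effect at the next login.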

About troubleshooting with the new kernel: I'm willing to try again, but I want somebody to give advice about whether it is useful to even try and what is the best way to go about it.

Best Answer

I have one avenue to fix the gdm "black screen" problem. Please see the post on the Nvidia Linux forum: https://devtalk.nvidia.com/default/topic/1044730/linux/x-displays-in-a-small-section-of-screen-something-in-my-grub-setup-wrong-

I made a number of changes before we understood the fundamental problem. At some point the gdm package had created the user gdm with an incorrect uid and had put that user's home folder in /home/gdm. To fix this, I had to force-remove the gdm package, then manually delete the gdm user from /etc/passwd and /etc/group. Upon re-install, I got a new gdm user with a uid below 1000 and no folder in /home/gdm. That change, by itself, may correct the problem. A gdm user whose uid (in /etc/passwd) is greater than 1000 is for sure a cause of the black screen of death.
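A quick way to see whether a machine is affected is to look at the gdm entry in /etc/passwd. This is a minimal sketch, assuming the account is named gdm and that regular user ids start at 1000 (Ubuntu's default UID_MIN); check_gdm_user is a made-up helper name.

```shell
# Report the uid and home directory of the gdm account in a passwd-format
# file (default: /etc/passwd) and flag a uid in the regular-user range.
# Assumptions: the account is called "gdm"; regular uids start at 1000.
check_gdm_user() {
    passwd_file="${1:-/etc/passwd}"
    awk -F: '$1 == "gdm" {
        status = ($3 < 1000) ? "ok" : "suspect"
        print "uid=" $3 " home=" $6 " " status
    }' "$passwd_file"
}
```

A "suspect" result, or a home directory under /home, matches the broken state described above. No output means there is no gdm account in that file at all.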

However, I made other changes. With the newest Ubuntu kernel, I am sure that the following boot parameters are needed to prevent the black screen of death upon resume from suspend. The Nvidia forum post details the steps I took. I have a block in the /etc/default/grub file like so:

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=menu
## GRUB_TIMEOUT=0
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="nosplash"
GRUB_CMDLINE_LINUX="nouveau.blacklist=1 acpi_rev_override=1 acpi_osi=Linux  nouveau.modeset=0 pcie_aspm=force drm.vblankoffdelay=1 scsi_mod.use_blk_mq=1 nouveau.runpm=0 mem_sleep_default=deep"

There is a kernel bug that necessitates this (according to the Nvidia forum).
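Editing /etc/default/grub has no effect until `update-grub` has been run and the machine rebooted, so it is worth confirming that the running kernel really received the parameters. A small sketch that reports which expected parameters are absent; missing_params is a hypothetical helper, and the file to inspect is passed explicitly (normally /proc/cmdline).

```shell
# Usage: missing_params <cmdline-file> <param>...
# Prints each parameter that does not appear as a whole token in the file.
missing_params() {
    cmdline_file="$1"
    shift
    for param in "$@"; do
        # Pad with spaces (and turn the trailing newline into one) so that
        # only complete tokens match.
        case " $(tr '\n' ' ' < "$cmdline_file")" in
            *" $param "*) ;;
            *) echo "$param" ;;
        esac
    done
}
```

For example, `missing_params /proc/cmdline nouveau.modeset=0 nouveau.runpm=0 mem_sleep_default=deep` prints nothing when all three are active on the running kernel.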

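Second, in the gdm3 configuration (the [daemon] section, which on Ubuntu normally lives in /etc/gdm3/custom.conf rather than under /etc/default/), it had been suggested to set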

[daemon]
WaylandEnable=false

This was needed to encourage gdm to use X11 rather than Wayland. However, after testing today, I find gdm3 starts whether or not I have that setting.
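To see whether the setting is actually in effect on a given machine, a check along these lines may help. It is only a sketch: it assumes the gdm3 configuration file is /etc/gdm3/custom.conf (the usual Ubuntu location) and it does not verify that the line sits inside the [daemon] section. The name wayland_disabled is illustrative.

```shell
# Succeed (exit 0) if an uncommented "WaylandEnable=false" line exists in
# the given gdm3 config file. Assumed default path: /etc/gdm3/custom.conf.
# Note: this does not check which section the line belongs to.
wayland_disabled() {
    conf_file="${1:-/etc/gdm3/custom.conf}"
    grep -Eq '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false' "$conf_file"
}
```

Usage could be `wayland_disabled && echo "gdm3 is told to use X11"`. After logging in, `echo "$XDG_SESSION_TYPE"` shows whether the session ended up as x11 or wayland.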

Third, I'm running nvidia-driver-410. I believe this will also work with nvidia-driver-390 or nvidia-driver-415. However, after the gdm fix worked, I stopped plugging in alternative drivers.

The nvidia drivers are installed from the PPA:

$  cat graphics-drivers-ubuntu-ppa-cosmic.list
deb http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu cosmic main
# deb-src http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu cosmic main

EDIT 2018-12-07: I forgot to mention this additional change:

In /lib/modprobe.d/nvidia-kms.conf, I turned off KMS modesetting:

$ cat /lib/modprobe.d/nvidia-kms.conf
# This file was generated by nvidia-prime
# Set value to 0 to disable modesetting
options nvidia-drm modeset=0

I believe that was an important step in dealing with the problem that the laptop screen was not "filled up" by the X11 display.
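Because this file has been flipped between modeset=1 and modeset=0 over the course of the troubleshooting, it helps to be able to read the configured value back quickly. A minimal sketch, with nvidia_drm_modeset as a made-up helper name:

```shell
# Print the modeset value (0 or 1) configured for nvidia-drm in a modprobe
# config file (default: /lib/modprobe.d/nvidia-kms.conf). Comment lines are
# skipped because they do not start with "options".
nvidia_drm_modeset() {
    conf_file="${1:-/lib/modprobe.d/nvidia-kms.conf}"
    sed -n 's/^[[:space:]]*options[[:space:]][[:space:]]*nvidia-drm[[:space:]].*modeset=\([01]\).*/\1/p' "$conf_file"
}
```

This shows what is configured on disk, not what the loaded module is using. After `sudo update-initramfs -u` and a reboot, the value the loaded module is using should be readable from /sys/module/nvidia_drm/parameters/modeset (shown as Y or N), assuming the nvidia-drm module is loaded.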
