Second video card has no output

nvidia, opensuse, xorg

I am trying to pin down why GPU passthrough is not working with my Nvidia GTX 750 Ti, so I am trying to start a second X instance on that video card – the main one is a GTX 1070.

I am connecting them to the same monitor – the 1070 via DisplayPort, the 750 Ti via HDMI.

Booting Windows, both cards are detected and activated, and I get output on both DisplayPort and HDMI.

When booting Linux, the 1070 works without issue. The 750 Ti is detected:

# nvidia-smi -L
GPU 0: GeForce GTX 1070 (UUID: GPU-a66c5cbb-a541-a3d7-845c-f8c0c021ae71)
GPU 1: GeForce GTX 750 Ti (UUID: GPU-db546e26-f6d5-5345-45e4-434e0bfb4f62)

and in the nvidia-settings program it is shown as connected to the monitor.

However, when starting up a second Xorg instance, I get no output on the HDMI port.

The command I use is

sudo Xorg :2 vt8 -config xorg-second.conf  -configdir conf.d

where conf.d is an empty directory (to make sure no other settings are applied) and xorg-second.conf is pretty much standard, except for

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    BusID          "PCI:3:0:0"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "Coolbits" "4"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

to make sure that the right video card is picked up.
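
For reference, the BusID can be cross-checked with lspci – lspci prints the bus number in hexadecimal, while the xorg.conf BusID is decimal, which only matters for buses above 9:

$ lspci | grep -i nvidia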

I have tried switching cables, but that did not help.

Why is the second video card not sending output to my monitor and how can I fix it?


Edit: Here's the xrandr output for both X instances:

$ xrandr --display :1
Screen 0: minimum 8 x 8, current 2560 x 1440, maximum 32767 x 32767
DVI-D-0 disconnected (normal left inverted right x axis y axis)
HDMI-0 disconnected (normal left inverted right x axis y axis)
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 disconnected (normal left inverted right x axis y axis)
DP-4 connected primary 2560x1440+0+0 (normal left inverted right x axis y axis) 553mm x 311mm
   2560x1440     59.95*+
   2048x1152     60.00  
   1920x1200     59.88  
   1920x1080     60.00    59.94    50.00    29.97    25.00    23.97    60.05    60.00    50.04  
   1680x1050     59.95  
   1600x1200     60.00  
   1280x1024     75.02    60.02  
   1280x720      60.00    59.94    50.00  
   1200x960      59.90  
   1152x864      75.00  
   1024x768      75.03    60.00  
   800x600       75.00    60.32  
   720x576       50.00  
   720x480       59.94  
   640x480       75.00    59.94    59.93  
DP-5 disconnected (normal left inverted right x axis y axis)

$ xrandr --display :2
Screen 0: minimum 8 x 8, current 2560 x 1440, maximum 16384 x 16384
DVI-I-0 disconnected primary (normal left inverted right x axis y axis)
DVI-I-1 disconnected (normal left inverted right x axis y axis)
HDMI-0 connected 2560x1440+0+0 (normal left inverted right x axis y axis) 553mm x 311mm
   2560x1440     59.95*+
   2048x1152     60.00  
   1920x1200     59.88  
   1920x1080     60.00    59.94    50.00    29.97    25.00    23.97    60.05    60.00    50.04  
   1680x1050     59.95  
   1600x1200     60.00  
   1280x1024     75.02    60.02  
   1280x720      60.00    59.94    50.00  
   1200x960      60.00  
   1152x864      75.00  
   1024x768      75.03    60.00  
   800x600       75.00    60.32  
   720x576       50.00  
   720x480       59.94  
   640x480       75.00    59.94    59.93  
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)

Interestingly, when I run xrandr --display :2 a second time, it hangs. The final lines of the strace output are:

socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X2"}, 20) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X2"}, [124->20]) = 0
uname({sysname="Linux", nodename="mars", ...}) = 0
access("/run/user/1000/gdm/Xauthority", R_OK) = 0
open("/run/user/1000/gdm/Xauthority", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0700, st_size=96, ...}) = 0
read(4, "\1\0\0\4mars\0\0\0\22MIT-MAGIC-COOKIE-1\0\20"..., 4096) = 96
close(4)                                = 0
getsockname(3, {sa_family=AF_UNIX}, [124->2]) = 0
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
poll([{fd=3, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{iov_base="l\0\v\0\0\0\22\0\20\0\0\0", iov_len=12}, {iov_base="", iov_len=0}, {iov_base="MIT-MAGIC-COOKIE-1", iov_len=18}, {iov_base="\0\0", iov_len=2}, {iov_base="\36\271\266\234:\323(\237\35y\334(X\37\32\10", iov_len=16}, {iov_base="", iov_len=0}], 6) = 48
recvfrom(3, 0x18dd330, 8, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}], 1, -1

Edit 2

$ xset -display :2 q    (just the DPMS part)

DPMS (Energy Star):
  Standby: 600    Suspend: 600    Off: 600
  DPMS is Enabled
  Monitor is On
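
So DPMS reports the monitor as on. Just to rule out blanking explicitly, the screensaver and DPMS could also be forced off for that display, e.g.:

$ xset -display :2 s off -dpms
$ xset -display :2 dpms force on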

Xorg log – https://pastebin.com/fK7g5VSd

Best Answer

In the log, you can see that the server at :2 also detects the main graphics card, the GTX 1070 (GP104-A) at PCI:1:0:0 (GPU-1). That doesn't happen with regular X drivers: if you tell the driver in an xorg.conf to use only PCI:3:0:0, it will use only that card and never see any other.

So the only explanation I have is that, because the closed-source Nvidia drivers use a different infrastructure (a unified kernel driver that is very similar on Windows and Linux), they are simply not made to handle this kind of situation, or they handle it differently. As long as you use the closed-source drivers, it may well be that a single instance of the kernel driver is supposed to drive all available cards, and that's it – and nobody really tested connecting several X servers to that single instance (after all, for multiple screens Nvidia only provides its own "TwinView"). Not to speak of using one kernel driver for one card inside a VM and another kernel driver outside the VM.

And if you can't use the nouveau drivers, there's really no way around it.

You can try the Nvidia-specific options for driver 375.39, for example setting ProbeAllGpus to FALSE for both servers, as sketched below. Maybe that helps, maybe it doesn't. Possibly MultiGPU helps, though I think it is meant for a different situation.
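
A minimal sketch of how that could look in the Device section of each server's config (untested here; the option name is as documented in the 375.xx README):

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    BusID          "PCI:3:0:0"
    # use PCI:1:0:0 in the main server's config
    Option         "ProbeAllGpus" "false"
EndSection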

You can also try ConnectedMonitor or UseDisplayDevice to restrict which display device the driver uses.
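
For example, in the Screen section of the second server (the "DFP-1" name below is only a placeholder – the real display device names are listed in the Xorg log or in nvidia-settings):

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Option         "UseDisplayDevice" "DFP-1"
    Option         "ConnectedMonitor" "DFP-1"
EndSection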

The way to test this theory would be to use two Nvidia cards that are also supported by Nouveau, and then see if one can make the Nouveau driver work in this way. Unfortunately, I don't have the hardware to do that.
