KVM GPU Passthrough – Group 15 IOMMU Issue

driversgpukvmnvidiavirtualization

I followed https://mathiashueber.com/windows-virtual-machine-gpu-passthrough-ubuntu/. However, there's just one thing I did not follow: I left noveau instead of the official driver, because if I do as it says, when I reboot I only see black screen. And also I want to use noveau on the host instead of a proprietary and possibly insecure driver.

I have a Ryzen 7 2700X on a Gigabyte B450m motherboard. I have a GTX 1060 which I want to put inside a VM and a GT 750 to use in the host.

AMD-Vi working:

lz@z:~$ dmesg |grep AMD-Vi
[    0.327637] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.330500] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.330501] pci 0000:00:00.2: AMD-Vi: Extended features (0xf77ef22294ada):
[    0.330504] AMD-Vi: Interrupt remapping enabled
[    0.330505] AMD-Vi: Virtual APIC enabled
[    0.330572] AMD-Vi: Lazy IO/TLB flushing enabled

Here are my IOMMU groups:

IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 10 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 11 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU Group 11 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 12 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU Group 12 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU Group 12 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU Group 12 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU Group 12 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU Group 12 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU Group 12 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU Group 12 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU Group 13 01:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:2263] (rev 03)
IOMMU Group 14 02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01)
IOMMU Group 14 02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01)
IOMMU Group 14 02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01)
IOMMU Group 14 03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01)
IOMMU Group 14 05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
IOMMU Group 14 06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0f02] (rev a1)
IOMMU Group 14 06:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1)
>>>>>>>>>>>>>>> IOMMU Group 15 07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
IOMMU Group 16 08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
IOMMU Group 17 08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU Group 18 08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f]
IOMMU Group 19 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
IOMMU Group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 20 09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 21 09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU Group 2 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 3 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 4 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 6 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 7 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 8 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 9 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]

You can see that my GTX1060 is in group 15, along with other things which I don't care, they can go inside the VM too. The USB controller for example.

Soi I have 10de:2184 (GTX 1060) and 10de:1aeb (GTX Audio). Do I need to save the IDs of the other things in group 15? Im gonna try to do with all of them, so I save 10de:1aec (USB) and 10de:1aed (Serial BUS)

lz@z:~$ cat /etc/initramfs-tools/modules 
# List of modules that you want to include in your initramfs.
# They will be loaded at boot time in the order below.
#
# Syntax:  module_name [args ...]
#
# You must run update-initramfs(8) to effect this change.
#
# Examples:
#
# raid1
# sd_mod
vfio vfio_iommu_type1 vfio_virqfd vfio_pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed

and

lz@z:~$ cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

vfio vfio_iommu_type1 vfio_pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed

and

lz@z:~$ cat /etc/modprobe.d/vfio.conf 
options vfio-pci ids=10de:2184,10de:1aeb,10de:1aec,10de:1aed

and

lz@z:~$ cat /etc/modprobe.d/kvm.conf 
options kvm ignore_msrs=1

Now take a look at my lspci after reboot:

lz@z:~$ lspci -nnv
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex [1022:1450]
    Flags: fast devsel

00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) I/O Memory Management Unit [1022:1451]
    Flags: fast devsel, IRQ 25
    Capabilities: <access denied>

00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 26
    Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: f7600000-f76fffff [size=1M]
    Prefetchable memory behind bridge: None
    Capabilities: <access denied>
    Kernel driver in use: pcieport

00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 27
    Bus: primary=00, secondary=02, subordinate=06, sec-latency=0
    I/O behind bridge: 0000d000-0000efff [size=8K]
    Memory behind bridge: f4000000-f53fffff [size=20M]
    Prefetchable memory behind bridge: 00000000e8000000-00000000f21fffff [size=162M]
    Capabilities: <access denied>
    Kernel driver in use: pcieport

00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Bus: primary=00, secondary=07, subordinate=07, sec-latency=0
    I/O behind bridge: 0000f000-0000ffff [size=4K]
    Memory behind bridge: f6000000-f70fffff [size=17M]
    Prefetchable memory behind bridge: 00000000d0000000-00000000e20fffff [size=289M]
    Capabilities: <access denied>
    Kernel driver in use: pcieport

00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 29
    Bus: primary=00, secondary=08, subordinate=08, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: f7200000-f74fffff [size=3M]
    Prefetchable memory behind bridge: None
    Capabilities: <access denied>
    Kernel driver in use: pcieport

00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge [1022:1452]
    Flags: fast devsel

00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454] (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 31
    Bus: primary=00, secondary=09, subordinate=09, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: f7500000-f75fffff [size=1M]
    Prefetchable memory behind bridge: None
    Capabilities: <access denied>
    Kernel driver in use: pcieport

00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
    Subsystem: Gigabyte Technology Co., Ltd FCH SMBus Controller [1458:5001]
    Flags: 66MHz, medium devsel
    Kernel modules: i2c_piix4, sp5100_tco

00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
    Subsystem: Gigabyte Technology Co., Ltd FCH LPC Bridge [1458:5001]
    Flags: bus master, 66MHz, medium devsel, latency 0

00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
    Flags: fast devsel

00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
    Flags: fast devsel

00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
    Flags: fast devsel

00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
    Flags: fast devsel
    Kernel driver in use: k10temp
    Kernel modules: k10temp

00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
    Flags: fast devsel

00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
    Flags: fast devsel

00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
    Flags: fast devsel

00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
    Flags: fast devsel

01:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. Device [2646:2263] (rev 03) (prog-if 02 [NVM Express])
    Subsystem: Kingston Technology Company, Inc. Device [2646:2263]
    Flags: bus master, fast devsel, latency 0, IRQ 60, NUMA node 0
    Memory at f7600000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: nvme
    Kernel modules: nvme

02:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset USB 3.1 XHCI Controller [1022:43d5] (rev 01) (prog-if 30 [XHCI])
    Subsystem: ASMedia Technology Inc. 400 Series Chipset USB 3.1 XHCI Controller [1b21:1142]
    Flags: bus master, fast devsel, latency 0, IRQ 30
    Memory at f53a0000 (64-bit, non-prefetchable) [size=32K]
    Capabilities: <access denied>
    Kernel driver in use: xhci_hcd

02:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset SATA Controller [1022:43c8] (rev 01) (prog-if 01 [AHCI 1.0])
    Subsystem: ASMedia Technology Inc. 400 Series Chipset SATA Controller [1b21:1062]
    Flags: bus master, fast devsel, latency 0, IRQ 59
    Memory at f5380000 (32-bit, non-prefetchable) [size=128K]
    Expansion ROM at f5300000 [disabled] [size=512K]
    Capabilities: <access denied>
    Kernel driver in use: ahci
    Kernel modules: ahci

02:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Bridge [1022:43c6] (rev 01) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 33
    Bus: primary=02, secondary=03, subordinate=06, sec-latency=0
    I/O behind bridge: 0000d000-0000efff [size=8K]
    Memory behind bridge: f4000000-f52fffff [size=19M]
    Prefetchable memory behind bridge: 00000000e8000000-00000000f21fffff [size=162M]
    Capabilities: <access denied>
    Kernel driver in use: pcieport

03:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
    DeviceName: Broadcom 5762
    Flags: bus master, fast devsel, latency 0, IRQ 34
    Bus: primary=03, secondary=04, subordinate=04, sec-latency=0
    I/O behind bridge: None
    Memory behind bridge: None
    Prefetchable memory behind bridge: None
    Capabilities: <access denied>
    Kernel driver in use: pcieport

03:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 36
    Bus: primary=03, secondary=05, subordinate=05, sec-latency=0
    I/O behind bridge: 0000e000-0000efff [size=4K]
    Memory behind bridge: f5200000-f52fffff [size=1M]
    Prefetchable memory behind bridge: 00000000f2100000-00000000f21fffff [size=1M]
    Capabilities: <access denied>
    Kernel driver in use: pcieport

03:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 400 Series Chipset PCIe Port [1022:43c7] (rev 01) (prog-if 00 [Normal decode])
    Flags: bus master, fast devsel, latency 0, IRQ 37
    Bus: primary=03, secondary=06, subordinate=06, sec-latency=0
    I/O behind bridge: 0000d000-0000dfff [size=4K]
    Memory behind bridge: f4000000-f50fffff [size=17M]
    Prefetchable memory behind bridge: 00000000e8000000-00000000f1ffffff [size=160M]
    Capabilities: <access denied>
    Kernel driver in use: pcieport

05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 0c)
    Subsystem: Gigabyte Technology Co., Ltd Onboard Ethernet [1458:e000]
    Flags: bus master, fast devsel, latency 0, IRQ 35
    I/O ports at e000 [size=256]
    Memory at f5200000 (64-bit, non-prefetchable) [size=4K]
    Memory at f2100000 (64-bit, prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: r8169
    Kernel modules: r8169

06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0f02] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation GF108 [GeForce GT 730] [10de:0825]
    Flags: bus master, fast devsel, latency 0, IRQ 86
    Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
    Memory at e8000000 (64-bit, prefetchable) [size=128M]
    Memory at f0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at d000 [size=128]
    Expansion ROM at f5000000 [disabled] [size=512K]
    Capabilities: <access denied>
    Kernel driver in use: nouveau
    Kernel modules: nvidiafb, nouveau

06:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
    Subsystem: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0825]
    Flags: bus master, fast devsel, latency 0, IRQ 35
    Memory at f5080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

>>>>>>>>>>>>>>>> 07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 11
    Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Memory at e0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at f000 [size=128]
    Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau

>>>>>>>>>>>>>>>> 07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 83
    Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

>>>>>>>>>>>>>>>> 07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1) (prog-if 30 [XHCI])
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 47
    Memory at e2000000 (64-bit, prefetchable) [size=256K]
    Memory at e2040000 (64-bit, prefetchable) [size=64K]
    Capabilities: <access denied>
    Kernel driver in use: xhci_hcd

>>>>>>>>>>>>>>>> 07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 58
    Memory at f7084000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: <access denied>
    Kernel driver in use: nvidia-gpu
    Kernel modules: i2c_nvidia_gpu

08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function [1022:145a]
    Flags: fast devsel
    Capabilities: <access denied>

08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
    Flags: bus master, fast devsel, latency 0, IRQ 80
    Memory at f7300000 (32-bit, non-prefetchable) [size=1M]
    Memory at f7400000 (32-bit, non-prefetchable) [size=8K]
    Capabilities: <access denied>
    Kernel driver in use: ccp
    Kernel modules: ccp

08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller [1022:145f] (prog-if 30 [XHCI])
    Subsystem: Gigabyte Technology Co., Ltd Zeppelin USB 3.0 Host controller [1458:5007]
    Flags: bus master, fast devsel, latency 0, IRQ 48
    Memory at f7200000 (64-bit, non-prefetchable) [size=1M]
    Capabilities: <access denied>
    Kernel driver in use: xhci_hcd

09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
    Subsystem: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function [1022:1455]
    Flags: fast devsel
    Capabilities: <access denied>

09:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51) (prog-if 01 [AHCI 1.0])
    Subsystem: Gigabyte Technology Co., Ltd FCH SATA Controller [AHCI mode] [1458:b002]
    Flags: bus master, fast devsel, latency 0, IRQ 63
    Memory at f7508000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: <access denied>
    Kernel driver in use: ahci
    Kernel modules: ahci

09:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
    Subsystem: Gigabyte Technology Co., Ltd Family 17h (Models 00h-0fh) HD Audio Controller [1458:a182]
    Flags: bus master, fast devsel, latency 0, IRQ 85
    Memory at f7500000 (32-bit, non-prefetchable) [size=32K]
    Capabilities: <access denied>
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel

I highlithed the devices which are in group 15. Only the NVIDIA GTX 1060 is being used by vfio-pci, the others are in use by other kernel modules. Is this the source of the problem? In order to pass the GTX, I must pass everything in group 15, but these other things are being used by other drivers, not vfio-pci.

Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: 2020-02-19T22:48:02.001713Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.002255Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.002845Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.003340Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.003842Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21]
2020-02-19T22:48:02.024485Z qemu-system-x86_64: -device vfio-pci,host=07:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:07:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.'

Plase take a look at

Please ensure all devices within the iommu_group are bound to their vfio bus driver

This confirms what I thougth, not all devices are being helf by vfio-pci, even though I explicitly said to them.

I think he does that in this part, but for the nvidia driver:

In order to alter the load sequence in favour to vfio_pci before the
nvidia driver, create a file in the modprobe.d folder via sudo nano
/etc/modprobe.d/nvidia.conf and add the the following line:

softdep nouveau pre: vfio-pci softdep nvidia pre: vfio-pci softdep
nvidia* pre: vfio-pci

Is there a way to do the same but for noveau?

Best Answer

I discovered that there's a way to manually unbind kernel modules for specific devices in the pci so I did this little script

echo -n "0000:07:00.1" > /sys/bus/pci/drivers/snd_hda_intel/unbind
echo -n "0000:07:00.1" > /sys/bus/pci/drivers/vfio-pci/bind

echo -n "0000:07:00.2" > /sys/bus/pci/drivers/xhci_hcd/unbind
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/vfio-pci/bind

echo -n "0000:07:00.3" > /sys/bus/pci/drivers/nvidia-gpu/unbind
echo -n "0000:07:00.3" > /sys/bus/pci/drivers/vfio-pci/bind

It hangs for a while (like 2 minutes) because of the echo -n "0000:07:00.3" > /sys/bus/pci/drivers/nvidia-gpu/unbind line but when it finishes, this is the output of lspci -nnv:

7:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:2184] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation TU116 [GeForce GTX 1660] [10de:1324]
    Flags: bus master, fast devsel, latency 0, IRQ 11
    Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Memory at e0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at f000 [size=128]
    Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci
    Kernel modules: nvidiafb, nouveau

07:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:1aeb] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 83
    Memory at f7080000 (32-bit, non-prefetchable) [size=16K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci
    Kernel modules: snd_hda_intel

07:00.2 USB controller [0c03]: NVIDIA Corporation Device [10de:1aec] (rev a1) (prog-if 30 [XHCI])
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 46
    Memory at e2000000 (64-bit, prefetchable) [size=256K]
    Memory at e2040000 (64-bit, prefetchable) [size=64K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci

07:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device [10de:1aed] (rev a1)
    Subsystem: NVIDIA Corporation Device [10de:1324]
    Flags: fast devsel, IRQ 58
    Memory at f7084000 (32-bit, non-prefetchable) [size=4K]
    Capabilities: <access denied>
    Kernel driver in use: vfio-pci
    Kernel modules: i2c_nvidia_gpu

As you can see, all them are using vfio-pci. Then I simply added the GPU to virt-manager and it worked. However I'm still investigating why, in the middle of the windows 10 installation, the entire ubuntu froze forever.

UPDATE:

manually unbinding works to unding the GPU but if you have to unbind this means that the linux driver for the GPU already touched the GPU, so now the GPU knows it was on linux. When you bind it to the VM and start the VM, the Windows driver for the GPU will read the GPU state and know somebody (linux) messed with it before and thus will refuse to work because NVIDIA sucks.

Don't unbind manually, or at least try but it will probably not work. Instead make sure linux drivers never touch the GPU

Related Question