Ubuntu – No CUDA-capable device is detected although requirements are installed

16.04, cuda, nvidia

Problem

I just installed CUDA following the official installation instructions via the .deb file. When I get to section 6.2.2.3 (running deviceQuery), I get the message that no CUDA-capable device was detected, although I'm pretty sure everything is set up correctly:

$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL

System information

Here is some information about my system:

$ uname -m && cat /etc/*release
x86_64
DISTRIB_RELEASE=16.04
DISTRIB_DESCRIPTION="Ubuntu 16.04.2 LTS"
VERSION="16.04.2 LTS (Xenial Xerus)"

$ uname -r
4.4.0-64-generic

$ lspci | grep -i nvidia
08:00.0 3D controller: NVIDIA Corporation GK208M [GeForce 920M] (rev a1)

$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

I also verified the kernel headers are installed:

$ sudo apt-get install linux-headers-$(uname -r)
linux-headers-4.4.0-64-generic is already the newest version (4.4.0-64.85).

Installation of CUDA

So my system meets all the prerequisites. I then followed the instructions for the installation via apt-get (I installed cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb).

PATH and LD_LIBRARY_PATH are set to point to the required locations:

$ echo $PATH
/usr/local/cuda-8.0/bin:[...]

$ echo $LD_LIBRARY_PATH 
/usr/local/cuda-8.0/lib64

Note that I set LD_LIBRARY_PATH manually, although the instructions say this is necessary only for the runfile installation. However, the error persists after resetting LD_LIBRARY_PATH.

The NVIDIA drivers also seem to be up-to-date:

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  367.57  Mon Oct  3 20:37:01 PDT 2016
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

Information about the cuda compiler driver:

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

The instructions mention that this could be a problem with file permissions:

If a CUDA-capable device and the CUDA Driver are installed but deviceQuery reports that no CUDA-capable devices are present, this likely means that the /dev/nvidia* files are missing or have the wrong permissions.

Those files didn't have the execute flag, so I added it:

$ ls -al /dev/nvidia*
crwxrwxrwx 1 root root 195,   0 Feb 27 13:17 /dev/nvidia0
crwxrwxrwx 1 root root 195, 255 Feb 27 13:17 /dev/nvidiactl
crwxrwxrwx 1 root root 195, 254 Feb 27 13:17 /dev/nvidia-modeset
crwxrwxrwx 1 root root 243,   0 Feb 27 13:17 /dev/nvidia-uvm
crwxrwxrwx 1 root root 243,   1 Feb 27 18:24 /dev/nvidia-uvm-tools

However, after running deviceQuery (which still fails), some of the permissions are reset:

$ ls -al /dev/nvidia*
crwxrwxrwx 1 root root 195,   0 Feb 27 13:17 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Feb 27 13:17 /dev/nvidiactl
crwxrwxrwx 1 root root 195, 254 Feb 27 13:17 /dev/nvidia-modeset
crw-rw-rw- 1 root root 243,   0 Feb 27 13:17 /dev/nvidia-uvm
crw-rw-rw- 1 root root 243,   1 Feb 27 18:24 /dev/nvidia-uvm-tools

That's a bit puzzling, especially because I'm running deviceQuery without sudo.
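For what it's worth, the /dev/nvidia* nodes are character devices, so they only need read and write access; the execute bit has no meaning for them. The reset to crw-rw-rw- is probably done by the setuid nvidia-modprobe helper, which the driver libraries can invoke even without sudo (that is an assumption, not something verified on this machine). A minimal sketch to check access for the current user, where check_nodes is a hypothetical helper name:

```shell
#!/bin/sh
# Report whether each given device node exists and is readable and
# writable by the current user. The execute bit is not checked because
# character devices do not need it.
check_nodes() {
    for node in "$@"; do
        if [ ! -e "$node" ]; then
            echo "$node: missing"
        elif [ -r "$node" ] && [ -w "$node" ]; then
            echo "$node: ok"
        else
            echo "$node: no read/write access"
        fi
    done
}

# If the glob matches nothing, the literal pattern is reported as missing.
check_nodes /dev/nvidia*
```

If every node reports ok, permissions are unlikely to be the cause of the deviceQuery failure.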

Maybe related

Samples build fails

When I try to build the CUDA samples via make, one of them fails with the message

/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
Makefile:381: recipe for target 'cudaDecodeGL' failed
make[1]: *** [cudaDecodeGL] Error 1

The library indeed seems to be missing:

$ ls /usr/local/cuda-8.0/lib64/libnvcuvid
ls: cannot access '/usr/local/cuda-8.0/lib64/libnvcuvid': No such file or directory

Although the corresponding header file is there:

$ ls /usr/local/cuda-8.0/targets/x86_64-linux/include/nvcuvid.h 
/usr/local/cuda-8.0/targets/x86_64-linux/include/nvcuvid.h
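Note that the ls on the library path has no wildcard, so it would fail even if libnvcuvid.so were present in that directory. The library is usually shipped with the NVIDIA driver packages rather than with the CUDA toolkit, so it may live elsewhere. A small sketch for searching a few likely directories; find_lib is a hypothetical helper, and the directory names (/usr/lib/nvidia-367 is guessed from the 367.57 driver) are assumptions:

```shell
#!/bin/sh
# Print every file whose name starts with $1 in the remaining directory
# arguments. Directories that do not exist are skipped silently.
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -maxdepth 1 -name "${name}*"
        fi
    done
    return 0
}

# Candidate locations are assumptions; adjust them to your driver version.
find_lib libnvcuvid \
    /usr/local/cuda-8.0/lib64 \
    /usr/lib/nvidia-367 \
    /usr/lib/x86_64-linux-gnu
```

No output means the library was not found in any of the listed directories.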

Problem with static linking

The message printed by deviceQuery seems to suggest a problem with static linking:

CUDA Device Query (Runtime API) version (CUDART static linking)

AFAIK LD_LIBRARY_PATH only affects dynamic linking. I found this question, where one suggestion is to add /usr/lib/nvidia-current to the linker path. However, this directory doesn't exist in my installation:

$ ls /usr/lib/nvidia-current
ls: cannot access '/usr/lib/nvidia-current': No such file or directory

Best Answer

It looks like you are on a laptop with NVIDIA Optimus. Have you switched to the NVIDIA GPU using prime-select nvidia?
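To see which profile is active before switching, prime-select query prints the current one. A minimal sketch; prime_advice is a hypothetical helper, and the advice to log out and back in is an assumption about how the switch takes effect on 16.04:

```shell
#!/bin/sh
# Turn the profile name printed by `prime-select query` into a hint.
prime_advice() {
    case "$1" in
        nvidia) echo "NVIDIA profile active" ;;
        intel)  echo "Intel profile active: run 'sudo prime-select nvidia', then log out and back in" ;;
        *)      echo "unknown profile: $1" ;;
    esac
}

# prime-select is provided by the nvidia-prime package.
if command -v prime-select >/dev/null 2>&1; then
    prime_advice "$(prime-select query)"
else
    echo "prime-select not found (is nvidia-prime installed?)"
fi
```

With the Intel profile active, the NVIDIA GPU is not available to CUDA, which would match the "no CUDA-capable device is detected" message.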

Lesson learned

Before installing CUDA, make sure the NVIDIA drivers are working! You can install them as @ubfan1 suggests in this link.

Run the following commands to check that the driver is installed and loaded.

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  375.82  Wed Jul 19 21:16:49 PDT 2017
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

$ sudo lshw -c video
[sudo] password for marlosdamasceno: 
  *-display               
       description: 3D controller
       product: NVIDIA Corporation
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list
       configuration: driver=nvidia latency=0
       resources: irq:321 memory:a3000000-a3ffffff memory:90000000-9fffffff memory:a0000000-a1ffffff ioport:4000(size=128)
  *-display
       description: VGA compatible controller
       product: Intel Corporation
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 04
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: irq:318 memory:a2000000-a2ffffff memory:b0000000-bfffffff ioport:5000(size=64) memory:c0000-dffff

$ lsmod | grep nvidia
nvidia_uvm            647168  0
nvidia_drm             45056  2
nvidia_modeset        790528  5 nvidia_drm
nvidia              12701696  85 nvidia_modeset,nvidia_uvm
drm_kms_helper        151552  2 i915,nvidia_drm
drm                   352256  6 i915,nvidia_drm,drm_kms_helper


$ nvidia-smi
Fri Sep  8 19:47:17 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.82                 Driver Version: 375.82                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   49C    P0    N/A /  N/A |    536MiB /  4041MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       996    G   /usr/lib/xorg/Xorg                             271MiB |
|    0      1827    G   compiz                                         177MiB |
|    0      2351    G   ...el-token=FDDD25D3486FDA0AB5CD0952493279C6    86MiB |
|    0     14381    G   unity-control-center                             1MiB |
+-----------------------------------------------------------------------------+

To check the Secure Boot state, you can run:

$ mokutil --sb-state
SecureBoot enabled
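The checks above can be wrapped in a small script. This is only a sketch: module_loaded is a hypothetical helper, and it assumes /proc/modules (the file lsmod reads) is available. With Secure Boot enabled, an unsigned nvidia module may be refused by the kernel, which is why that state is worth checking too.

```shell
#!/bin/sh
# Succeed if a module named $1 is listed in the modules file $2
# (defaults to /proc/modules). The trailing space in the pattern keeps
# "nvidia" from matching "nvidia_uvm".
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nvidia; then
    echo "nvidia kernel module: loaded"
else
    echo "nvidia kernel module: not loaded"
fi

# The driver exposes its version here once the module is loaded.
if [ -r /proc/driver/nvidia/version ]; then
    head -n 1 /proc/driver/nvidia/version
fi

# mokutil can fail on non-EFI systems, so its exit status is ignored.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || true
fi
```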

Ubuntu – Ubuntu can't log in after setting LD_LIBRARY_PATH for CUDA

I didn't solve the problem, but I have a workaround for you.

1. Edit /etc/default/grub

Modify GRUB_CMDLINE_LINUX_DEFAULT to

GRUB_CMDLINE_LINUX_DEFAULT='pcie_port_pm=off acpi_backlight=none acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009"'

This step prevents a blank screen after logging in. Run sudo update-grub afterwards so that the change takes effect on the next boot.
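After a reboot you can confirm that the parameters were actually applied by looking at /proc/cmdline. A minimal sketch; has_kernel_param is a hypothetical helper that matches whole space-separated words only:

```shell
#!/bin/sh
# Succeed if the parameter $1 appears as a whole word in the
# space-separated command line string $2.
has_kernel_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

if [ -r /proc/cmdline ] && has_kernel_param pcie_port_pm=off "$(cat /proc/cmdline)"; then
    echo "pcie_port_pm=off is active"
else
    echo "pcie_port_pm=off is not on the running kernel's command line"
fi
```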

2. Put the NVIDIA library directories in /etc/ld.so.conf.d/nvidia.conf

The content of nvidia.conf is

/usr/lib/nvidia-390
/usr/lib32/nvidia-390

These directories depend on the driver version installed on your computer.
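Since the directory names only differ by the driver's major version, the file content can be generated. A minimal sketch; nvidia_conf_lines is a hypothetical helper, and the version number is something you supply yourself (390 here, matching the packages listed at the end of this answer):

```shell
#!/bin/sh
# Print the two library directories for the driver major version $1,
# in the layout used by the nvidia-<version> packages on 16.04.
nvidia_conf_lines() {
    printf '/usr/lib/nvidia-%s\n/usr/lib32/nvidia-%s\n' "$1" "$1"
}

# Preview the content for driver 390.
nvidia_conf_lines 390

# To write the file and refresh the linker cache (needs root):
#   nvidia_conf_lines 390 | sudo tee /etc/ld.so.conf.d/nvidia.conf
#   sudo ldconfig
#   ldconfig -p | grep nvidia-390   # lists libraries resolved from that directory
```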

3. Create /etc/init.d/nvidia

This script disables and enables the NVIDIA runtime libraries.

#!/bin/sh
### BEGIN INIT INFO
# Provides:          nvidia 
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     5
# Default-Stop:      0 6
# Short-Description: load/unload nvidia library
# Description:       load/unload nvidia library
### END INIT INFO

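# Nothing to do when the NVIDIA profile is selected.
PRIME=$(prime-select query)
if [ "$PRIME" = "nvidia" ]; then
    exit 0
fi

case "$1" in
  start)
    # Wait, then re-enable the NVIDIA libraries and refresh the linker cache.
    sleep 10
    cd /etc/ld.so.conf.d
    mv nvidia.conf.bak nvidia.conf
    ldconfig
    nvidia-smi
    ;;
  stop)
    # Hide the NVIDIA libraries from the linker and refresh the cache.
    cd /etc/ld.so.conf.d
    mv nvidia.conf nvidia.conf.bak
    ldconfig
    ;;
esac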

4. Execute update-rc.d nvidia defaults

You should find SXXnvidia in /etc/rc5.d/ and KXXnvidia in /etc/rc6.d/ and /etc/rc0.d/. The script must be executable (chmod +x /etc/init.d/nvidia) for these links to run it.

Try executing /etc/init.d/nvidia stop and then nvidia-smi; you should see error messages about libraries not being found.

Try executing /etc/init.d/nvidia start; nvidia-smi should then work again.

If everything is OK, you can reboot now. You should be able to log in to the desktop.

5. If anything goes wrong

The most likely problem is that the nvidia script was not executed. If that happens, press Ctrl+Alt+F1 to switch to a TTY and execute /etc/init.d/nvidia stop; reboot. You can then get back to the Unity desktop to debug.

6. Known side effect

When Intel is selected as the PRIME GPU, unity-control-center (System Settings) fails to start:

GLib-CRITICAL **: g_strsplit: assertion `string != NULL' failed.

Note: my system spec

# uname -r
4.13.0-32-generic
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:    16.04
Codename:   xenial
# dpkg -l | grep cuda
ii  cuda-9-0                                    9.0.176-1                                    amd64        CUDA 9.0 meta-package
ii  cuda-command-line-tools-9-0                 9.0.176-1                                    amd64        CUDA command-line tools
ii  cuda-core-9-0                               9.0.176-1                                    amd64        CUDA core tools
ii  cuda-cublas-9-0                             9.0.176.1-1                                  amd64        CUBLAS native runtime libraries
ii  cuda-cublas-dev-9-0                         9.0.176.1-1                                  amd64        CUBLAS native dev links, headers
ii  cuda-cudart-9-0                             9.0.176-1                                    amd64        CUDA Runtime native Libraries
ii  cuda-cudart-dev-9-0                         9.0.176-1                                    amd64        CUDA Runtime native dev links, headers
ii  cuda-cufft-9-0                              9.0.176-1                                    amd64        CUFFT native runtime libraries
ii  cuda-cufft-dev-9-0                          9.0.176-1                                    amd64        CUFFT native dev links, headers
ii  cuda-curand-9-0                             9.0.176-1                                    amd64        CURAND native runtime libraries
ii  cuda-curand-dev-9-0                         9.0.176-1                                    amd64        CURAND native dev links, headers
ii  cuda-cusolver-9-0                           9.0.176-1                                    amd64        CUDA solver native runtime libraries
ii  cuda-cusolver-dev-9-0                       9.0.176-1                                    amd64        CUDA solver native dev links, headers
ii  cuda-cusparse-9-0                           9.0.176-1                                    amd64        CUSPARSE native runtime libraries
ii  cuda-cusparse-dev-9-0                       9.0.176-1                                    amd64        CUSPARSE native dev links, headers
ii  cuda-demo-suite-9-0                         9.0.176-1                                    amd64        Demo suite for CUDA
ii  cuda-documentation-9-0                      9.0.176-1                                    amd64        CUDA documentation
ii  cuda-driver-dev-9-0                         9.0.176-1                                    amd64        CUDA Driver native dev stub library
ii  cuda-drivers                                390.12-1                                     amd64        CUDA Driver meta-package
ii  cuda-libraries-9-0                          9.0.176-1                                    amd64        CUDA Libraries 9.0 meta-package
ii  cuda-libraries-dev-9-0                      9.0.176-1                                    amd64        CUDA Libraries 9.0 development meta-package
ii  cuda-license-9-0                            9.0.176-1                                    amd64        CUDA licenses
ii  cuda-misc-headers-9-0                       9.0.176-1                                    amd64        CUDA miscellaneous headers
ii  cuda-npp-9-0                                9.0.176-1                                    amd64        NPP native runtime libraries
ii  cuda-npp-dev-9-0                            9.0.176-1                                    amd64        NPP native dev links, headers
ii  cuda-nvgraph-9-0                            9.0.176-1                                    amd64        NVGRAPH native runtime libraries
ii  cuda-nvgraph-dev-9-0                        9.0.176-1                                    amd64        NVGRAPH native dev links, headers
ii  cuda-nvml-dev-9-0                           9.0.176-1                                    amd64        NVML native dev links, headers
ii  cuda-nvrtc-9-0                              9.0.176-1                                    amd64        NVRTC native runtime libraries
ii  cuda-nvrtc-dev-9-0                          9.0.176-1                                    amd64        NVRTC native dev links, headers
ii  cuda-repo-ubuntu1604                        9.1.85-1                                     amd64        cuda repository configuration files
ii  cuda-runtime-9-0                            9.0.176-1                                    amd64        CUDA Runtime 9.0 meta-package
ii  cuda-samples-9-0                            9.0.176-1                                    amd64        CUDA example applications
ii  cuda-toolkit-9-0                            9.0.176-1                                    amd64        CUDA Toolkit 9.0 meta-package
ii  cuda-visual-tools-9-0                       9.0.176-1                                    amd64        CUDA visual tools
ii  libcuda1-390                                390.12-0ubuntu1                              amd64        NVIDIA CUDA runtime library
ii  libcudnn7                                   7.0.5.15-1+cuda9.0                           amd64        cuDNN runtime libraries
ii  libcudnn7-dev                               7.0.5.15-1+cuda9.0                           amd64        cuDNN development libraries and headers
# dpkg -l | grep nvidia
ii  nvidia-390                                  390.12-0ubuntu1                              amd64        NVIDIA binary driver - version 390.12
ii  nvidia-390-dev                              390.12-0ubuntu1                              amd64        NVIDIA binary Xorg driver development files
ii  nvidia-modprobe                             390.12-0ubuntu1                              amd64        Load the NVIDIA kernel driver and create device files
ii  nvidia-opencl-icd-390                       390.12-0ubuntu1                              amd64        NVIDIA OpenCL ICD
ii  nvidia-prime                                0.8.2                                        amd64        Tools to enable NVIDIA's Prime
ii  nvidia-settings                             390.12-0ubuntu1                              amd64        Tool for configuring the NVIDIA graphics driver