Ubuntu – No CUDA-capable device is detected although requirements are installed

Tags: 16.04, cuda, nvidia

Problem

I just installed CUDA following the official installation instructions via the .deb file. When I get to section 6.2.2.3 (running deviceQuery), I get the message that no CUDA-capable device is detected, although I'm pretty sure everything is set up correctly:

$ ./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL

System information

Here is some information about my system:

$ uname -m && cat /etc/*release
x86_64
DISTRIB_RELEASE=16.04
DISTRIB_DESCRIPTION="Ubuntu 16.04.2 LTS"
VERSION="16.04.2 LTS (Xenial Xerus)"

$ uname -r
4.4.0-64-generic

$ lspci | grep -i nvidia
08:00.0 3D controller: NVIDIA Corporation GK208M [GeForce 920M] (rev a1)
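
A side note on the lspci line above: a device class of "3D controller" (rather than "VGA compatible controller") is often how the discrete GPU of a hybrid-graphics laptop is listed, next to an integrated GPU. The following is a minimal sketch for listing every display device instead of only the NVIDIA one; the helper name filter_gpus is made up for illustration, and the class names it matches are an assumption about typical lspci output.

```shell
# filter_gpus is a hypothetical helper: it keeps only lspci lines whose
# device class looks like a display device.
filter_gpus() {
    grep -Ei 'VGA compatible controller|3D controller|Display controller'
}

# Only run lspci if it is installed; print a note if nothing matches.
if command -v lspci >/dev/null 2>&1; then
    lspci | filter_gpus || echo "no display devices listed"
fi
```

If two devices show up (for example an integrated GPU plus the GeForce 920M), the system may be switching between them, which is worth ruling out before debugging CUDA itself.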

$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

I also verified the kernel headers are installed:

$ sudo apt-get install linux-headers-$(uname -r)
linux-headers-4.4.0-64-generic is already the newest version (4.4.0-64.85).

Installation of CUDA

So my system meets all the prerequisites. I then followed the instructions for the installation via apt-get (I installed cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb).

PATH and LD_LIBRARY_PATH are set to point to the required locations:

$ echo $PATH
/usr/local/cuda-8.0/bin:[...]

$ echo $LD_LIBRARY_PATH 
/usr/local/cuda-8.0/lib64

Note that I set LD_LIBRARY_PATH manually, although this is said to be necessary only for the runfile installation. However, the error persists after resetting LD_LIBRARY_PATH.

The NVIDIA drivers also seem to be up-to-date:

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  367.57  Mon Oct  3 20:37:01 PDT 2016
GCC version:  gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

Information about the cuda compiler driver:

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

The instructions mention that this could be a problem with file permissions:

If a CUDA-capable device and the CUDA Driver are installed but deviceQuery reports that no CUDA-capable devices are present, this likely means that the /dev/nvidia* files are missing or have the wrong permissions.

Those files didn't have the execute flag, which I then added:

$ ls -al /dev/nvidia*
crwxrwxrwx 1 root root 195,   0 Feb 27 13:17 /dev/nvidia0
crwxrwxrwx 1 root root 195, 255 Feb 27 13:17 /dev/nvidiactl
crwxrwxrwx 1 root root 195, 254 Feb 27 13:17 /dev/nvidia-modeset
crwxrwxrwx 1 root root 243,   0 Feb 27 13:17 /dev/nvidia-uvm
crwxrwxrwx 1 root root 243,   1 Feb 27 18:24 /dev/nvidia-uvm-tools

However, after running deviceQuery (which still fails), some of the permissions are reset:

$ ls -al /dev/nvidia*
crwxrwxrwx 1 root root 195,   0 Feb 27 13:17 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Feb 27 13:17 /dev/nvidiactl
crwxrwxrwx 1 root root 195, 254 Feb 27 13:17 /dev/nvidia-modeset
crw-rw-rw- 1 root root 243,   0 Feb 27 13:17 /dev/nvidia-uvm
crw-rw-rw- 1 root root 243,   1 Feb 27 18:24 /dev/nvidia-uvm-tools

That's a bit puzzling, especially because I'm running deviceQuery without sudo.
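
For device nodes, what should matter is read and write access; as far as I know, the execute bit plays no role for character devices, so crw-rw-rw- (0666) should already be sufficient. Below is a small sketch that checks whether the current user can read and write each node. The function name check_nodes is made up for illustration.

```shell
# check_nodes is a hypothetical helper: for each path given, report whether
# it exists and whether the current user has read and write access to it.
check_nodes() {
    for node in "$@"; do
        if [ ! -e "$node" ]; then
            echo "missing: $node"
        elif [ -r "$node" ] && [ -w "$node" ]; then
            echo "ok: $node"
        else
            echo "no access: $node"
        fi
    done
}

# If no node exists, the unexpanded pattern itself is reported as missing.
check_nodes /dev/nvidia*
```

If every node is reported as ok, the permissions are probably not the cause of the deviceQuery failure.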

Maybe related

Samples build fails

When I try to build the CUDA samples via make, it fails for one of them with the message:

/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
Makefile:381: recipe for target 'cudaDecodeGL' failed
make[1]: *** [cudaDecodeGL] Error 1

The library indeed seems to be missing:

$ ls /usr/local/cuda-8.0/lib64/libnvcuvid
ls: cannot access '/usr/local/cuda-8.0/lib64/libnvcuvid': No such file or directory

Although the corresponding header file is there:

$ ls /usr/local/cuda-8.0/targets/x86_64-linux/include/nvcuvid.h 
/usr/local/cuda-8.0/targets/x86_64-linux/include/nvcuvid.h
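
Note that the ls check above uses the bare name libnvcuvid, which would not match a file such as libnvcuvid.so even if it existed. The library may also live outside the toolkit directory, for instance in a driver package directory; that is an assumption, not something confirmed here. A minimal sketch that searches by prefix in several directories (find_lib is a hypothetical helper name, and the directories passed to it are just candidates):

```shell
# find_lib is a hypothetical helper: print every file whose name starts with
# the given prefix, searching each existing directory that follows it.
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -name "${name}*" 2>/dev/null
    done
    return 0
}

# Candidate locations; adjust them to the system at hand.
find_lib libnvcuvid /usr/local/cuda-8.0/lib64 /usr/lib
```

If the library turns up in another directory, that directory has to be on the linker search path for the cudaDecodeGL sample to link.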

Problem with static linking

The output of deviceQuery mentions static linking, which might point to a linking problem:

CUDA Device Query (Runtime API) version (CUDART static linking)

AFAIK LD_LIBRARY_PATH only affects dynamic linking. I found this question, where one suggestion is to add /usr/lib/nvidia-current to the linker path. However, this directory doesn't exist in my installation:

$ ls /usr/lib/nvidia-current
ls: cannot access '/usr/lib/nvidia-current': No such file or directory

Best Answer

It looks like you are on a laptop with NVIDIA Optimus. Have you switched to the NVIDIA GPU using prime-select nvidia?
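
A rough sketch of how to check the current selection before switching, assuming the nvidia-prime package (which provides prime-select) is installed. The function name prime_profile is made up for illustration.

```shell
# prime_profile is a hypothetical helper: print the active PRIME profile,
# or "unavailable" if prime-select is missing or the query fails.
prime_profile() {
    if command -v prime-select >/dev/null 2>&1; then
        prime-select query 2>/dev/null || echo "unavailable"
    else
        echo "unavailable"
    fi
}

profile=$(prime_profile)
echo "Active PRIME profile: $profile"
if [ "$profile" != "nvidia" ]; then
    # Switching needs root and usually a new login session (or a reboot).
    echo "To switch: sudo prime-select nvidia"
fi
```

After switching and logging in again, rerunning deviceQuery should show whether the GPU is now detected.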
