Installing CUDA 10.1 on Ubuntu 20.04 – Root Folder Empty Issue

Tags: 20.04, cuda, drivers, nvidia

I'm trying to install CUDA on Ubuntu 20.04 the default way. However, after completing all the steps successfully, I found that the CUDA root folder is almost empty (its actual size is 24 KB). Where are the files that should be under this folder, and where can I find them? Thank you in advance!

I installed CUDA with:

$ sudo apt-get install nvidia-cuda-toolkit

These are all the CUDA paths found on the system:

$ locate cuda | grep /cuda$

/usr/include/thrust/system/cuda
/usr/lib/cuda
/usr/share/doc/libthrust-dev/examples/cuda

As you can see, almost all the files under it are missing:

$ sudo du -sh /usr/lib/cuda
24K /usr/lib/cuda

$ ls
bin  include  lib64  nvvm  version.txt

And this is my driver version:

$ nvidia-smi
Sat Jul  4 15:53:54 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K600         Off  | 00000000:02:00.0  On |                  N/A |
| 25%   51C    P8    N/A /  N/A |    333MiB /   979MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...  Off  | 00000000:03:00.0 Off |                  N/A |
|  0%   41C    P8     4W / 120W |      2MiB /  6078MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

Best Answer

CUDA 10.2 on Ubuntu 20.04, Kernel 5.4.0-40

This procedure avoids any package manager involvement, and allows you to keep the tested version 440 Nvidia drivers. The way Nvidia packages the deb files changes all the time, so this may not work for CUDA 11, but should give you an idea of what needs to be done.

Update your Ubuntu 20.04 Nvidia drivers to the latest (tested) version (440 as of Jul. 2, 2020). Run Software and Updates, select the Additional Drivers tab, and make your Nvidia selection. Software such as compilers and kernel headers should already be present so that the Nvidia module can be built. Once it is built, reboot and run nvidia-smi to ensure you are running your selected Nvidia drivers.
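
One way to confirm the running driver after the reboot is to ask nvidia-smi for just the version string. This is a minimal sketch that assumes nvidia-smi is on the PATH; the helper name driver_major is made up for illustration and is not part of any Nvidia tool.

```shell
# Hypothetical helper: print the major part of a driver version string,
# e.g. "440.100" becomes "440".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Query the running driver only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    echo "Driver $version (major $(driver_major "$version"))"
fi
```

If the major version printed is not the one you selected in Additional Drivers, the module was probably not rebuilt or loaded.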

In your browser, go to the Nvidia site, and select your CUDA version to download.

https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal

A window will open with a Base Installer script. Just copy the wget line and download the offered 1.8 GB .deb file into the directory of your choice.
wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb

Check the hash of the downloaded deb file with md5sum.

md5sum cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
4dfcc4d2bcca28e2f4b40f54171374ec  cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb

Then compare the result against the checksums supplied at the "Installer Checksums" link under the script.
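
The comparison can also be scripted instead of done by eye. The following is a sketch only; verify_md5 is a hypothetical helper name, and the expected hash must be the one you copied from Nvidia's checksum page.

```shell
# Hypothetical helper: succeed only if the file's MD5 matches the expected hash.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

For the file above, the call would be: verify_md5 cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb 4dfcc4d2bcca28e2f4b40f54171374ec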

Unpack the .deb file (its contents are just other deb files).
Avoid unpacking the Nvidia deb files, but unpack all the others. All but one of the nvidia and libxnv... files should already be installed by a standard Nvidia driver install from "Software and Updates". The one to install manually is libxnvctrl-dev; the copy in the CUDA download will likely be an earlier version than the system one.

cd to your cuda location and run:

dpkg-deb --extract cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb cuda102

Selecting an "install" directory, like cuda102, allows other CUDA versions to be installed in parallel if necessary. Depending on the CUDA release, the deb files may be in further subdirectories. I found it useful to create a separate directory to hold the final set of files copied from the "install" directory. For example, unpack the deb files into /usr/local/data/cuda/cuda102, then use /usr/local/data/cuda/cuda-10.2 as the final setup location. The final setup will not have the deep directories of the "install" directory.

Add a link from /usr/local named cuda-10.2 to wherever you unpacked the debs. e.g.:

sudo ln -s /usr/local/data/cuda/cuda-10.2 /usr/local/cuda-10.2

sudo apt-get install libxnvctrl-dev nvidia-headless-440 nvidia-headless-no-dkms-440 nvidia-modprobe
sudo apt-get install libglu1-mesa-dev freeglut3-dev

You will have etc, usr, and var directories created, with further subdirectories containing more .deb files in the var and usr directories.

cd into var/cuda-repo-10-2-local-10.2.89-440.33.01. You will see about 70 deb files, about 20 of which are nvidia ones to delete (or ignore).

The cuda-compat-10-2_440.33.01-1_amd64.deb has no system equivalent and contains some programs with hard-wired 440.33.... version names, so leave it and hope it works. The cuda-drivers_440.33.01-1_amd64.deb contains nothing but a changelog.

for f in *deb; do
 echo "Unpacking deb $f"
 dpkg-deb --extract "$f" .
done
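
If you would rather skip the driver debs than delete them first, the loop can filter by file name. This is a sketch under stated assumptions: the helper names are made up, and the name patterns (anything containing "nvidia", or starting with "libxnv") are a guess at what counts as a driver package, so check them against your own file listing before relying on it.

```shell
# Hypothetical helper: succeed if a .deb name looks like a driver package.
# Assumed patterns: contains "nvidia", or starts with "libxnv".
is_driver_deb() {
    case "$(basename "$1")" in
        *nvidia*|libxnv*) return 0 ;;
        *) return 1 ;;
    esac
}

# Extract every non-driver .deb in the current directory into the
# directory given as the first argument (default: current directory).
unpack_cuda_debs() {
    dest=${1:-.}
    for f in *.deb; do
        [ -e "$f" ] || continue    # the glob matched nothing
        if is_driver_deb "$f"; then
            echo "Skipping driver deb $f"
        else
            echo "Unpacking deb $f"
            dpkg-deb --extract "$f" "$dest"
        fi
    done
}
```

Run unpack_cuda_debs . from inside the directory holding the deb files to get the same result as the loop above, minus the driver packages.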

See the "Installation Guide" link just below the script. It will have the system requirements which you should install yourself, since you will not be running the supplied installers.

Move the contents of the deeply embedded ...cuda-10.2 directory up to the install directory (cuda102). There are no conflicts. (For now, just move the cuda-10.2 directory up to cuda.) The embedded directory is:
...cuda102/var/cuda-repo-10-2-local-10.2.89-440.33.01/usr/local/cuda-10.2

Now collect the remaining scattered files (or just leave them until needed, if ever):
var/c*/src: the Fortran files are already present in the cuda-10.2 src, so delete them.
var/c*/include: copy the cublas .h files over to cuda-10.2/include, then delete them.
var/c*/share/*: move everything to cuda-10.2/share (2 dirs, no overlap), then delete the share dir.
lib/pkgconfig: leave this dir until needed.
opt: leave it; it only has some Nvidia Nsight material of unknown utility.
etc: leave it; it has a config file containing /usr/local/cuda-10.2/targets/x86_64-linux/lib, which I would think is unneeded with a proper LD_LIBRARY_PATH.

You should now have a location with all the CUDA libraries and binaries. Add its bin and lib64 directories to the beginning of your PATH and LD_LIBRARY_PATH, respectively. Note the required gcc version for your CUDA selection -- the default 9.x version should work, but the samples' makefiles will be set up to limit to a specific, earlier, version. For current CUDA releases, the earlier compiler versions should be available in the standard repositories, and are already installed, except for g++-8.
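
Prepending the two directories might look like the lines below, for example in ~/.bashrc. The variable name CUDA_HOME is an assumption chosen for readability, and the path assumes the /usr/local/cuda-10.2 link created earlier.

```shell
# Assumed install location (the /usr/local/cuda-10.2 link created earlier).
CUDA_HOME=/usr/local/cuda-10.2

# Put the CUDA directories first so they take precedence over system paths.
# The ${VAR:+:$VAR} form avoids a trailing colon when the variable is unset.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Placing the CUDA entries first matters later, when compiler links in cuda/bin are meant to shadow the system defaults.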

sudo apt-get install g++-8

Ubuntu 20.04 supplies gcc-8, but gcc-9 is the default when "gcc" is invoked. Older CUDA releases may require a compiler older than those supplied in the standard repositories, so use an older Ubuntu release archive or get the source.

/usr/local/cuda-10.2/bin/../targets/x86_64-linux/include/crt/host_config.h:138:2:
 error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
  138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
      |  ^~~~~

The gcc-8 version supplied by default is 8.4, which does not trigger the error message. Forgetting to install g++-8 will cause misleading errors about gcc versions. Manually changing the HOST_COMPILER may work, but may lead to undefined symbols.

HOST_COMPILER=/usr/bin/gcc-8 make
undefined reference to symbol '_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE7compareEPKc@@GLIBCXX_3.4.21'
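
To see in advance whether the compiler that "gcc" resolves to would trip the host_config.h check, its major version can be compared against the limit. This is only a sketch: gcc_major_ok is a hypothetical helper, and the limit of 8 is taken from the error message above for CUDA 10.2.

```shell
# Hypothetical helper: succeed if a gcc version string (e.g. "9.3.0" or "8")
# has a major version no greater than the given limit.
gcc_major_ok() {
    major=${1%%.*}
    [ "$major" -le "$2" ]
}

# Report on whatever "gcc" currently resolves to, if it is installed.
if command -v gcc >/dev/null 2>&1; then
    if gcc_major_ok "$(gcc -dumpversion)" 8; then
        echo "default gcc is accepted by CUDA 10.2"
    else
        echo "default gcc is too new for CUDA 10.2; use gcc-8 and g++-8"
    fi
fi
```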

Add soft links to these earlier-version tools (gcc, g++, nm, ar, ranlib) in your cuda/bin directory. Since cuda/bin is first in your PATH, they should override the system defaults. Avoid using the update-alternatives mechanism to change the system's default compiler. Every kernel update needs to recompile parts of the Nvidia video driver, and an old compiler version is untested for this and may not work.
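
Creating the links could be scripted roughly as follows. The helper name link_host_tools is hypothetical, and the mapping of ar, nm and ranlib to gcc-ar-8, gcc-nm-8 and gcc-ranlib-8 is an assumption about which versioned binaries are meant; adjust it if your system names them differently.

```shell
# Hypothetical helper: link versioned compiler tools into a bin directory
# under their unversioned names.
# Usage: link_host_tools <source bin dir> <destination bin dir> <gcc major version>
link_host_tools() {
    src_bin=$1
    dest_bin=$2
    ver=$3
    # Each entry is "<link name>:<versioned tool name>" (assumed mapping).
    for pair in "gcc:gcc-$ver" "g++:g++-$ver" "ar:gcc-ar-$ver" \
                "nm:gcc-nm-$ver" "ranlib:gcc-ranlib-$ver"; do
        name=${pair%%:*}
        target=$src_bin/${pair#*:}
        if [ -x "$target" ]; then
            ln -sfn "$target" "$dest_bin/$name"
        else
            echo "skipping $name: $target not found" >&2
        fi
    done
}
```

A typical call would be link_host_tools /usr/bin /usr/local/cuda-10.2/bin 8, run with sudo if the destination is not writable by your user.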

If the samples are not included in your initial deb, get their deb, install them, and copy the samples directory to a writable location, taking ownership. Try to make a sample, like 5_simulation/nbody. You should just have to type "make", and the compile and link steps should work, producing an executable, nbody. Run it with ./nbody.

You can also run make from the top level and note any libraries missing for some samples. At least one sample, simpleDevice..., seems to need a lot of memory, maybe more than is available.
