Ubuntu – How to run an OpenCL program in a Docker container with AMDGPU-Pro

amd-graphics docker drivers

I have a fresh Ubuntu 16.04 installation, with only the AMDGPU-Pro (proprietary) driver installed and docker-engine (from the apt.dockerproject.org PPA).

I want to run OpenCL programs inside Docker containers. My reasoning: the kernel is shared, so the GPU kernel module(s) should already be available to the containers; what is missing is the user-space libraries needed to access the module(s).

I set up a container and compiled an OpenCL program inside it. Running the program inside the container reports that no devices were found. When I copied the binary to the host and ran it there, it worked (both of my GPU devices were detected).

I then created a fresh container (from ubuntu:16.04) and copied into it the binary, all the libraries it required from the container used for compilation, and the folder /usr/lib/x86_64-linux-gnu/amdgpu-pro from the host.

Unfortunately, this also didn't work. What could I be missing?

Best Answer

Managed to get it to work. Summary:

  • Need to add to the container the libraries from /usr/lib/x86_64-linux-gnu/amdgpu-pro
  • Need to add to the container the configuration files from /etc/OpenCL
  • Need to allow the container to access the /dev/dri device

Here's an example script to build the docker image: https://gist.github.com/anonymous/fea9c0a9e986eeda7cf58e47f47c89f2
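
For readers who just want the shape of it, here is a rough sketch of how the first two summary points could be turned into a build context. It is not a copy of the linked script. The function name stage_cl_context, the clcontext directory and the host-root argument are made up for illustration, and it assumes that copying the two host directories to the same paths in an ubuntu:16.04 image is enough (any extra linker configuration is not covered).

```shell
#!/bin/sh
# Sketch only: stage the host's driver files into a Docker build context.
# stage_cl_context HOST_ROOT CONTEXT_DIR
#   HOST_ROOT   - prefix of the host filesystem ("" for the real root); hypothetical knob
#   CONTEXT_DIR - build-context directory to create (hypothetical default below)
stage_cl_context() {
    host_root="${1:-}"
    ctx="${2:-./clcontext}"

    mkdir -p "$ctx/amdgpu-pro" "$ctx/OpenCL" || return 1

    # cp -a keeps symlinks intact, which matters for versioned .so names.
    cp -a "$host_root/usr/lib/x86_64-linux-gnu/amdgpu-pro/." "$ctx/amdgpu-pro/" || return 1
    cp -a "$host_root/etc/OpenCL/." "$ctx/OpenCL/" || return 1

    # Assumption: the files belong at the same paths inside the image.
    cat > "$ctx/Dockerfile" <<'EOF'
FROM ubuntu:16.04
COPY amdgpu-pro /usr/lib/x86_64-linux-gnu/amdgpu-pro
COPY OpenCL /etc/OpenCL
EOF
}
```

Something like stage_cl_context "" ./clcontext && docker build -t climage ./clcontext would then produce an image with the tag used in the run command.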

And here's an example command to run a container with the created image:

docker run -it --device /dev/dri:/dev/dri climage
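
To confirm that the --device flag took effect, one option is to check from inside the container that the nodes under /dev/dri exist and can be opened for reading and writing. A minimal sketch follows; the function name check_dri is made up for illustration, and passing this check does not by itself guarantee that OpenCL will find the devices.

```shell
#!/bin/sh
# Sketch only: report which entries under a device directory are readable and writable.
# check_dri [DIR]   (DIR defaults to /dev/dri)
# Note: when run as root, the -r/-w tests succeed regardless of the mode bits.
check_dri() {
    dir="${1:-/dev/dri}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    rc=0
    for node in "$dir"/*; do
        [ -e "$node" ] || continue   # skip the unexpanded glob in an empty directory
        if [ -r "$node" ] && [ -w "$node" ]; then
            echo "ok: $node"
        else
            echo "no access: $node"
            rc=1
        fi
    done
    return "$rc"
}
```

Called with no argument it inspects /dev/dri; in a container started without --device it would likely print the "missing" line.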

In case anyone stumbles into a similar problem, I'll also list how I found out the "solution":

  • Run the test binary on the host using strace to list all syscalls
    • strace ./cltest &> host.strace
  • Run the test binary in the container, also using strace to list all syscalls
    • docker run --rm --security-opt seccomp:unconfined -v $(pwd):/external climage strace /external/cltest &> ./container.strace
  • Compare the two outputs, either manually or using something like vimdiff
    • vimdiff container.strace host.strace
  • See where the results of the calls differ. In some cases files were not being found, so I added them to the image; in other cases the container didn't have permission to open a file (it was in /dev/dri, so I allowed the container to access the device)
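
If the full diff is too noisy, the comparison above can be narrowed to file accesses that fail only in the container. The sketch below is one way to do that, not what I actually ran. The function names are made up, and it assumes strace was run without -f (so each line starts with the syscall name) and that only open, openat, access and stat failures with ENOENT, EACCES or EPERM are of interest.

```shell
#!/bin/sh
# Sketch only: print "ERRNO path" for failed file accesses in an strace log.
failed_paths() {
    grep -E '^(open|openat|access|stat)\(' "$1" \
        | grep -E '= -1 (ENOENT|EACCES|EPERM)' \
        | sed -E 's/^[^"]*"([^"]*)".*= -1 ([A-Z]+).*/\2 \1/' \
        | sort -u
}

# Print the failures that appear in the container log but not in the host log.
# container_only_failures HOST_LOG CONTAINER_LOG
container_only_failures() {
    host_tmp=$(mktemp) || return 1
    cont_tmp=$(mktemp) || return 1
    failed_paths "$1" > "$host_tmp"
    failed_paths "$2" > "$cont_tmp"
    # comm -13 keeps only lines unique to the second (container) list.
    comm -13 "$host_tmp" "$cont_tmp"
    rm -f "$host_tmp" "$cont_tmp"
}
```

With the logs from the steps above, container_only_failures host.strace container.strace lists the candidate files and devices to add to the image or pass through.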