Linux – Fix ‘Resource Temporarily Unavailable’ Error When Creating Threads

dockerforklimitlinuxthread

I am running a docker server on Arch Linux (kernel 4.3.3-2) with several containers. Since my last reboot, both the docker server and random programs within the containers crash with a message about not being able to create a thread, or (less often) to fork. The specific error message is different depending on the program, but most of them seem to mention the specific error Resource temporarily unavailable. See at the end of this post for some example error messages.

Now there are plenty of people who have had this error message, and plenty of responses to them. What’s really frustrating is that everyone seems to be speculating how the issue could be resolved, but no one seems to point out how to identify which of the many possible causes for the problem is present.

I have collected these 5 possible causes for the error and how to verify that they are not present on my system:

  1. There is a system-wide limit on the number of threads configured in /proc/sys/kernel/threads-max (source). In my case this is set to 60613.
  2. Every thread takes some space in the stack. The stack size limit is configured using ulimit -s (source). The limit for my shell used to be 8192, but I have increased it by putting * soft stack 32768 into /etc/security/limits.conf, so it ulimit -s now returns 32768. I have also increased it for the docker process by putting LimitSTACK=33554432 into /etc/systemd/system/docker.service (source, and I verified that the limit applies by looking into /proc/<pid of docker>/limits and by running ulimit -s inside a docker container.
  3. Every thread takes some memory. A virtual memory limit is configured using ulimit -v. On my system it is set to unlimited, and 80% of my 3 GB of memory are free.
  4. There is a limit on the number of processes using ulimit -u. Threads count as processes in this case (source). On my system, the limit is set to 30306, and for the docker daemon and inside docker containers, the limit is 1048576. The number of currently running threads can be found out by running ls -1d /proc/*/task/* | wc -l or by running ps -elfT | wc -l (source). On my system they are between 700 and 800.
  5. There is a limit on the number of open files, which according to some sources is also relevant when creating threads. The limit is configured using ulimit -n. On my system and inside docker, the limit is set to 1048576. The number of open files can be found out using lsof | wc -l (source), on my system it is about 30000.

It looks like before the last reboot I was running kernel 4.2.5-1, now I’m running 4.3.3-2. Downgrading to 4.2.5-1 fixes all the problems. Other posts mentioning the problem are this and this. I have opened a bug report for Arch Linux.

What has changed in the kernel that could be causing this?


Here are some example error messages:

Crash dump was written to: erl_crash.dump
Failed to create aux thread

 

Jan 07 14:37:25 edeltraud docker[30625]: runtime/cgo: pthread_create failed: Resource temporarily unavailable

 

dpkg: unrecoverable fatal error, aborting:
 fork failed: Resource temporarily unavailable
E: Sub-process /usr/bin/dpkg returned an error code (2)

 

test -z "/usr/include" || /usr/sbin/mkdir -p "/tmp/lib32-popt/pkg/lib32-popt/usr/include"
/bin/sh: fork: retry: Resource temporarily unavailable
 /usr/bin/install -c -m 644 popt.h '/tmp/lib32-popt/pkg/lib32-popt/usr/include'
test -z "/usr/share/man/man3" || /usr/sbin/mkdir -p "/tmp/lib32-popt/pkg/lib32-popt/usr/share/man/man3"
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable
make[3]: *** [install-man3] Error 254

 

Jan 07 11:04:39 edeltraud docker[780]: time="2016-01-07T11:04:39.986684617+01:00" level=error msg="Error running container: [8] System error: fork/exec /proc/self/exe: resource temporarily unavailable"

 

[Wed Jan 06 23:20:33.701287 2016] [mpm_event:alert] [pid 217:tid 140325422335744] (11)Resource temporarily unavailable: apr_thread_create: unable to create worker thread

Best Answer

The problem is caused by the TasksMax systemd attribute. It was introduced in systemd 228 and makes use of the cgroups pid subsystem, which was introduced in the linux kernel 4.3. A task limit of 512 is thus enabled in systemd if kernel 4.3 or newer is running. The feature is announced here and was introduced in this pull request and the default values were set by this pull request. After upgrading my kernel to 4.3, systemctl status docker displays a Tasks line:

# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/etc/systemd/system/docker.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2016-01-15 19:58:00 CET; 1min 52s ago
     Docs: https://docs.docker.com
 Main PID: 2770 (docker)
    Tasks: 502 (limit: 512)
   CGroup: /system.slice/docker.service

Setting TasksMax=infinity in the [Service] section of docker.service fixes the problem. docker.service is usually in /usr/share/systemd/system, but it can also be put/copied in /etc/systemd/system to avoid it being overridden by the package manager.

A pull request is increasing TasksMax for the docker example systemd files, and an Arch Linux bug report is trying to achieve the same for the package. There is some additional discussion going on on the Arch Linux Forum and in an Arch Linux bug report regarding lxc.

DefaultTasksMax can be used in the [Manager] section in /etc/systemd/system.conf (or /etc/systemd/user.conf for user-run services) to control the default value for TasksMax.

Systemd also applies a limit for programs run from a login-shell. These default to 4096 per user (will be increased to 12288) and are configured as UserTasksMax in the [Login] section of /etc/systemd/logind.conf.