Solaris – Ulimit for Stack Size: Per Process or Per Thread Limit?


So we have a program on Solaris which was running out of stack space.

While investigating this, I had a brief look at what ulimit was for the stack:

user@solaris-box:~$ ulimit -a
...
stack size              (kbytes, -s) 8192

So the stack size limit is 8 megabytes. But is this the limit for the whole process?

What if my process has 10 threads: are they only allowed 819k per thread (or some mix thereof, up to 8MiB)?

I can't find any documentation on this.

Best Answer

SUMMARY

For the main thread, you have to call setrlimit() (perhaps by using ulimit) before the process is started in order to ensure the larger stack size is effective.

For threads started by the process, you need to use pthread_attr_setstacksize() as the thread stacksize is not affected by the stack size resource limit from setrlimit()/getrlimit() at all.

The code needs to look something like this:

pthread_t tid;
pthread_attr_t attr;
pthread_attr_init( &attr );

// 32MB stack size example - should **NOT** hardcode this
// but get it from an environment variable or property setting
size_t stacksize = 32UL * 1024UL * 1024UL;
pthread_attr_setstacksize( &attr, stacksize );

pthread_create( &tid, &attr, start_func, thread_arg );

// the attribute object is no longer needed once the thread exists
pthread_attr_destroy( &attr );
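As a hedged sketch of the "get it from an environment variable" advice in the comment above: the helper names and the variable name APP_THREAD_STACK_BYTES are made up for illustration, and only standard C and POSIX calls are used. It is one way to do it, not the only way.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a stack size in bytes from the environment variable `name`
 * (hypothetical, e.g. APP_THREAD_STACK_BYTES). Returns `fallback` when the
 * variable is unset, empty, not a plain decimal number, zero, or too large. */
size_t stack_size_from_env(const char *name, size_t fallback)
{
    const char *text = getenv(name);
    if (text == NULL || text[0] < '0' || text[0] > '9')
        return fallback;

    char *end = NULL;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0' || value == 0 || value > SIZE_MAX)
        return fallback;
    return (size_t)value;
}

/* Create a thread with an explicit stack size. Returns 0 on success or the
 * error number reported by the failing pthread call. */
int create_thread_with_stack(pthread_t *tid, size_t stacksize,
                             void *(*start_func)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stacksize);
    if (rc == 0)
        rc = pthread_create(tid, &attr, start_func, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller might combine the two as create_thread_with_stack( &tid, stack_size_from_env( "APP_THREAD_STACK_BYTES", 32UL * 1024UL * 1024UL ), start_func, thread_arg ), keeping the 32MB value only as a fallback.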

You can get the thread stack size from the current stack size limit:

struct rlimit limits;

getrlimit( RLIMIT_STACK, &limits );
size_t stacksize = limits.rlim_cur; // use rlim_max for hard limit; either may be RLIM_INFINITY
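One caveat with the snippet above is that the limit can be RLIM_INFINITY ("unlimited"), which is not a usable stack size. A minimal sketch that guards against that follows; the function names and the fallback parameter are illustrative assumptions, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/* Convert one RLIMIT_STACK value into a usable thread stack size.
 * RLIM_INFINITY, zero, and values that do not fit in size_t are replaced
 * by `fallback`. */
size_t stack_size_from_limit(rlim_t limit, size_t fallback)
{
    if (limit == RLIM_INFINITY || limit == 0)
        return fallback;
    if (limit > (rlim_t)SIZE_MAX)
        return fallback;
    return (size_t)limit;
}

/* Read the soft RLIMIT_STACK limit of the current process and turn it into
 * a thread stack size, returning `fallback` if it cannot be used. */
size_t thread_stack_size_from_rlimit(size_t fallback)
{
    struct rlimit limits;
    if (getrlimit(RLIMIT_STACK, &limits) != 0)
        return fallback;
    return stack_size_from_limit(limits.rlim_cur, fallback);
}
```

The result can then be passed to pthread_attr_setstacksize(), so that new threads get the same limit ulimit -s reports for the main thread.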

(Note that if you're using a library that creates its own threads, such as OpenMPI, that library may have its own documented method of setting the thread stack size.)

DETAILED ANSWER

Resource limits are set from the command line by the ulimit utility.

If you run truss -f -a -vall -o /tmp/truss.out /usr/bin/ulimit -a, you'll see

address space limit (kbytes)   (-M)  unlimited
core file size (blocks)        (-c)  unlimited
cpu time (seconds)             (-t)  unlimited
data size (kbytes)             (-d)  unlimited
file size (blocks)             (-f)  unlimited
locks                          (-x)  not supported
locked address space (kbytes)  (-l)  not supported
message queue size (kbytes)    (-q)  not supported
nice                           (-e)  not supported
nofile                         (-n)  1024
nproc                          (-u)  29995
pipe buffer size (bytes)       (-p)  5120
max memory size (kbytes)       (-m)  not supported
rtprio                         (-r)  not supported
socket buffer size (bytes)     (-b)  5120
sigpend                        (-i)  128
stack size (kbytes)            (-s)  8192
swap size (kbytes)             (-w)  not supported
threads                        (-T)  not supported
process size (kbytes)          (-v)  unlimited

And if you look into /tmp/truss.out, you'll see

7752:   execve("/usr/bin/ulimit", 0xFFFF80FFBFFFF9E8, 0xFFFF80FFBFFFFA00)  argc = 2
7752:    argv: /usr/bin/ulimit -a
7752:   sysinfo(SI_MACHINE, "i86pc", 257)       = 6

  much deleted extraneous data (loading shared libraries, etc)...

7752:   getrlimit(RLIMIT_VMEM, 0xFFFF80FFBFFFD4B0)  = 0
7752:       cur = RLIM64_INFINITY  max = RLIM64_INFINITY
7752:   getrlimit(RLIMIT_CORE, 0xFFFF80FFBFFFD4B0)  = 0
7752:       cur = RLIM64_INFINITY  max = RLIM64_INFINITY
7752:   getrlimit(RLIMIT_CPU, 0xFFFF80FFBFFFD4B0)   = 0
7752:       cur = RLIM64_INFINITY  max = RLIM64_INFINITY
7752:   getrlimit(RLIMIT_DATA, 0xFFFF80FFBFFFD4B0)  = 0
7752:       cur = RLIM64_INFINITY  max = RLIM64_INFINITY
7752:   getrlimit(RLIMIT_FSIZE, 0xFFFF80FFBFFFD4B0) = 0
7752:       cur = RLIM64_INFINITY  max = RLIM64_INFINITY
7752:   getrlimit(RLIMIT_NOFILE, 0xFFFF80FFBFFFD4B0)    = 0
7752:       cur = 1024  max = 65536
7752:   sysconfig(_CONFIG_CHILD_MAX)            = 29995
7752:   pathconf("/", _PC_PIPE_BUF)         = 5120
7752:   pathconf("/", _PC_PIPE_BUF)         = 5120
7752:   sysconfig(_CONFIG_SIGQUEUE_MAX)         = 128
7752:   getrlimit(RLIMIT_STACK, 0xFFFF80FFBFFFD4B0) = 0
7752:       cur = 8388608  max = RLIM64_INFINITY
7752:   getrlimit(RLIMIT_VMEM, 0xFFFF80FFBFFFD4B0)  = 0
7752:       cur = RLIM64_INFINITY  max = RLIM64_INFINITY
7752:   write(1, " a d d r e s s   s p a c".., 942) = 942

We see that ulimit uses the getrlimit() (and setrlimit()) library functions to get/set resource limits.

Per the getrlimit()/setrlimit() man page:

RLIMIT_STACK

The maximum size of a process's stack in bytes. The system will not automatically grow the stack beyond this limit.

Within a process, setrlimit() will increase the limit on the size of your stack, but will not move current memory segments to allow for that growth. To guarantee that the process stack can grow to the limit, the limit must be altered prior to the execution of the process in which the new stack size is to be used.

Within a multithreaded process, setrlimit() has no impact on the stack size limit for the calling thread if the calling thread is not the main thread. A call to setrlimit() for RLIMIT_STACK impacts only the main thread's stack, and should be made only from the main thread, if at all.

The SIGSEGV signal is sent to the process. If the process is holding or ignoring SIGSEGV, or is catching SIGSEGV and has not made arrangements to use an alternate stack (see sigaltstack(2)), the disposition of SIGSEGV will be set to SIG_DFL before it is sent.

So, the stacksize for threads created by the process that are not the main thread is not affected by the process's RLIMIT_STACK resource limit. And you have to call setrlimit() BEFORE the process starts (in the parent process) in order to ensure that any larger stack size limit is actually effective.
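To illustrate the "call setrlimit() before the process starts" point, here is a rough sketch of a launcher that sets the limit in a forked child and then execs the real program. It does by hand what running ulimit -s in a shell before starting the program does. The function name run_with_stack_limit and its exit-code conventions are assumptions made for this example.

```c
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up via PATH) with a soft RLIMIT_STACK of `bytes`.
 * The limit is set in the child before exec, so the new program's main
 * thread starts with it already in place. Returns the program's exit
 * status, or -1 if it could not be run or did not exit normally. */
int run_with_stack_limit(rlim_t bytes, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit limits;
        if (getrlimit(RLIMIT_STACK, &limits) != 0)
            _exit(126);
        /* The soft limit cannot exceed the hard limit without privileges. */
        if (limits.rlim_max != RLIM_INFINITY && bytes > limits.rlim_max)
            bytes = limits.rlim_max;
        limits.rlim_cur = bytes;
        if (setrlimit(RLIMIT_STACK, &limits) != 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This only affects the main thread of the launched program. Threads it creates still take their stack size from their pthread attributes.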

Per the pthread_create() man page:

A new thread created with pthread_create() uses the stack specified by the stackaddr attribute, and the stack continues for the number of bytes specified by the stacksize attribute. By default, the stack size is 1 megabyte for 32-bit processes and 2 megabyte for 64-bit processes (see pthread_attr_setstacksize(3C)). If the default is used for both the stackaddr and stacksize attributes, pthread_create() creates a stack for the new thread with at least 1 megabyte for 32-bit processes and 2 megabyte for 64-bit processes. (For customizing stack sizes, see NOTES ).

...

Notes

...

A user-specified stack size must be greater than the value PTHREAD_STACK_MIN. A minimum stack size may not accommodate the stack frame for the user thread function start_func. If a stack size is specified, it must accommodate start_func requirements and the functions that it may call in turn, in addition to the minimum requirement.

It is usually very difficult to determine the runtime stack requirements for a thread. PTHREAD_STACK_MIN specifies how much stack storage is required to execute a NULL start_func. The total runtime requirements for stack storage are dependent on the storage required to do runtime linking, the amount of storage required by library runtimes (as printf()) that your thread calls. Since these storage parameters are not known before the program runs, it is best to use default stacks. If you know your runtime requirements or decide to use stacks that are larger than the default, then it makes sense to specify your own stacks.
