I was under the impression that the maximum length of a single argument was not the problem here so much as the total size of the overall argument array plus the size of the environment, which is limited to ARG_MAX. Thus I thought that something like the following would succeed:
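# Size of the shell's environment in bytes (strings plus their NUL terminators)
env_size=$(cat /proc/$$/environ | wc -c)
# Leave 100 bytes of slack below ARG_MAX
(( arg_size = $(getconf ARG_MAX) - env_size - 100 ))
/bin/echo $(tr -dc '[:alnum:]' </dev/urandom | head -c $arg_size) >/dev/null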
The - 100 should be more than enough to account for the difference between the size of the environment in the shell and that of the echo process. Instead I got the error:
bash: /bin/echo: Argument list too long
After playing around for a while, I found that the maximum was a full hex order of magnitude smaller:
/bin/echo \
  $(tr -dc '[:alnum:]' </dev/urandom | head -c $(($(getconf ARG_MAX)/16-1))) \
  >/dev/null
When the minus one is removed, the error returns. Seemingly the maximum for a single argument is actually ARG_MAX/16 and the -1 accounts for the null byte placed at the end of the string in the argument array.
Another issue is that when the argument is repeated, the total size of the argument array can be closer to ARG_MAX, but still not quite there:
args=( $(tr -dc '[:alnum:]' </dev/urandom | head -c $(($(getconf ARG_MAX)/16-1))) )
for x in {1..14}; do
  args+=( ${args[0]} )
done
/bin/echo "${args[@]}" "${args[0]:6534}" >/dev/null
Using "${args[0]:6533}" here makes the last argument 1 byte longer and gives the Argument list too long error. This difference is unlikely to be accounted for by the size of the environment given:
$ cat /proc/$$/environ | wc -c
1045
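To put a number on the shortfall, here is a rough tally of the bytes in the last command that still works. It is only a sketch: the ARG_MAX value of 2097152 is an assumption (it is what getconf ARG_MAX typically reports with an 8 MiB stack; the value is not shown above), and the helper name arg_budget_gap is made up for illustration.

```shell
# Tally the bytes used by the last working command above and print what is
# left over below ARG_MAX.
#   $1 = ARG_MAX (assumed, not stated in the post)
#   $2 = environment size in bytes (from wc -c)
#   $3 = offset used to trim the final argument
arg_budget_gap() {
    arg_max=$1
    env_size=$2
    offset=$3
    one=$(( arg_max / 16 ))                  # full argument: ARG_MAX/16-1 chars + NUL
    full=$(( 15 * one ))                     # fifteen full-length arguments
    last=$(( one - 1 - offset + 1 ))         # trimmed argument + NUL
    argv0=10                                 # "/bin/echo" + NUL
    echo $(( arg_max - full - last - argv0 - env_size ))
}

arg_budget_gap 2097152 1045 6534
```

With these inputs the tally leaves 5479 bytes unaccounted for.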
Questions:
- Is this correct behaviour, or is there a bug somewhere?
- If not, is this behaviour documented anywhere? Is there another parameter which defines the maximum for a single argument?
- Is this behaviour limited to Linux (or even particular versions of such)?
- What accounts for the additional ~5KB discrepancy between the actual maximum size of the argument array plus the approximate size of the environment and ARG_MAX?
Additional info:
$ uname -a
Linux graeme-rock 3.13-1-amd64 #1 SMP Debian 3.13.5-1 (2014-03-04) x86_64 GNU/Linux
Best Answer
Answers
The parameter which defines the maximum size for one argument is MAX_ARG_STRLEN. There is no documentation for this parameter other than the comments in binfmts.h. Those comments also show that Linux has a (very large) limit on the number of arguments to a command.
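As a quick check of how this limit relates to the numbers in the question, here is a minimal sketch. It assumes that MAX_ARG_STRLEN is defined as PAGE_SIZE * 32 (my reading of the kernel header, not something stated above) and that getconf PAGESIZE reports the kernel page size. With 4096-byte pages that product is 131072, which is also ARG_MAX/16 when ARG_MAX is 2097152, so the apparent ARG_MAX/16 limit would be a coincidence of the default stack size.

```shell
# Assumption: MAX_ARG_STRLEN is PAGE_SIZE * 32 in the running kernel.
page_size=$(getconf PAGESIZE)
max_arg_strlen=$(( page_size * 32 ))
echo "assumed MAX_ARG_STRLEN: $max_arg_strlen"
echo "ARG_MAX / 16:           $(( $(getconf ARG_MAX) / 16 ))"

# Build the largest argument expected to fit: one byte is left for the
# trailing NUL that terminates the string in the argument array.
ok=$(head -c $(( max_arg_strlen - 1 )) /dev/zero | tr '\0' a)

/bin/echo "$ok" >/dev/null 2>&1 && echo "length $(( max_arg_strlen - 1 )): accepted"
/bin/echo "${ok}a" >/dev/null 2>&1 || echo "length $max_arg_strlen: rejected"
```

If the assumption holds, the first call succeeds and the second fails with "Argument list too long", independent of how small the environment is.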
A limit on the size of a single argument (which differs from the overall limit on arguments plus environment) does appear to be specific to Linux. This article (see Further Reading) gives a detailed comparison of ARG_MAX and equivalents on Unix-like systems. MAX_ARG_STRLEN is discussed for Linux, but there is no mention of any equivalent on any other systems.
The above article also states that MAX_ARG_STRLEN was introduced in Linux 2.6.23, along with a number of other changes relating to command argument maximums (discussed below). The log/diff for the commit can be found here.
It is still not clear what accounts for the additional discrepancy between the result of getconf ARG_MAX and the actual maximum possible size of arguments plus environment. Stephane Chazelas' related answer suggests that part of the space is accounted for by pointers to each of the argument/environment strings. However, my own investigation suggests that these pointers are not created early in the execve system call when it may still return an E2BIG error to the calling process (although pointers to each argv string are certainly created later).
Also, the strings are contiguous in memory as far as I can see, so there are no memory gaps due to alignment here. Alignment is, however, very likely to be a factor within whatever does use up the extra memory. Understanding what uses the extra space requires a more detailed knowledge of how the kernel allocates memory (which is useful knowledge to have, so I will investigate and update later).
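For a sense of scale, the space that such pointers would need can be estimated with a rough sketch like the one below. It assumes 8-byte pointers (x86_64, as in the question) and one terminating NULL pointer per array; the helper name pointer_bytes is made up for illustration, and nothing here says whether the kernel actually counts these bytes against the limit.

```shell
# Bytes needed for one pointer per string plus a terminating NULL pointer.
#   $1 = number of strings, $2 = pointer size in bytes (assumed 8 on x86_64)
pointer_bytes() {
    echo $(( ($1 + 1) * $2 ))
}

# Number of strings in this shell's environment: each one ends in a NUL byte.
if [ -r "/proc/$$/environ" ]; then
    env_count=$(tr -cd '\0' <"/proc/$$/environ" | wc -c)
    echo "envp pointers: $(pointer_bytes "$env_count" 8) bytes"
fi

# argv for the command in the question: /bin/echo plus 16 arguments.
echo "argv pointers: $(pointer_bytes 17 8) bytes"
```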
ARG_MAX Confusion
Since Linux 2.6.23 (as a result of this commit), there have been changes to the way that command argument maximums are handled, which make Linux differ from other Unix-like systems. In addition to adding MAX_ARG_STRLEN and MAX_ARG_STRINGS, the result of getconf ARG_MAX now depends on the stack size and may be different from ARG_MAX in limits.h.
Normally the result of getconf ARG_MAX will be 1/4 of the stack size, which can be checked in bash using ulimit to get the stack size. However, the above behaviour was changed slightly by this commit (added in Linux 2.6.25-rc4~121).
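A minimal sketch for comparing the two values on a given system follows. It assumes that ulimit -s reports the soft stack limit in KiB (as bash does) and uses a made-up helper name, quarter_of_stack. The two numbers are expected to match only while a quarter of the stack is not below the limits.h value.

```shell
# Quarter of a stack limit, in bytes, given the limit in KiB.
quarter_of_stack() {
    echo $(( $1 * 1024 / 4 ))
}

# Soft stack limit as reported by the shell; may be the word "unlimited".
stack_kib=$(ulimit -s)

if [ "$stack_kib" = unlimited ]; then
    echo "stack size is unlimited; nothing to compare"
else
    echo "stack/4:         $(quarter_of_stack "$stack_kib")"
fi
echo "getconf ARG_MAX: $(getconf ARG_MAX)"
```

Running ulimit -s with a different value in a subshell before the comparison is an easy way to see how getconf ARG_MAX follows the stack size.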
ARG_MAX in limits.h now serves as a hard lower bound on the result of getconf ARG_MAX. If the stack size is set such that 1/4 of the stack size is less than ARG_MAX in limits.h, then the limits.h value will be used.
Note also that if the stack size is set lower than the minimum possible ARG_MAX, then the size of the stack (RLIMIT_STACK) becomes the upper limit of argument/environment size before E2BIG is returned (although getconf ARG_MAX will still show the value in limits.h).
A final thing to note is that if the kernel is built without CONFIG_MMU (support for memory management hardware), then the checking of ARG_MAX is disabled, so the limit does not apply, although MAX_ARG_STRLEN and MAX_ARG_STRINGS still apply.
Further Reading
- ARG_MAX (and equivalent) values on other Unix-like systems - http://www.in-ulm.de/~mascheck/various/argmax/
- MAX_ARG_STRLEN caused a bug with Automake, which was embedding shell scripts into Makefiles using sh -c - http://www.mail-archive.com/bug-make@gnu.org/msg05522.html