Shell – Will $0 always include the path to the script

I want to grep the current script so I can print out help and version information from the comments section at the top.

I was thinking of something like this:

grep '^#h ' -- "$0" | sed -e 's/#h //'

But then I wondered what would happen if the script was located in a directory that was in PATH and called without explicitly specifying the directory.

I searched for an explanation of the special variables and found the following descriptions of $0:

  • name of the current shell or program

  • filename of the current script

  • name of the script itself

  • command as it was run

None of these make it clear whether or not the value of $0 would include the directory if the script was invoked without it. The last one actually implies to me that it wouldn't.

Testing on My System (Bash 4.1)

I created an executable file in /usr/local/bin called scriptname with one line echo $0 and invoked it from different locations.

These are my results:

> cd /usr/local/bin/test
> ../scriptname
../scriptname

> cd /usr/local/bin
> ./scriptname
./scriptname

> cd /usr/local
> bin/scriptname
bin/scriptname

> cd /tmp
> /usr/local/bin/scriptname
/usr/local/bin/scriptname

> scriptname
/usr/local/bin/scriptname

In these tests, the value of $0 is always exactly how the script was invoked, except if it is invoked without any path component. In that case, the value of $0 is the absolute path. So that looks like it would be safe to pass to another command.

But then I came across a comment on Stack Overflow that confused me. The answer suggests using $(dirname $0) to get the directory of the current script. The comment (upvoted 7 times) says "that will not work if the script is in your path".

Questions

  • Is that comment correct?
  • Is the behavior different on other systems?
  • Are there situations where $0 would not include the directory?

Best Answer

In the most common cases, $0 will contain a path to the script, either absolute or relative, so

script_path=$(readlink -e -- "$0")

(assuming there's a readlink command and it supports -e) generally is a good enough way to obtain the canonical absolute path to the script.
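
Since -e is not available in every readlink implementation, a fallback can help. What follows is only a sketch: canonical_path is a hypothetical helper name, and the fallback branch is an assumption of mine that canonicalizes the directory part only, leaving a symlink in the last component unresolved.

```shell
# canonical_path FILE: print an absolute path to FILE (hypothetical helper).
canonical_path() {
  [ -e "$1" ] || return 1
  # Preferred: readlink -e, which resolves every symlink in the path.
  if command -v readlink >/dev/null 2>&1 &&
     readlink -e -- "$1" 2>/dev/null; then
    return 0
  fi
  # Fallback (assumption: acceptable when readlink -e is unavailable):
  # canonicalize the directory with cd -P and re-append the file name.
  _dir=$(CDPATH= cd -P -- "$(dirname -- "$1")" 2>/dev/null && pwd -P) || return 1
  printf '%s/%s\n' "${_dir%/}" "$(basename -- "$1")"
}

script_path=$(canonical_path "$0") || script_path=unknown
```

If neither branch succeeds, script_path is left as unknown rather than holding a guess.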

$0 is assigned from the argument specifying the script as passed to the interpreter.

For example, in:

the-shell -shell-options the/script its args

$0 gets the/script.

When you run:

the/script its args

Your shell will do a:

exec("the/script", ["the/script", "its", "args"])

If the script contains a #! /bin/sh - she-bang for instance, the system will transform that to:

exec("/bin/sh", ["/bin/sh" or "the/script", "-", "the/script", "its", "args"])

(If it doesn't contain a she-bang, or more generally if the system returns an ENOEXEC error, then it's your shell that will do the same thing.)

There's an exception for setuid/setgid scripts on some systems, where the system will open the script on some fd x and run instead:

exec("/bin/sh", ["/bin/sh" or "the/script", "-", "/dev/fd/x", "its", "args"])

to avoid race conditions (in which case $0 will contain /dev/fd/x).

Now, you may argue that /dev/fd/x is a path to that script. Note however that if you read from $0, you'll break the script as you consume the input.
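
A script that wants to avoid reading itself through such a descriptor could test for it first. This is a minimal sketch; is_fd_path is a hypothetical name, and it only checks the shape of the string, not what the descriptor actually refers to.

```shell
# is_fd_path PATH: succeed if PATH has the form /dev/fd/N (N all digits).
is_fd_path() {
  case $1 in
    /dev/fd/*[!0-9]*|/dev/fd/) return 1;;  # non-digit after the prefix, or nothing
    /dev/fd/*) return 0;;
    *) return 1;;
  esac
}

if is_fd_path "$0"; then
  # Reading from $0 here would consume the script the shell is interpreting.
  echo "refusing to read the script through $0" >&2
fi
```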

Now, there's a difference if the script command name as invoked doesn't contain a slash. In:

the-script its args

Your shell will look up the-script in $PATH. $PATH may contain absolute or relative (including the empty string) paths to some directories. For instance, if $PATH contains /bin:/usr/bin: (where the trailing empty element stands for the current directory) and the-script is found in the current directory, the shell will do a:

exec("the-script", ["the-script", "its", "args"])

which will become:

exec("/bin/sh", ["/bin/sh" or "the-script", "-", "the-script", "its", "args"])

Or if it's found in /usr/bin:

exec("/usr/bin/the-script", ["the-script", "its", "args"])
exec("/bin/sh", ["/bin/sh" or "the-script" or "/usr/bin/the-script",
     "-", "/usr/bin/the-script", "its", "args"])

In all those cases above except the setuid corner case, $0 will contain a path (absolute or relative) to the script.
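
The non-setuid cases above can be reproduced with a throwaway script. This is a sketch that assumes mktemp is available and that the temporary directory allows execution; demo_dir and the zero_* variables are names made up for the illustration.

```shell
# Build a throwaway script that just prints its own $0.
demo_dir=$(mktemp -d)
printf '%s\n' '#! /bin/sh -' 'printf "%s\n" "$0"' > "$demo_dir/the-script"
chmod +x "$demo_dir/the-script"

# Command name contains a slash: $0 is the path as it was typed.
zero_relative=$(cd "$demo_dir" && ./the-script)
zero_absolute=$("$demo_dir/the-script")

# Bare command name: the shell looks it up in $PATH and passes the path
# it found, here built from the absolute $PATH entry added in front.
zero_lookup=$(PATH=$demo_dir:$PATH; the-script)

printf '%s\n' "$zero_relative" "$zero_absolute" "$zero_lookup"
rm -rf "$demo_dir"
```

The first line printed should be ./the-script, and the other two the absolute path under the temporary directory, which matches the test results in the question.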

Now, a script can also be called as:

the-interpreter the-script its args

When the-script as above doesn't contain slash characters, the behaviour varies slightly from shell to shell.

Old AT&T ksh implementations looked up the-script unconditionally in $PATH (which was a bug and a security hole for setuid scripts), so $0 did not contain a path to the script unless the $PATH lookup happened to find the-script in the current directory.

Newer AT&T ksh would try to interpret the-script in the current directory if it's readable. If not, it would look up a readable and executable the-script in $PATH.

bash checks whether the-script is in the current directory (and is not a broken symlink) and, if not, looks up a readable (not necessarily executable) the-script in $PATH.

zsh in sh emulation behaves like bash, except that if the-script is a broken symlink in the current directory, it does not search for the-script in $PATH and reports an error instead.

All the other Bourne-like shells don't look the-script up in $PATH.

For all those shells anyway, if you find that $0 doesn't contain a / and is not readable, then it probably has been looked up in $PATH. Then, as files in $PATH are likely to be executable, it's probably a safe approximation to use command -v -- "$0" to find its path (though that wouldn't work if $0 happens to also be the name of a shell builtin or keyword (in most shells)).
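
That approximation could be wrapped as follows. It is only a sketch: path_from_command_v is a hypothetical name, and accepting only results that start with / is a simplification of mine that rejects builtins, keywords, functions, and aliases, but also results coming from relative $PATH entries.

```shell
# path_from_command_v NAME: print the absolute path that command -v reports
# for NAME; fail if NAME resolves to something that is not an absolute path.
path_from_command_v() {
  _found=$(command -v -- "$1") || return 1
  case $_found in
    /*) printf '%s\n' "$_found";;
    *)  return 1;;   # builtin, keyword, function, alias, or relative result
  esac
}

# Only fall back to command -v when $0 has no slash and is not readable.
case $0 in
  */*) progname=$0;;
  *)   [ -r "$0" ] && progname=$0 ||
         progname=$(path_from_command_v "$0") || progname=unknown;;
esac
```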

So if you really want to cover for that case, you could write it:

progname=$0
[ -r "$progname" ] || progname=$(
    IFS=:; set -f
    for i in ${PATH-$(getconf PATH)}""; do
      case $i in
        "") p=$progname;;
        */) p=$i$progname;;
        *) p=$i/$progname
      esac
      [ -r "$p" ] && exec printf '%s\n' "$p"
    done
    exit 1
  ) && progname=$(readlink -e -- "$progname") ||
  progname=unknown

(the "" appended to $PATH is to preserve a trailing empty element with shells whose $IFS acts as delimiter instead of separator).
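
With a readable path in hand, the help text from the question could then be printed. The sketch below folds the question's grep and sed pipeline into a single sed call; print_help is a hypothetical name and the error message is made up.

```shell
# print_help FILE: print the text of every "#h " comment line in FILE.
print_help() {
  if [ ! -f "$1" ] || [ ! -r "$1" ]; then
    printf '%s\n' "cannot locate the script to read its help text" >&2
    return 1
  fi
  # Same effect as: grep '^#h ' | sed -e 's/#h //'
  sed -n 's/^#h //p' < "$1"
}

# Usage, assuming progname was computed beforehand:
#   print_help "$progname"
```

Checking for a regular readable file first means that a progname of unknown produces an error message instead of a confusing sed failure.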

Now, there are more esoteric ways to invoke a script. One could do:

the-shell < the-script

Or:

cat the-script | the-shell

In that case, $0 will be the first argument (argv[0]) that the interpreter received (the-shell in the examples above, though it could be anything; it is generally either the basename of the interpreter or a path to it).

Detecting that you're in that situation based on the value of $0 is not reliable. You could look at the output of ps -o args= -p "$$" to get a clue. In the pipe case, there's no real way you can get back to a path to the script.

One could also do:

the-shell -c '. the-script' blah blih

Then, except in zsh (and some old implementations of the Bourne shell), $0 would be blah. Again, it is hard to get to the path of the script in those shells.

Or:

the-shell -c "$(cat the-script)" blah blih

etc.
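
These invocations can be tried out quickly. The sketch below assumes that sh is a Bourne-like shell other than zsh and that mktemp is available; probe_dir and the zero_* variables are names made up for the illustration.

```shell
# A script body that only prints $0 (no she-bang needed for these cases).
probe_dir=$(mktemp -d)
printf '%s\n' 'printf "%s\n" "$0"' > "$probe_dir/the-script"

# Script fed on stdin: $0 is the interpreter's argv[0], not the script.
zero_stdin=$(sh < "$probe_dir/the-script")
zero_pipe=$(cat "$probe_dir/the-script" | sh)

# Script sourced or inlined under -c: $0 is the first argument after the code.
zero_dot=$(sh -c '. "$1"' blah "$probe_dir/the-script")
zero_inline=$(sh -c "$(cat "$probe_dir/the-script")" blah blih)

printf '%s\n' "$zero_stdin" "$zero_pipe" "$zero_dot" "$zero_inline"
rm -rf "$probe_dir"
```

In none of the four cases does $0 lead back to the-script: the first two give sh and the last two give blah.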

To make sure you have the right $progname, you could search for a specific string in it like:

progname=$0
[ -r "$progname" ] || progname=$(
    IFS=:; set -f
    for i in ${PATH-$(getconf PATH)}""; do
      case $i in
        "") p=$progname;;
        */) p=$i$progname;;
        *) p=$i/$progname
      esac
      [ -r "$p" ] && exec printf '%s\n' "$p"
    done
    exit 1
  ) && progname=$(readlink -e -- "$progname") ||
  progname=unknown

[ -f "$progname" ] && grep -q 7YQLVVD3UIUDTA32LSE8U9UOHH < "$progname" ||
  progname=unknown

But again I don't think it's worth the effort.
