Shell – Will $0 always include the path to the script


I want to grep the current script so I can print out help and version information from the comments section at the top.

I was thinking of something like this:

grep '^#h ' -- "$0" | sed -e 's/#h //'

But then I wondered what would happen if the script was located in a directory that was in PATH and called without explicitly specifying the directory.

I searched for an explanation of the special variables and found the following descriptions of $0:

name of the current shell or program
filename of the current script
name of the script itself
command as it was run

None of these make it clear whether or not the value of $0 would include the directory if the script was invoked without it. The last one actually implies to me that it wouldn't.

Testing on My System (Bash 4.1)

I created an executable file in /usr/local/bin called scriptname with one line echo $0 and invoked it from different locations.

These are my results:

> cd /usr/local/bin/test
> ../scriptname
../scriptname

> cd /usr/local/bin
> ./scriptname
./scriptname

> cd /usr/local
> bin/scriptname
bin/scriptname

> cd /tmp
> /usr/local/bin/scriptname
/usr/local/bin/scriptname

> scriptname
/usr/local/bin/scriptname

In these tests, the value of $0 is always exactly how the script was invoked, except if it is invoked without any path component. In that case, the value of $0 is the absolute path. So that looks like it would be safe to pass to another command.
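
These results can be reproduced without installing anything under /usr/local/bin. The following is a minimal sketch, assuming a POSIX sh, mktemp -d, and a temporary directory that allows executing files; demo_dir and the by_* variables are names made up for the illustration.

```shell
#!/bin/sh
# Throwaway directory holding a one-line script that prints its own $0.
demo_dir=$(mktemp -d) || exit 1
cat > "$demo_dir/scriptname" <<'EOF'
#!/bin/sh
printf '%s\n' "$0"
EOF
chmod +x "$demo_dir/scriptname"

# Invoked with an absolute path: $0 is that path.
by_absolute=$("$demo_dir/scriptname")

# Invoked with a relative path: $0 is the relative path as typed.
by_relative=$(cd "$demo_dir" && ./scriptname)

# Invoked by bare name: the calling shell finds the file through $PATH
# and passes on the path it found (absolute here, because the $PATH
# entry is absolute).
by_path=$(PATH=$demo_dir:$PATH; scriptname)

printf '%s\n' "$by_absolute" "$by_relative" "$by_path"
rm -rf -- "$demo_dir"
```

A relative entry in $PATH would produce a relative $0 in the last case, so the absolute path seen in the test above is a property of the $PATH entry, not of $0 itself.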

But then I came across a comment on Stack Overflow that confused me. The answer suggests using $(dirname $0) to get the directory of the current script. The comment (upvoted 7 times) says "that will not work if the script is in your path".

Questions

Is that comment correct?
Is the behavior different on other systems?
Are there situations where $0 would not include the directory?

Best Answer

In the most common cases, $0 will contain a path, absolute or relative, to the script, so

script_path=$(readlink -e -- "$0")

(assuming there's a readlink command and it supports -e) generally is a good enough way to obtain the canonical absolute path to the script.
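
readlink -e is a GNU extension, so it is not available everywhere. Where only the script's directory is needed, a common fallback relies on cd -P and pwd -P alone. This is a sketch, not part of the approach above: the function name canonical_dir is made up, and unlike readlink -e it resolves symlinks in the directory part only, not a symlink that is the script file itself.

```shell
# canonical_dir FILE
# Print the physical (symlink-free) path of the directory containing FILE.
# Hypothetical helper; it runs in a subshell so the caller's current
# directory is left untouched. CDPATH is emptied so cd cannot jump
# somewhere unexpected or print the directory name.
canonical_dir() {
  (
    CDPATH= cd -P -- "$(dirname -- "$1")" && pwd -P
  )
}

# Typical use inside a script:
#   script_dir=$(canonical_dir "$0") || script_dir=unknown
```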

$0 is assigned from the argument specifying the script as passed to the interpreter.

For example, in:

the-shell -shell-options the/script its args

$0 gets the/script.

When you run:

the/script its args

Your shell will do a:

exec("the/script", ["the/script", "its", "args"])

If the script contains a #! /bin/sh - she-bang for instance, the system will transform that to:

exec("/bin/sh", ["/bin/sh" or "the/script", "-", "the/script", "its", "args"])

(if it doesn't contain a she-bang, or more generally if the system returns an ENOEXEC error, then it's your shell that will do the same thing)

There's an exception for setuid/setgid scripts on some systems, where the system will open the script on some fd x and run instead:

exec("/bin/sh", ["/bin/sh" or "the/script", "-", "/dev/fd/x", "its", "args"])

to avoid race conditions (in which case $0 will contain /dev/fd/x).

Now, you may argue that /dev/fd/x is a path to that script. Note however that if you read from $0, you'll break the script as you consume the input.

Now, there's a difference if the script command name as invoked doesn't contain a slash. In:

the-script its args

Your shell will look up the-script in $PATH. $PATH may contain absolute or relative (including the empty string) paths to some directories. For instance, if $PATH contains /bin:/usr/bin: and the-script is found in the current directory, the shell will do a:

exec("the-script", ["the-script", "its", "args"])

which will become:

exec("/bin/sh", ["/bin/sh" or "the-script", "-", "the-script", "its", "args"])

Or if it's found in /usr/bin:

exec("/usr/bin/the-script", ["the-script", "its", "args"])
exec("/bin/sh", ["/bin/sh" or "the-script" or "/usr/bin/the-script",
     "-", "/usr/bin/the-script", "its", "args"])

In all those cases above except the setuid corner case, $0 will contain a path (absolute or relative) to the script.

Now, a script can also be called as:

the-interpreter the-script its args

When the-script as above doesn't contain slash characters, the behaviour varies slightly from shell to shell.

Old AT&T ksh implementations looked up the-script unconditionally in $PATH (which was a bug and a security hole for setuid scripts), so $0 did not contain a path to the script unless the $PATH lookup happened to find the-script in the current directory.

Newer AT&T ksh tries to interpret the-script in the current directory if it's readable. If not, it looks up a readable and executable the-script in $PATH.

bash checks whether the-script is in the current directory (and is not a broken symlink) and, if not, looks up a readable (not necessarily executable) the-script in $PATH.

zsh in sh emulation behaves like bash, except that if the-script is a broken symlink in the current directory, it does not search for the-script in $PATH and reports an error instead.

All the other Bourne-like shells don't look the-script up in $PATH.

For all those shells anyway, if you find that $0 doesn't contain a / and is not readable, then it probably has been looked up in $PATH. Then, as files in $PATH are likely to be executable, it's probably a safe approximation to use command -v -- "$0" to find its path (though that wouldn't work if $0 happens to also be the name of a shell builtin or keyword (in most shells)).
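
The heuristic in the previous paragraph can be written as a small function. This is only a sketch under the assumptions stated there: the name guess_script_path is hypothetical, and the result remains a guess whenever the name also matches a builtin, keyword, function, or alias.

```shell
# guess_script_path NAME
# NAME is meant to be the value of $0. Prints a best-effort path.
guess_script_path() {
  case $1 in
    */*)
      # Contains a slash: already a path (absolute or relative).
      printf '%s\n' "$1"
      ;;
    *)
      if [ -r "$1" ]; then
        # Readable in the current directory: the interpreter most
        # likely opened it from there.
        printf '%s\n' "./$1"
      else
        # Probably found through $PATH: ask the shell where.
        command -v -- "$1"
      fi
      ;;
  esac
}

# Typical use inside a script:
#   progname=$(guess_script_path "$0") || progname=unknown
```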

So if you really want to cover for that case, you could write it:

progname=$0
[ -r "$progname" ] || progname=$(
    IFS=:; set -f
    for i in ${PATH-$(getconf PATH)}""; do
      case $i in
        "") p=$progname;;
        */) p=$i$progname;;
        *) p=$i/$progname
      esac
      [ -r "$p" ] && exec printf '%s\n' "$p"
    done
    exit 1
  ) && progname=$(readlink -e -- "$progname") ||
  progname=unknown

(the "" appended to $PATH is to preserve a trailing empty element with shells whose $IFS acts as delimiter instead of separator).
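
The effect of the appended "" can be checked by counting the fields that the split produces. This is a sketch assuming a POSIX-style shell in which $IFS characters terminate fields (dash, bash, ksh); count_path_elements and its keep argument are made up for the illustration.

```shell
# count_path_elements STRING [keep]
# Split STRING on ":" the way the loop above does and print the number
# of resulting fields. With "keep", an empty quoted string is appended
# before splitting.
count_path_elements() {
  (
    IFS=:; set -f
    if [ "${2-}" = keep ]; then
      set -- $1""
    else
      set -- $1
    fi
    printf '%s\n' "$#"
  )
}

count_path_elements /bin:/usr/bin:        # trailing empty element is dropped
count_path_elements /bin:/usr/bin: keep   # trailing empty element survives
```

The second call reports one field more than the first, and that extra empty field is the one the case statement maps to the current directory.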

Now, there are more esoteric ways to invoke a script. One could do:

the-shell < the-script

Or:

cat the-script | the-shell

In that case, $0 will be the first argument (argv[0]) that the interpreter received. Above, that is the-shell, but it could be anything; generally it is either the basename of the interpreter or one of the paths to it.

Detecting that you're in that situation based on the value of $0 is not reliable. You could look at the output of ps -o args= -p "$$" to get a clue. In the pipe case, there's no real way you can get back to a path to the script.
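
A short demonstration of the redirect and pipe cases, as a sketch: it assumes that sh on the system is a Bourne-like shell that reports argv[0] as $0 when it reads the script from standard input. The file name the-script and the variable names are placeholders.

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1
# A script that only prints its own $0.
printf '%s\n' 'printf "%s\n" "$0"' > "$tmp/the-script"

# In both cases the interpreter never sees the file name, only the
# file's contents on its standard input.
from_redirect=$(sh < "$tmp/the-script")
from_pipe=$(cat "$tmp/the-script" | sh)

printf 'redirect: %s\npipe: %s\n' "$from_redirect" "$from_pipe"
rm -rf -- "$tmp"
```

Both values are the name the interpreter was started under, which says nothing about where the script lives.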

One could also do:

the-shell -c '. the-script' blah blih

Then, except in zsh (and some old implementations of the Bourne shell), $0 would be blah. Again, it is hard to get to the path of the script in those shells.
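
The sourced case can be checked the same way. This sketch assumes sh is not zsh, and it sources ./the-script rather than the-script so that the result doesn't depend on how the shell's . builtin searches $PATH; the names are placeholders.

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1
# A script that only prints its own $0.
printf '%s\n' 'printf "%s\n" "$0"' > "$tmp/the-script"

# With -c, the first operand after the command string becomes $0,
# and sourcing a file doesn't change it in most shells.
sourced_zero=$(cd "$tmp" && sh -c '. ./the-script' blah blih)

printf '%s\n' "$sourced_zero"
rm -rf -- "$tmp"
```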

Or:

the-shell -c "$(cat the-script)" blah blih

etc.

To make sure you have the right $progname, you could search for a specific string in it like:

progname=$0
[ -r "$progname" ] || progname=$(
    IFS=:; set -f
    for i in ${PATH-$(getconf PATH)}""; do
      case $i in
        "") p=$progname;;
        */) p=$i$progname;;
        *) p=$i/$progname
      esac
      [ -r "$p" ] && exec printf '%s\n' "$p"
    done
    exit 1
  ) && progname=$(readlink -e -- "$progname") ||
  progname=unknown

[ -f "$progname" ] && grep -q 7YQLVVD3UIUDTA32LSE8U9UOHH < "$progname" ||
  progname=unknown

But again I don't think it's worth the effort.
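
Tying this back to the question: for a script that is normally run by name or by path, the simple approach is usually sufficient. The sketch below wraps it in a function so that it fails cleanly when its argument is not a readable regular file. print_help is a made-up name, and the "#h " prefix is the convention from the question.

```shell
# print_help FILE
# Print every line of FILE that starts with "#h ", minus that prefix.
# Returns non-zero (with a message on stderr) if FILE can't be read.
print_help() {
  if [ -f "$1" ] && [ -r "$1" ]; then
    sed -n 's/^#h //p' < "$1"
  else
    printf '%s\n' 'help text unavailable: cannot locate the script file' >&2
    return 1
  fi
}

# In the script itself:
#   print_help "$0"
```

Anchoring the pattern with ^ in the sed expression keeps a "#h " that appears later in a line from being stripped.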

Example

$ rpm -ql httpd| head -10
/etc/httpd
/etc/httpd/conf
/etc/httpd/conf.d
/etc/httpd/conf.d/README
/etc/httpd/conf.d/autoindex.conf
/etc/httpd/conf.d/userdir.conf
/etc/httpd/conf.d/welcome.conf
/etc/httpd/conf.modules.d
/etc/httpd/conf.modules.d/00-base.conf

I would suggest putting your executables in either /usr/bin or /usr/local/bin and rolling your own RPM. It's pretty trivial to do this and by managing your software deployment using an RPM you'll be able to label a bundle with a version number further easing the configuration management of your software as you deploy it.

Determining which RPMs are "mine"?

You can build your RPMs using some known information that could then be agreed upon prior to doing the building. I often build packages on systems that are owned by my domain so it's trivial to find RPMs by simply searching through all the RPMs that were built on host X.mydom.com.

Example

$ rpm -qi httpd
Name        : httpd
Version     : 2.4.7
Release     : 1.fc19
Architecture: x86_64
Install Date: Mon 17 Feb 2014 01:53:15 AM EST
Group       : System Environment/Daemons
Size        : 3865725
License     : ASL 2.0
Signature   : RSA/SHA256, Mon 27 Jan 2014 11:00:08 AM EST, Key ID 07477e65fb4b18e6
Source RPM  : httpd-2.4.7-1.fc19.src.rpm
Build Date  : Mon 27 Jan 2014 08:39:13 AM EST
Build Host  : buildvm-20.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : http://httpd.apache.org/
Summary     : Apache HTTP Server
Description :
The Apache HTTP Server is a powerful, efficient, and extensible
web server.

This would be the Build Host line within the RPMs.

The use of /usr/bin/company?

I would probably discourage the use of a location such as this, mainly because it is non-standard and requires all your systems to have their $PATH augmented to include it. Customizing things has always been a "rite of passage" for every wannabe Unix admin, but I always discourage it unless absolutely necessary.

The biggest issue with customizations like this is that they become a burden both in maintaining your environment and in bringing new people up to speed on how to use it.

Can I just get a list of files from RPM?

Yes, you can achieve this, but it will require 2 calls to RPM. The first builds a list of packages that were built on host X.mydom.com. With that list, you call RPM again to query the files owned by each of these packages. You can achieve this using this one-liner:

$ rpm -ql $(rpm -qa --queryformat "%-30{NAME}%{BUILDHOST}\n" | \
    grep X.mydom.com | awk '{print $1}') | head -10
/etc/pam.d/run_init
/etc/sestatus.conf
/usr/bin/secon
/usr/bin/semodule_deps
/usr/bin/semodule_expand
/usr/bin/semodule_link
/usr/bin/semodule_package
/usr/bin/semodule_unpackage
/usr/sbin/fixfiles
/usr/sbin/genhomedircon
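
Because grep X.mydom.com treats each dot as "any character" and matches anywhere on the line, the filter can pick up more than intended. The following is a sketch of a stricter filter that compares the build host field exactly. pkgs_built_on is a made-up helper, X.mydom.com is the placeholder host used above, and the rpm invocation in the comment assumes package names and build hosts contain no whitespace.

```shell
# pkgs_built_on HOST
# Read "NAME BUILDHOST" lines on standard input and print each NAME
# whose BUILDHOST is exactly HOST.
pkgs_built_on() {
  awk -v host="$1" '$2 == host { print $1 }'
}

# Usage with rpm (assumes rpm is installed):
#   rpm -qa --queryformat '%{NAME} %{BUILDHOST}\n' | pkgs_built_on X.mydom.com
```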
