Shell – How to log calls using a wrapper script when there are multiple symlinks to the executable

shell-script

Long story short: I would like to log how certain executables are called in order to track some system behaviour. Let's say that I have an executable:

/usr/bin/do_stuff

And it is actually called by a number of different names via symlink:

/usr/bin/make_tea -> /usr/bin/do_stuff
/usr/bin/make_coffee -> /usr/bin/do_stuff

and so on. Clearly, do_stuff is going to use the first argument it receives (the name it was called by) to determine what action it actually takes, and the rest of the arguments will be handled in the light of that.

I would like to record every call to /usr/bin/do_stuff (and the full list of arguments). If there were no symlinks, I would simply move do_stuff to do_stuff_real and write a script:

#!/bin/sh
echo "$0 $@" >> logfile
/usr/bin/do_stuff_real "$@"

However, as I know that it will examine the name that it is called by, this won't work. How does one write a script to achieve the same but still pass on to do_stuff the right 'executable used name'?

For the record, to avoid answers on these lines:

I know that I can do it in C (using execve), but it would be a lot easier if I could, in this case, just use a shell script.
I can't simply replace do_stuff with a logging programme.

Best Answer

You often see this with utilities like busybox, a single executable that provides most of the common Unix utilities and behaves differently depending on how it is invoked. busybox can perform a whole range of functions, acpid through zcat.

It commonly decides what it's supposed to be doing by looking at the argv[0] parameter passed to main(). That shouldn't be a simple string comparison, because argv[0] might be something like sleep, or it might be /bin/sleep, and the program should do the same thing in both cases. In other words, the path makes things more complex.
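As a rough illustration of why the path has to be stripped, here is a minimal sketch of that dispatch logic written as a shell function. The function name `dispatch` and the messages are hypothetical; `make_tea` and `make_coffee` come from the question, and a real `do_stuff` would do the equivalent inside its own `main()`.

```sh
# Hypothetical stand-in for do_stuff's dispatch: takes the invocation
# name (what would be argv[0]) and picks a behaviour from it.
dispatch() {
  # Drop any leading directories, so "make_tea" and "/usr/bin/make_tea"
  # select the same action.
  name=${1##*/}
  case $name in
    make_tea)    echo "brewing tea" ;;
    make_coffee) echo "brewing coffee" ;;
    *)           echo "unknown name: $name" >&2; return 1 ;;
  esac
}

dispatch make_tea            # bare name
dispatch /usr/bin/make_tea   # absolute path, same action
dispatch ./make_coffee       # relative path
```

Comparing the whole of argv[0] against `make_tea` would only match the first of these three calls.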

So if the worker program does this properly, your logging wrapper could execute it from something like /bin/realstuff/make_tea, and if the worker looks only at the basename of argv[0], the right function should execute.

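#!/bin/sh -
# Private scratch directory, named after this shell's PID.
myexec=/tmp/MYEXEC$$
# Name this wrapper was invoked as, e.g. make_tea.
mybase=`basename -- "$0"`

# Record the invocation name and all arguments.
echo "$0 $@" >> logfile

# mkdir fails if the directory already exists, so an existing one is never reused.
mkdir "$myexec" || exit
# Symlink carrying the invocation name, pointing at the relocated real binary.
ln -fs /usr/bin/real/do_stuff "$myexec/$mybase" || exit
"$myexec/$mybase" "$@"
# Keep the worker's exit status across the cleanup.
ret=$?
rm -rf "$myexec"
exit "$ret"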

In the example above, argv[0] should read something like /tmp/MYEXEC4321/make_tea (if 4321 was the PID of the /bin/sh that ran), which should trigger the basename make_tea behavior.

If you want argv[0] to be an exact copy of what it would be without the wrapper, you have a tougher problem, because absolute file paths begin with /. You can't make a new /bin/sleep (absent chroot, and I don't think you want to go there). As you note, you could do that with some flavor of exec(), but it wouldn't be a shell wrapper.
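One caveat worth noting, which is not part of the original answer: if the wrapper may be a bash script rather than a plain POSIX sh script, bash's exec builtin accepts `-a name`, which sets argv[0] of the program it executes. The sketch below is a throwaway demo under stated assumptions: `REAL` and `LOG` are hypothetical environment variables standing in for the relocated binary and the log file, and bash itself plays the role of the worker so that the argv[0] it receives can be printed.

```sh
# Demo harness: build a throwaway wrapper in a temp dir and run it.
tmp=$(mktemp -d)
log=$tmp/logfile

# The wrapper itself. "exec -a" is a bash extension, not POSIX sh.
cat > "$tmp/wrapper" <<'EOF'
#!/bin/bash
# Log the invocation name and arguments, then replace this process
# with the real program, handing it our own $0 as its argv[0].
printf '%s\n' "$0 $*" >> "$LOG"
exec -a "$0" "$REAL" "$@"
EOF

# Install the wrapper under one of the question's names.
ln -s wrapper "$tmp/make_tea"

# "bash -c 'echo $0'" prints the argv[0] it was started with.
out=$(REAL=$(command -v bash) LOG=$log \
      bash "$tmp/make_tea" -c 'echo "worker sees: $0"')
logged=$(cat "$log")

printf '%s\n' "$out" "$logged"
rm -rf "$tmp"
```

Note that `$0` inside the wrapper is the path the system handed to the interpreter, which may differ from the caller's original argv[0] (for instance /usr/bin/make_tea rather than make_tea), so this is closer to an exact copy but not guaranteed to be one.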

Have you considered using an alias to hit the logger and then start the base program, instead of a script wrapper? It would only catch a limited set of events, but maybe those are the only events you care about.
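The alias idea might look roughly like this. A shell function is used here instead of a literal alias, because a function can forward its arguments explicitly; `CALL_LOG` and its default path are assumptions, and `make_tea` is the name from the question. It only catches calls made from shells that have loaded this definition.

```sh
# Hypothetical log location; override by setting CALL_LOG beforehand.
CALL_LOG=${CALL_LOG:-/tmp/do_stuff_calls.log}

# Shadow the make_tea command with a function of the same name.
make_tea() {
  # Record the name and arguments, one call per line.
  printf '%s\n' "make_tea $*" >> "$CALL_LOG"
  # "command" skips function lookup, so this runs the real make_tea
  # found in $PATH instead of recursing into this function.
  command make_tea "$@"
}
```

Because the real program is still started by its own name through $PATH, it sees the same kind of argv[0] it would have seen without the logging.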

Related Solutions

Bash – automatically load shell scripts from /usr/bin

The alias you have defined will only take effect once the alias command is actually called. Thus if you were to do this:

$ source myscript
$ e "Hello"

It should work. However, this is clearly not ideal since you want e to be available whenever you start bash. Luckily, bash provides a number of ways to automatically run commands when the shell is started. For the full details of how bash starts up, see the manual page. For our purposes, it is enough to know that ~/.bashrc (that is, a file named ".bashrc" in your home directory) is run as your shell starts up. One way to make your alias available at startup would be to add this line at the end of ~/.bashrc:

source myscript

However, if you were to do this for every alias you wanted, your /usr/bin folder would likely become a mess. And, if you are on a multi-user system, filling /usr/bin/ with scripts like this may cause other users problems as well. Thus, it is better to place your aliases right inside .bashrc and forgo the separate script altogether. Since you are using Ubuntu, inside your .bashrc file you probably have something that looks like this:

if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi

This code looks for a file called .bash_aliases in your home directory and runs anything it finds in that file as well. If you have this, or if you add this code to your .bashrc, you could also put your alias in ~/.bash_aliases. This provides an easy way to keep all your aliases in one place and keep your .bashrc file uncluttered.

Shell – Will $0 always include the path to the script

In the most common cases, $0 will contain a path, absolute or relative, to the script, so

script_path=$(readlink -e -- "$0")

(assuming there's a readlink command and it supports -e) generally is a good enough way to obtain the canonical absolute path to the script.
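To see what that buys you, here is a small self-contained sketch, assuming GNU readlink (for `-e`) and a temp directory that allows executing files. The directory layout (`real/`, `bin/`) is made up for the demo. The script is reached through a symlink, and `readlink -e` reports where the file actually lives.

```sh
tmp=$(mktemp -d)
mkdir "$tmp/real" "$tmp/bin"

# A script that prints its own canonical location.
cat > "$tmp/real/the-script" <<'EOF'
#!/bin/sh -
readlink -e -- "$0"
EOF
chmod +x "$tmp/real/the-script"

# Reach it through a symlink in another directory.
ln -s ../real/the-script "$tmp/bin/the-script"

reported=$("$tmp/bin/the-script")
expected=$(readlink -e -- "$tmp/real/the-script")

printf '%s\n' "$reported"
rm -rf "$tmp"
```

Here `$0` is the symlink path under `bin/`, while the printed path ends in `real/the-script`.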

$0 is assigned from the argument specifying the script as passed to the interpreter.

For example, in:

the-shell -shell-options the/script its args

$0 gets the/script.

When you run:

the/script its args

Your shell will do a:

exec("the/script", ["the/script", "its", "args"])

If the script contains a #! /bin/sh - she-bang for instance, the system will transform that to:

exec("/bin/sh", ["/bin/sh" or "the/script", "-", "the/script", "its", "args"])

(if it doesn't contain a she-bang, or more generally if the system returns an ENOEXEC error, then it's your shell that will do the same thing)

There's an exception for setuid/setgid scripts on some systems, where the system will open the script on some fd x and run instead:

exec("/bin/sh", ["/bin/sh" or "the/script", "-", "/dev/fd/x", "its", "args"])

to avoid race conditions (in which case $0 will contain /dev/fd/x).

Now, you may argue that /dev/fd/x is a path to that script. Note however that if you read from $0, you'll break the script as you consume the input.

Now, there's a difference if the script command name as invoked doesn't contain a slash. In:

the-script its args

Your shell will look up the-script in $PATH. $PATH may contain absolute or relative (including the empty string) paths to some directories. For instance, if $PATH contains /bin:/usr/bin: and the-script is found in the current directory, the shell will do a:

exec("the-script", ["the-script", "its", "args"])

which will become:

exec("/bin/sh", ["/bin/sh" or "the-script", "-", "the-script", "its", "args"])

Or if it's found in /usr/bin:

exec("/usr/bin/the-script", ["the-script", "its", "args"])
exec("/bin/sh", ["/bin/sh" or "the-script" or "/usr/bin/the-script",
     "-", "/usr/bin/the-script", "its", "args"])

In all those cases above except the setuid corner case, $0 will contain a path (absolute or relative) to the script.
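The two ordinary cases can be checked with a short experiment. This is only a sketch: the temp directory is created for the demo, it assumes files there may be executed, and the calling shell is whatever runs the snippet.

```sh
tmp=$(mktemp -d)

# A script that prints its own $0.
cat > "$tmp/the-script" <<'EOF'
#!/bin/sh -
printf '%s\n' "$0"
EOF
chmod +x "$tmp/the-script"

# Command name contains a slash: $0 is that path.
by_path=$("$tmp/the-script")

# Bare command name: the calling shell resolves it through $PATH and
# executes the path it found, so $0 still names the file.
by_name=$(PATH=$tmp:$PATH; the-script)

printf '%s\n' "$by_path" "$by_name"
rm -rf "$tmp"
```

Both lines of output should be the full path of the-script inside the temp directory.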

Now, a script can also be called as:

the-interpreter the-script its args

When the-script as above doesn't contain slash characters, the behaviour varies slightly from shell to shell.

Old AT&T ksh implementations were actually looking up the-script unconditionally in $PATH (which was actually a bug and a security hole for setuid scripts), so $0 actually did not contain a path to the script unless the $PATH lookup actually happened to find the-script in the current directory.

Newer AT&T ksh would try to interpret the-script in the current directory if it's readable. If not, it would look up a readable and executable the-script in $PATH.

bash checks whether the-script is in the current directory (and is not a broken symlink) and, if not, looks up a readable (not necessarily executable) the-script in $PATH.

zsh in sh emulation behaves like bash, except that if the-script is a broken symlink in the current directory, it does not search for a the-script in $PATH and instead reports an error.

All the other Bourne-like shells don't look the-script up in $PATH.

For all those shells anyway, if you find that $0 doesn't contain a / and is not readable, then it probably has been looked up in $PATH. Then, as files in $PATH are likely to be executable, it's probably a safe approximation to use command -v -- "$0" to find its path (though that wouldn't work if $0 happens to also be the name of a shell builtin or keyword (in most shells)).

So if you really want to cover for that case, you could write it:

progname=$0
[ -r "$progname" ] || progname=$(
    IFS=:; set -f
    for i in ${PATH-$(getconf PATH)}""; do
      case $i in
        "") p=$progname;;
        */) p=$i$progname;;
        *) p=$i/$progname
      esac
      [ -r "$p" ] && exec printf '%s\n' "$p"
    done
    exit 1
  ) && progname=$(readlink -e -- "$progname") ||
  progname=unknown

(the "" appended to $PATH is to preserve a trailing empty element with shells whose $IFS acts as delimiter instead of separator).
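The effect of that appended "" can be observed directly by counting fields. A minimal sketch; the sample value and the helper name `count_fields` are made up for the demo.

```sh
# A PATH-like value whose trailing ":" means "also the current directory".
sample=/bin:/usr/bin:

# Print how many arguments were received.
count_fields() { echo "$#"; }

# Split on ":" with globbing off, as the loop above does.
plain=$(IFS=:; set -f; count_fields $sample)
padded=$(IFS=:; set -f; count_fields $sample"")

echo "without padding: $plain, with padding: $padded"
```

In shells that treat IFS characters as field terminators (bash and dash, as far as I know), the unpadded form yields 2 fields and silently drops the trailing empty element, while the padded form yields 3.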

Now, there are more esoteric ways to invoke a script. One could do:

the-shell < the-script

Or:

cat the-script | the-shell

In that case, $0 will be the first argument (argv[0]) that the interpreter received (the-shell in the examples above, though it could be anything; generally it is either the basename of the interpreter or some path to it).

Detecting that you're in that situation based on the value of $0 is not reliable. You could look at the output of ps -o args= -p "$$" to get a clue. In the pipe case, there's no real way you can get back to a path to the script.
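A quick way to see the stdin cases for yourself, assuming `sh` is on $PATH (the temp file is created just for the demo):

```sh
tmp=$(mktemp -d)

# A one-line script that prints its $0.
printf '%s\n' 'printf "%s\n" "$0"' > "$tmp/the-script"

# Script fed on standard input: the shell never sees a file name.
via_stdin=$(sh < "$tmp/the-script")
via_pipe=$(cat "$tmp/the-script" | sh)

printf '%s\n' "$via_stdin" "$via_pipe"
rm -rf "$tmp"
```

In both cases $0 is simply the name the interpreter was started as, here `sh`, and carries no information about the script file.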

One could also do:

the-shell -c '. the-script' blah blih

Then, except in zsh (and some old implementations of the Bourne shell), $0 would be blah. Again, it's hard to get to the path of the script in those shells.

Or:

the-shell -c "$(cat the-script)" blah blih

etc.
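Both `-c` forms can be tried out with a sketch like the following, assuming a POSIX-style `sh` (not zsh); the temp file is created just for the demo, and `./the-script` is used so that the `.` command does not have to search $PATH.

```sh
tmp=$(mktemp -d)

# A one-line script that prints $0 and $1.
printf '%s\n' 'printf "%s\n" "$0 $1"' > "$tmp/the-script"

# Sourced from a -c command string: $0 is the first operand after it.
sourced=$(cd "$tmp" && sh -c '. ./the-script' blah blih)

# Script text passed as the command string itself: same result.
inlined=$(sh -c "$(cat "$tmp/the-script")" blah blih)

printf '%s\n' "$sourced" "$inlined"
rm -rf "$tmp"
```

Each invocation prints `blah blih`: $0 is whatever the caller chose to pass, not anything related to the script's location.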

To make sure you have the right $progname, you could search for a specific string in it like:

progname=$0
[ -r "$progname" ] || progname=$(
    IFS=:; set -f
    for i in ${PATH-$(getconf PATH)}""; do
      case $i in
        "") p=$progname;;
        */) p=$i$progname;;
        *) p=$i/$progname
      esac
      [ -r "$p" ] && exec printf '%s\n' "$p"
    done
    exit 1
  ) && progname=$(readlink -e -- "$progname") ||
  progname=unknown

[ -f "$progname" ] && grep -q 7YQLVVD3UIUDTA32LSE8U9UOHH < "$progname" ||
  progname=unknown

But again I don't think it's worth the effort.
