You can't, portably, put more than one argument on a #! line. That means only a full path and one argument (e.g. #!/bin/sed -f or #!/usr/bin/sed -f), or #!/usr/bin/env and no argument to the interpreter.
A workaround to get a portable script is to use #!/bin/sh and a shell wrapper, passing the sed script as a command-line argument. Note that this is not sanctioned by POSIX (multi-instruction scripts must be written with a separate -e argument for each instruction for portability), but it works with many implementations.
#!/bin/sh
exec sed '
s/a/b/
' "$@"
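If the POSIX restriction mentioned above matters for your target systems, the same wrapper can pass each instruction through its own -e option. This is only a minimal sketch: the function name sed_filter and the second substitution s/c/d/ are illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# Hypothetical wrapper: one -e option per sed instruction, so that no
# single argument contains an embedded newline.
sed_filter() {
    sed -e 's/a/b/' -e 's/c/d/' "$@"
}
```

In a standalone script you would drop the function and write exec sed -e ... "$@" directly, as in the wrapper above. With the function defined, printf 'ac\n' | sed_filter prints bd.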
For a long script, it may be more convenient to use a heredoc. An advantage of a heredoc is that you don't need to quote the single quotes inside, if any. A major downside is that the script is fed to sed on its standard input, with two annoying consequences. Some versions of sed require -f /dev/stdin instead of -f -, which is a problem for portability. Worse, the script can't act as a filter, because the standard input is the script and can't be the data.
#!/bin/sh
exec sed -f - -- "$@" <<'EOF'
s/a/b/
EOF
The downside of the heredoc can be remedied by a useful use of cat. Since this puts the whole script on the command line again, it's not POSIX-compliant, but it is largely portable in practice.
#!/bin/sh
exec sed "$(cat <<'EOF')" -- "$@"
s/a/b/
EOF
Another workaround is to write a script that can be parsed both by sh and by sed. This is portable, reasonably efficient, just a little ugly.
#! /bin/sh
b ()
{
x
}
i\
f true; then exec sed -f "$0" "$@"; fi
: ()
# sed script starts here
s/a/b/
Explanations:
- Under sh: define a function called b; the contents don't matter as long as the function is syntactically well-formed (in particular, you can't have an empty function). The backslash at the end of the i\ line is a line continuation, so sh joins i with the f on the next line and reads if true; then ...; fi. Since the condition is always true, sh executes sed on the script.
- Under sed: branch to the () label, then some well-formed input. Then an i command, which has no effect because it's always skipped. Finally the () label followed by the useful part of the script.
- Tested under GNU sed, BusyBox and OpenBSD. (You can get away with something simpler on GNU sed, but OpenBSD sed is picky about the parts it skips.)
Objective Criteria/Requirements:
In determining whether to use an absolute or logical (/usr/bin/env) path to an interpreter in a she-bang, there are two key considerations:
a) The interpreter can be found on the target system
b) The correct version of the interpreter can be found on the target system
If we AGREE that "b)" is desirable, we also agree that:
c) It's preferable that our scripts fail rather than execute using an incorrect interpreter version and potentially produce inconsistent results.
If we DON'T AGREE that "b)" matters, then any interpreter found will suffice.
Testing:
Since using a logical path (/usr/bin/env) to the interpreter in the she-bang is the most extensible solution, allowing the same script to execute successfully on target hosts with different paths to the same interpreter, we'll test it, using Python due to its popularity, to see if it meets our criteria.
- Does /usr/bin/env live in a predictable, consistent location on POPULAR (not "every") Operating Systems? Yes:
- RHEL 7.5
- Ubuntu 18.04
- Raspbian 10 ("Buster")
- OSX 10.15.02
- The Python script below was executed both inside and outside of virtual environments (Pipenv used) during tests:
#!/usr/bin/env pythonX.x
import sys
print(sys.version)
print('Hello, world!')
- The she-bang in the script was toggled by Python version number desired (all installed on same host):
- #!/usr/bin/env python2
- #!/usr/bin/env python2.7
- #!/usr/bin/env python3
- #!/usr/bin/env python3.5
- #!/usr/bin/env python3.6
- #!/usr/bin/env python3.7
Expected results: the version printed by print(sys.version) matches the pythonX.x named in the she-bang. Each time ./test1.py was executed with a different installed Python version in the she-bang, the version specified there was printed.
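The manual toggling described above could be automated along the following lines. This is a sketch under stated assumptions, not the procedure actually used in the tests: run_with_env and report_versions are hypothetical helper names, mktemp is assumed to be available (it is common but not in POSIX), and the temporary directory must allow executing files.

```shell
#!/bin/sh
# Hypothetical helper: read a script body on stdin, prepend an env
# she-bang naming INTERP, execute the result, then clean up.
run_with_env() {
    interp=$1
    tmp=$(mktemp) || return 1
    { printf '#!/usr/bin/env %s\n' "$interp"; cat; } > "$tmp"
    chmod +x "$tmp"
    "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Hypothetical helper: for each interpreter name given, either print
# the version it reports or note that it is not on PATH.
report_versions() {
    for v in "$@"; do
        if command -v "$v" >/dev/null 2>&1; then
            printf '%s: ' "$v"
            printf 'import sys\nprint(sys.version)\n' | run_with_env "$v"
        else
            printf '%s: not found on PATH\n' "$v"
        fi
    done
}
```

Calling report_versions python2 python2.7 python3 python3.5 python3.6 python3.7 would then cover the she-bang variants listed above in one run; what it prints depends on which versions are installed on the host.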
Testing Notes:
- Tests were exclusively limited to Python
- Perl, like Python, MUST live in /usr/bin according to the FHS
- I've not tried every possible combination of Linuxy/Unixy Operating Systems and versions of each.
Conclusion:
Although it's TRUE that #!/usr/bin/env python will use the first version of Python it finds in the user's PATH, we can moderate this behaviour by specifying a version number, such as #!/usr/bin/env pythonX.x. Indeed, developers don't care which interpreter is found "first"; all they care about is that their code is executed by the specific interpreter they know to be compatible with it, wherever that may live in the filesystem, so that results are consistent.
In terms of portability/flexibility, using a logical path (/usr/bin/env) rather than an absolute path not only meets requirements a), b) & c) in my testing with different versions of Python, but also has the benefit of finding the same interpreter version through a PATH search even if it lives at different paths on different Operating Systems. And although MOST distros respect the FHS, not all do.
So where a script will FAIL if the binary lives at a different absolute path than the one specified in the shebang, the same script using a logical path SUCCEEDS, because env searches each directory in PATH until it finds a match, thereby offering greater reliability & extensibility across platforms.
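The lookup that /usr/bin/env relies on can be pictured as a walk over the directories in PATH that stops at the first executable with the requested name. The sketch below imitates that walk in plain sh. first_in_path is a hypothetical helper, and real env implementations delegate the search to the C library, so edge cases may differ.

```shell
#!/bin/sh
# Hypothetical helper: print the first executable called NAME found in
# a colon-separated search path (defaults to $PATH); fail if none.
first_in_path() {
    name=$1
    search=${2:-$PATH}
    old_ifs=$IFS
    IFS=:
    # $search is split on ":" once, when the loop starts.
    for dir in $search; do
        IFS=$old_ifs
        # An empty entry conventionally means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For example, first_in_path python3.7 prints the path of the interpreter that a #!/usr/bin/env python3.7 she-bang would be expected to pick up, and returns a non-zero status when no directory in PATH contains one.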
Best Answer
Shebang wasn't meant to be that flexible. There may be some cases where having a second parameter works; I think FreeBSD is one of them.
gawk and most utilities that come with the OS are expected to be in /usr/bin/.
In the older UNIX days, it was common to have /usr/ mounted over NFS or some less expensive media to save local disk space and cost per workstation. /bin/ was supposed to have everything needed to boot in single user mode. Since /usr/ wasn't mounted on a reliable media, /bin/ included enough utilities to make it friendly enough for general administration and troubleshooting.
This was inherited in Linux initially, but as disk space is no longer an issue and in most cases /usr/ is in the root filesystem, the current trend is to move everything into /usr/bin (at least in the Linux world). So most utilities installed by a distro are expected to be found there. Even the most basic utilities, like cp, rm, ls etc (well, not yet).
Regarding the shebang choice: traditionally, this is something the admins or users have to edit according to their environment. For all a developer knows, in other people's systems, the interpreter could be anywhere in the filesystem (e.g. /usr/local/bin, /opt/gawk-4.0.1/bin). Properly packaged scripts (rpm, deb etc) come with either a dependency on a distro package (i.e. the interpreter has a known location) or a config script that sets up the proper hashbang during installation.
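An install-time step of the kind described above might rewrite the first line of a script once the interpreter's location is known. The following is a minimal sketch, assuming mktemp is available. set_shebang is a hypothetical helper, not something rpm or deb tooling provides.

```shell
#!/bin/sh
# Hypothetical install helper: make "#!INTERP_LINE" the first line of
# SCRIPT. INTERP_LINE is the interpreter path plus at most one
# argument, e.g. "/usr/local/bin/gawk -f".
set_shebang() {
    script=$1
    interp_line=$2
    tmp=$(mktemp) || return 1
    {
        printf '#!%s\n' "$interp_line"
        # Drop the old shebang line, if any; keep the rest unchanged.
        sed '1{/^#!/d;}' "$script"
    } > "$tmp" || { rm -f "$tmp"; return 1; }
    # Overwrite in place so the file keeps its permissions.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

An installer could call it as set_shebang ./myscript "/opt/gawk-4.0.1/bin/gawk -f" after locating gawk, where ./myscript is a placeholder name.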