Bash – How to Harden Bash Scripts Against Future Changes

bash rm shell-script

So, I deleted my home folder (or, more precisely, all files I had write access to). What happened is that I had

build="build"
...
rm -rf "${build}/"*
...
<do other things with $build>

in a bash script and, once $build was no longer needed, removed the declaration and all of its usages except the rm. Bash happily expands that to rm -rf /*. Yeah.

I felt stupid, restored the backup, and redid the work I lost. Trying to move past the shame.

Now, I wonder: what are techniques to write bash scripts so that such mistakes can't happen, or are at least less likely? For instance, had I written

FileUtils.rm_rf("#{build}/*")

in a Ruby script, the interpreter would have complained about build not being declared, so there the language protects me.

What I have considered in bash, besides corralling rm (which, as many answers in related questions mention, is not unproblematic):

rm -rf "./${build}/"*
That would have killed my current work (a Git repo) but nothing else.

A variant/parameterization of rm that requires interaction when acting outside of the current directory. (Could not find any.)
That would have a similar effect.

Is that it, or are there other ways to write bash scripts that are "robust" in this sense?

Best Answer

set -u

or

set -o nounset

This would make the current shell treat expansions of unset variables as an error:

$ unset build
$ set -u
$ rm -rf "$build"/*
bash: build: unbound variable

set -u and set -o nounset are POSIX shell options.

An empty value would not trigger an error though.

For that, use

$ rm -rf "${build:?Error, variable is empty or unset}"/*
bash: build: Error, variable is empty or unset

${variable:?word} expands to the value of variable unless it is empty or unset. If it is empty or unset, word is written to standard error and the shell treats the expansion as an error: the command is not executed, and a non-interactive shell terminates. Leaving out the : triggers the error only for an unset variable, just like set -u.
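The difference between the two forms can be checked without risking anything by expanding the variable as an argument to the no-op : command inside a subshell. This is only a sketch; the result variable names are made up for the illustration.

```shell
# Each expansion runs in a subshell, so a failed expansion only ends that
# subshell and the outcome can be recorded here.

unset build
( : "${build?}" )  2>/dev/null && unset_plain=ok || unset_plain=error
( : "${build:?}" ) 2>/dev/null && unset_colon=ok || unset_colon=error

build=""
( : "${build?}" )  2>/dev/null && empty_plain=ok || empty_plain=error
( : "${build:?}" ) 2>/dev/null && empty_colon=ok || empty_colon=error

build="build"
( : "${build?}" )  2>/dev/null && set_plain=ok || set_plain=error
( : "${build:?}" ) 2>/dev/null && set_colon=ok || set_colon=error

printf 'unset: ?=%s :?=%s\n' "$unset_plain" "$unset_colon"
printf 'empty: ?=%s :?=%s\n' "$empty_plain" "$empty_colon"
printf 'set:   ?=%s :?=%s\n' "$set_plain" "$set_colon"
# unset: ?=error :?=error
# empty: ?=ok :?=error
# set:   ?=ok :?=ok
```

A common way to use this in a script is a single line such as : "${build:?}" near the top, so that the script stops before any destructive command is reached.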

${variable:?word} is a POSIX parameter expansion.

Neither of these would cause an interactive shell to terminate unless set -e (or set -o errexit) was also in effect. In a script (a non-interactive shell), both terminate the shell: ${variable:?word} when the variable is empty or unset, and set -u when an unset variable is expanded. Neither needs set -e for that.
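The behaviour in scripts can be observed harmlessly by replacing rm with echo and running each case in its own non-interactive bash. This is a sketch under that assumption; the variable names are invented for the example. Note that set -e is not used anywhere: the bash manual documents that a non-interactive shell exits on a set -u error by itself.

```shell
# Each snippet runs in a separate non-interactive bash, as a script would.
# echo stands in for the real rm so nothing is deleted.

# set -u alone: the shell exits at the failed expansion, "after" is never printed.
nounset_status=0
nounset_out=$(bash -c 'set -u; unset build; echo "rm -rf $build/*"; echo after' 2>/dev/null) \
    || nounset_status=$?

# set -u does not notice an empty value: the dangerous command line is built.
empty_out=$(bash -c 'set -u; build=""; echo "rm -rf $build/*"' 2>/dev/null)

# ${build:?...} stops the shell for the empty value as well.
guard_status=0
guard_out=$(bash -c 'build=""; echo "rm -rf ${build:?is empty or unset}/*"; echo after' 2>/dev/null) \
    || guard_status=$?

printf 'nounset, unset value: output=[%s] exit status=%s\n' "$nounset_out" "$nounset_status"
printf 'nounset, empty value: output=[%s]\n' "$empty_out"
printf 'guard,   empty value: output=[%s] exit status=%s\n' "$guard_out" "$guard_status"
```

The first and third cases print no output and report a non-zero exit status; the second prints rm -rf /*, which is exactly the command line the question is about.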

As for your second question: there is no way to limit rm itself to not work outside of the current directory.

The GNU implementation of rm has a --one-file-system option that stops it from recursing into directories on other mounted filesystems, but that's as close as I believe we can get without wrapping the rm call in a function that actually checks the arguments.
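Such a wrapper function could look roughly like the following. This is a minimal sketch, not a hardened tool: the name safe_rm is hypothetical, it assumes GNU coreutils (realpath -m and rm --one-file-system), and it checks the arguments before deleting, so it does not protect against paths changing between the check and the rm.

```shell
# safe_rm (hypothetical name): refuse empty arguments and anything that does
# not resolve to a path strictly below the current directory.
# Assumes GNU coreutils for "realpath -m" and "rm --one-file-system".
safe_rm() {
    local base target arg
    base=$(pwd -P) || return 1
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            echo "safe_rm: refusing empty argument" >&2
            return 1
        fi
        # -m resolves symlinks and ".." even when the path does not exist.
        target=$(realpath -m -- "$arg") || return 1
        case $target in
            "$base"/*) ;;   # strictly inside the current directory
            *)
                echo "safe_rm: refusing '$arg' (not below $base)" >&2
                return 1
                ;;
        esac
    done
    # Nothing is deleted unless every argument passed the check.
    rm -rf --one-file-system -- "$@"
}
```

With this in place, safe_rm "${build}/"* with build unset expands to top-level directories such as /bin and /etc, all of which are rejected before rm is ever called.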

As a side note: ${build} is exactly equivalent to $build unless the expansion occurs as part of a string where the immediately following character is a valid character in a variable name, such as in "${build}x".

Related Solutions

Bash – measure amount of data read from /dev/random

If a little redirection is acceptable, then pv is a good way in general to achieve this type of thing, but GPG has (unsurprisingly) /dev/random hard-coded into it, so that's not going to work here without some hackery. On Linux, using unshare to temporarily overlay /dev/random is probably the least disagreeable, though it requires root permissions:

mkfifo "$HOME/rngfifo"
pv -s 300 /dev/random > "$HOME/rngfifo"

pv will block until there's a reader on the fifo. Then as root or via sudo:

unshare -m -- sh -c "mount --bind $HOME/rngfifo /dev/random && gpg --gen-key [...]"

One obvious possible useful source of data is the random device driver itself (drivers/char/random.c). It supports a "debug" parameter, but sadly in the versions I've checked it's if-defined out (#if 0, 2.6.x and 3.4.x), and has been removed completely in recent kernels in favour of ftrace support. The driver makes an ftrace call (trace_extract_entropy()) each time data is read. For this, it seems overkill to me, as does systemtap, and the other tracing and debugging options (PDF).

A simple (but unappealing to most) option is to use an injected library to wrap the relevant open() and read() calls at the libc interface, similar to the solution to this question: Dynamic file content generation: Satisfying a 'file open' by a 'process execution'. If you wrap open64() and arrange for it to cache the descriptor when /dev/random is opened, you can log the size of each read().

To help get the entropy rolling in, I highly recommend asciipacman ;-)

Bash: How to Use Multiple .bashrc Scripts

The keyword for this is source (or simply .) instead of include.

Add this to your .bashrc:

# Include more scripts
source /path/to/ide.bashrc
source /path/to/git.bashrc

or, using the portable . command:

# Include more scripts
. /path/to/ide.bashrc
. /path/to/git.bashrc

or include all from one directory:

if [ -d /path/to/includes ]; then
    for f in /path/to/includes/*.bashrc; do
        # Skip the unexpanded pattern when no file matches
        [ -r "$f" ] && . "$f"
    done
fi
