Bash scripting – read tarball from stdin

Tags: bash, fifo, pipe, stdin, tar

I have a task I need to script that I feel should be stupidly simple, but I'm actually having a rather tough time.

I have a short bash script that takes a specific type of application in tarball form and builds it. Currently it just takes two command-line arguments: the name of the application and the location of the tarball. But I'm getting ready to deploy the build script on a multi-host system, and instead I would like this script to take the application tarball on stdin, which would be piped over ssh. For instance, here is how I would like to call the script:

ssh build-host "/usr/local/bin/build-app.sh myapp" < /tmp/myapp.tar.gz

When invoked like this, the build script would attempt to build an app called "myapp".

I want the build script to perform some sanity checks before it reads in the tarball. In addition, it should exit gracefully if the sanity checks fail. So here is the gist of my first attempt at the build script:

#!/bin/bash

appname=$1

# example sanity check
if [ ! -d "/apps/$appname" ]; then
    mkdir "/apps/$appname"
else
    echo "Error! The app already has been built."
    exit 1
fi

# more sanity checks...

# if passed all tests, then read stdin into tarfile
cat > "/apps/$appname/app.tar.gz"

# build app ...

This almost works. The problem is that I can't figure out how to read from stdin halfway through a bash script. The cat > file trick only works if I place it at the very first line of the script. But I don't want to do that. At the very first line of the script, none of the sanity checks have been run and there's no way to determine where to even put the tarfile. I might have multiple instances of the script running at once, and there wouldn't be a good way to avoid collisions.

(In addition to cat > file, I've also tried tricks like < /dev/stdin > file with very similar results.)
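On the collision worry specifically: if the tarball did have to be read before the checks run, one possible workaround is to spool stdin into a uniquely named temporary file and move it into place afterwards. The sketch below assumes `mktemp` is available on the build host; the function name `spool_stdin` and the `build-app.XXXXXX` template are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: copy all of stdin into a unique temp file and
# print that file's path. mktemp picks a name no other instance is using.
spool_stdin() {
    local spool
    spool="$(mktemp "${TMPDIR:-/tmp}/build-app.XXXXXX")" || return 1

    # cat copies the bytes unmodified, so binary data survives.
    if ! cat > "$spool"; then
        rm -f "$spool"
        return 1
    fi

    printf '%s\n' "$spool"
}
```

A script could call `spool="$(spool_stdin)"` as its first step, run the sanity checks, and then either `mv "$spool" "/apps/$appname/app.tar.gz"` or `rm -f "$spool"` on failure. The trade-off is that the whole tarball is transferred even when a check would have rejected the build.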

So I also tried the following:

#!/bin/bash

tarball="$(cat)"
appname="$1"

# example sanity check
if [ ! -d "/apps/$appname" ]; then
    mkdir "/apps/$appname"
else
    echo "Error! The app already has been built."
    exit 1
fi

# more sanity checks...

# if passed all tests, then read stdin into tarfile
echo -n "$tarball" > "/apps/$appname/app.tar.gz"

# build app ...

But this was also quite problematic, for two reasons. First, it didn't work. echo, even with the -n flag, doesn't seem to be designed for binary data, and the resulting tarfile was corrupted. Second, this requires storing the entire tarball in a variable as the very first step. Not only was this slower than desired, even for the 5MB test tarball, but it could also mean high memory consumption, especially for the larger applications that I will likely be dealing with.
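The corruption most likely happens before echo is even reached: a bash variable cannot hold NUL bytes, and command substitution strips trailing newlines. A small sketch of that effect, using throwaway temp files and made-up variable names rather than anything from the real build script:

```shell
#!/bin/bash
src="$(mktemp)"
out="$(mktemp)"

# Five bytes of "binary" test data: 'a', NUL, 'b', newline, newline.
printf 'a\0b\n\n' > "$src"

# Same pattern as tarball="$(cat)". Bash drops the NUL byte (newer
# versions print a warning about it) and strips the trailing newlines.
data="$(cat "$src")"

# printf '%s' writes the variable back out without adding anything.
printf '%s' "$data" > "$out"

src_size=$(wc -c < "$src")
out_size=$(wc -c < "$out")
echo "original: $src_size bytes, after round trip: $out_size bytes"

rm -f "$src" "$out"
```

The copy comes out shorter than the original, which for a gzip stream means a broken archive no matter how it is written back to disk.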

I like the idea of reading the tarfile from stdin. This would mean that the bash script can simply close the pipe if it feels it needs to abort, and it can allow the bash script to decide where to put all the application tarballs, as opposed to the original version of invoking the build script, which looked like this:

scp /tmp/myapp.tar.gz build-host:/tmp/
ssh build-host /usr/local/bin/build-app.sh myapp /tmp/myapp.tar.gz

This method of invocation is problematic for (hopefully) obvious reasons.

I'm really hoping I can get this bash script to work by reading stdin or some other form of pipe or buffer. Are there any ideas for how to get this to work?

Best Answer

The cat > file should work no matter where in the script you call it, as long as nothing earlier in the script has read from stdin and "used up" the input stream, and stdin has not been closed, which your script doesn't do. This is a completely legal and normal command and it works, at least on my side.
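A minimal sketch of that layout, assuming the culprit in the real script is a command inside the sanity checks that inherits stdin and reads from it. `APPS_ROOT` and `build_app` are illustrative names added here so the sketch can run outside /apps; they are not part of the original script.

```shell
#!/bin/bash
# Illustrative only: APPS_ROOT defaults to the /apps path from the question.
APPS_ROOT="${APPS_ROOT:-/apps}"

build_app() {
    local appname="$1"
    local appdir="$APPS_ROOT/$appname"

    # Example sanity check, same idea as in the question.
    if [ -d "$appdir" ]; then
        echo "Error! The app already has been built." >&2
        return 1
    fi
    mkdir "$appdir" || return 1

    # Pattern for checks that run other commands: hand them /dev/null as
    # stdin so they cannot consume tarball bytes. (This particular check
    # would not read stdin anyway; the redirect just shows the pattern.)
    command -v tar > /dev/null < /dev/null || return 1

    # stdin is still untouched here, so cat receives the whole tarball.
    cat > "$appdir/app.tar.gz"
}
```

Ending the script with `build_app "$1"` lets it be invoked exactly as in the question. If a check has to call something that does read stdin, such as `read` or a nested `ssh` (which has `-n` for this purpose), redirecting that one command from /dev/null keeps the stream intact for the later `cat`.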
