Shell Script – Waiting for Network in Bash Script

Tags: networking, shell-script, test

I'm running a script that relies on the network being up and a network share being mounted. The script runs on login (which happens automatically after boot). The problem is that by the time the script runs, I usually do not have an IP address yet (DHCP). At the moment I just sleep the script for 15s, but I don't like this approach at all, since I want to be able to tell the user if something is wrong.

My plan is to loop while I don't have an IP address yet and continue once I do. Crucially, it has to time out after a while. What I came up with is if [ ifconfig | grep "192.168.100" ]; but what happens is that grep consumes the ]; and doesn't like it. Then bash also gets upset, because it can't find the ]; which grep ate. And I haven't even implemented the time-out yet.

Someone suggested keeping a variable, sleeping for, say, a second in each iteration, and incrementing the variable each time. Here is my complete (non-working) script (I'm fairly new to bash scripting):

x=0
while [ ifconfig | grep "192.168.100." > /dev/null ]; do
    echo "no nework"
    if "$x" -gt 200; then
        #Time out here
        exit 1
    x=$((x+1))
    sleep .1
    fi
done

#continue with rest of script...

Any pointers in the right direction would be greatly appreciated!

Best Answer

Shell syntax

You seem to be confused regarding conditionals in shell scripts. Every shell command has an exit status, which is an integer between 0 and 255, with 0 meaning success and any other value meaning failure. Statements like if and while that expect boolean operands inspect the exit status of the command and treat 0 (success) as true and any other value (failure) as false.
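To make this concrete, here is a minimal sketch using only standard commands. The helper name describe is made up for illustration; it runs whatever command it is given and reports how if interprets the exit status.

```shell
#!/bin/bash
# describe is a hypothetical helper: it runs the given command and reports
# how an if statement interprets that command's exit status.
describe() {
    if "$@" >/dev/null 2>&1; then
        echo "success (exit status 0)"
    else
        # In the else branch, $? still holds the status of the tested command.
        echo "failure (exit status $?)"
    fi
}

describe true     # true always succeeds
describe false    # false always fails with status 1
# grep signals an error (here: a missing file) with a status greater than 1.
describe grep -q root /nonexistent/file
```

No brackets are involved anywhere: if simply runs the command and looks at its status.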

For example, the grep command returns 0 if the pattern is found and 1 if the pattern is not found. So

while ifconfig | grep "192.168.100." > /dev/null; do …

repeats the loop as long as the pattern 192.168.100. is found in the output of ifconfig. Note that the pattern 192.168.100. matches strings like 192x168 1007, because . in a regular expression matches any character; to search for a literal string, pass the option -F to grep. To invert the condition, put ! in front.

while ! ifconfig | grep -F "192.168.100." > /dev/null; do …
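The difference between the regular expression and the literal match can be checked without touching the network. This is a small sketch; the function names regex_match and fixed_match are hypothetical and only wrap the two grep invocations discussed above.

```shell
#!/bin/bash
# Both helpers read one string and return grep's exit status.
regex_match() { printf '%s\n' "$1" | grep -q  "192.168.100."; }   # . matches any character
fixed_match() { printf '%s\n' "$1" | grep -qF "192.168.100."; }   # -F: literal string

sample='192x168 1007'

if regex_match "$sample"; then echo "regex: match"; else echo "regex: no match"; fi
if fixed_match "$sample"; then echo "fixed: match"; else echo "fixed: no match"; fi

# A real address line matches either way.
if fixed_match 'inet 192.168.100.7'; then echo "fixed: match on a real address"; fi
```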

Further in the script, you want to compare the value of a variable to a number. You use the -gt operator, which is part of the syntax of conditional expressions understood by the test command. The test command returns 0 if the conditional expression is true and 1 if the conditional expression is false.

if test "$x" -gt 200; then

It is customary to use the alternate name [ for the test command. This name expects the command to end with the parameter ]. The two ways of writing this command are exactly equivalent.

if [ "$x" -gt 200 ]; then

Bash also offers a third way to write this command, with the special syntax [[ … ]]. This special syntax can support a few more operators than [, because [ is an ordinary command subject to the usual parsing rules, while [[ … ]] is part of the shell syntax.

Again, keep in mind that [ is for conditional expressions, which are a syntax with operators like -n, -gt, … [ doesn't mean “boolean value”: any command has a boolean value (exit status = 0?).
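The three spellings can be compared side by side. This is only a sketch: the values of x and addr are made up, and in_range is a hypothetical helper showing pattern matching, one of the extra operators that [[ … ]] supports in bash.

```shell
#!/bin/bash
x=201

# Three ways to write the same numeric comparison.
test "$x" -gt 200  && echo "test: x is greater than 200"
[ "$x" -gt 200 ]   && echo "[: x is greater than 200"
[[ $x -gt 200 ]]   && echo "[[: x is greater than 200"

# Inside [[ … ]], == with an unquoted right-hand side does glob-style matching.
in_range() { [[ $1 == 192.168.100.* ]]; }

addr=192.168.100.7
in_range "$addr" && echo "$addr is in the expected range"
```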

Detecting that the network is up

Your way of detecting that the network is up is not robust. Your loop will end as soon as any network interface acquires an IP address within the specified range. It's quite possible that DNS won't be up yet at that point, let alone that any network shares will be mounted.

Do you really need to run these commands when someone logs in? It's easier to make a command run automatically when the network is brought up. The way to do that depends on your distribution and whether you use NetworkManager.

If you need to run these commands as part of the login scripts, then test for the resource that you really need, not for the presence of an IP address. For example, if you want to test whether /net/somenode/somedir is mounted, use

while ! grep -q /net/somenode/somedir </proc/mounts; do
  sleep 1
done
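Since the question asks for a time-out, the same loop can be bounded. The following is a sketch rather than a drop-in solution: the function name wait_for_mount, its argument order, and the 30-second default are all assumptions. The third argument exists only so the function can be tried against a file other than /proc/mounts.

```shell
#!/bin/bash
# wait_for_mount MOUNT_POINT [TIMEOUT_SECONDS] [MOUNTS_FILE]
# Hypothetical helper: polls the mount table once per second and returns 0
# when the mount point appears, or 1 once the time-out has elapsed.
wait_for_mount() {
    local mount_point=$1 timeout=${2:-30} mounts=${3:-/proc/mounts}
    local waited=0
    # Fields in the mount table are separated by spaces, so surrounding the
    # path with spaces matches the mount point field exactly.
    until grep -qF " $mount_point " "$mounts"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Example use (commented out so the sketch has no side effects):
# wait_for_mount /net/somenode/somedir 30 || {
#     echo "timed out waiting for /net/somenode/somedir" >&2
#     exit 1
# }
```

Returning a status instead of calling exit inside the function leaves it up to the caller to decide how to tell the user that something is wrong.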

If you have upstart or systemd…

then you can use it. For example, with Upstart, mark your job as start on net-device-up eth0 (replace eth0 with the name of the interface that provides the desired network connectivity). With systemd, see Cause a script to execute after networking has started?

Related Solutions

Bash Script Execution – Fixing Stuck Bash Scripts

Here's the short and sweet - a heredoc is basically a file streamed to a file-descriptor.

Most people don't denote the 0<<descriptor and so you get it on <&0 stdin. ssh passes stdin to its invoked process so if you feed it a heredoc it will pass through the input to the invoked remote shell.

The one very special quality about heredocs is the difference between a \"'quoted and unquoted heredoc LIMITER. So, <<'THIS' differs from <<THIS. When you do not quote the LIMITER, the contents of the here-document are evaluated for ${shell:+expansion}. Once one ${shell:+expansion} pass is completed, there is very little else to distinguish a here-document from any other file fed as <~/input.

For example:

cat <<\QUOTED >~/file
    $(echo "This is ${NOT:-} expanded.")
#END
QUOTED
cat <~/file
> $(echo "This is ${NOT:-} expanded.")
> #END

But...

cat <<UNQUOTED >~/file
    $(echo "This is ${NOT:-} expanded.")
#END
UNQUOTED
cat <~/file
> This is  expanded.
> #END

You keep using the bash <<< herestring with cat. I don't know exactly how the herestring works but I'm willing to bet cat's already involved. So cat concatenates its <&0 stdin with its stdout >&1. That's all it does. So you're unnecessarily complicating <<STDIN when you <<< cat it.

This can be a real problem if cat winds up consuming an input stream which you did not intend it to consume. Run just % cat at your terminal and it will look like nothing is happening because a terminal's stdin and stdout are the same file - your $(tty). But when they differ, cat combines them anyway and that can get pretty messy if you didn't mean it to happen.

It looks to me like some \'quotes are skipping an expansion when $(date) is $expanded. Then possibly the : null shell builtin is invoked and |piped to the next unquoted command after ssh which would be cat >> report.fail which should generate nothing at all in that file. So cat is >>appending /dev/null to report.fail, for as long as it can stand it, I guess. Or, more likely, for as long as exit allows ssh to carry on proxying the null stream.

Also, have you checked to see if you've got a literal $TMP in your current working directory? I do see ENDSSH at the bottom which looks like a heredoc LIMITER to me so I believe either this is not the entire script or it has been edited by mistake. It would make sense if it were the body of a heredoc to use \$TMP, but as is I think nc will first >truncate then write its stdout to a file named $TMP. Then again, I guess you rm it anyway, so maybe you just didn't notice.

And because you're rming $TMP you may not have realized anyone would be asking this question:

How does $(mktemp) work for you without the filename.xxx argument?

UPDATE I looked closer at your output and definitely >/tmp/tmp.GJ1knZF5Jn means $(mktemp) is working - even the \$TMP part. So you've just taught me that I only need to specify mktemp .xxx if I specify a filename at all. Thank you.

Still I think there is a heredoc at the top somewhere? It could be there is not and \\ is only an attempt to deal with a side-effect of the echo \$TMP <<<herestring, but I dunno... Interesting.

I don't know if I've got this completely right because I don't know where all of these variables come from. But, this is close to how I would do this:

(which actually renders the last two questions irrelevant anyway)

_ssh() ( ssh "$1"@"$2" 'printf %s, `cat` >> '"$3"
) <<-PARAMS
    "$lastSourceIP"
    "$lastDestinationIP"
    "$sourceFqdn"
    "$fqdn"
    "$port"
    "ConnectivityNA"
    "NA"
    "NA"
    "Pas"
    "$(date)"
PARAMS

# nc -v usually reports on stderr, so merge it into the pipe for grep
nc -z -v -n "$lastDestinationIP" "$port" 2>&1 |\
    grep -q "succeeded" && suffix=txt
_ssh user host report.${suffix:-fail}
unset suffix
?ENDSSH?

Note: the "$quotes" in the above are for printf's benefit on the other side of the ssh process - not for anything else. Those "quotes" remain as is even after all of the above PARAMS are evaluated.

There are some things going on up there that I've covered before. For instance the func() ( scope ) I like to think was covered ok here. The ${parameter:-expansion} was also covered there, but also demonstrated pretty well here and here. I've gotten into some of the weird heredoc stuff here, here, and here. Probably there are others - I guess I like messing up my shell or something.

In this case, though, using the function and the heredoc as I have, cat can't get stuck. PARAMS is sent over as stdin and cat will quit when it reaches an EOF (or CTRL-D), so as soon as it consumes PARAMS it's going to stop every time. This is especially important if you are running this in a heredoc which is also on <&0, because PARAMS will stand in the way of cat eating your script mid-execution.
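The same behaviour can be seen locally, without ssh. In this sketch show_params is a hypothetical stand-in for the _ssh function above, and first and second are made-up variables. The here-document is attached to the function, so the cat inside it reads PARAMS and stops at its end, while the caller's stdin is left for whoever comes next.

```shell
#!/bin/bash
first=alpha
second=beta

# show_params is a hypothetical, local stand-in for _ssh: the here-document
# is a redirection on the function itself and becomes its stdin on each call.
show_params() (
    printf '%s,' $(cat)   # unquoted on purpose: one argument per word
    echo
) <<PARAMS
"$first"
"$second"
PARAMS

# The piped-in line is not eaten by show_params; the second cat still gets it.
printf 'untouched\n' | { show_params; cat; }
# prints:
#   "alpha","beta",
#   untouched
```

The double quotes come through literally, as the note above says, because here-document contents undergo expansion but not quote removal.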

Anyway, hopefully that helps. If I've missed anything, don't hesitate to ask.

Bash – Terminate BASH For loop, that contains long sleep and other stuff, using any specific key

I haven't dealt with this kind of task for a while, but I remember something like this used to work:

#!/bin/bash

trap break INT

for (( c=0; c<=$1; c++ ))
do  

# SOME STUFF HERE
#  data gathering into arrays and other commands here etc

    echo loop "$c" before sleep

    # sleep potentially for a long time
    sleep "$2"

    echo loop "$c" after sleep

done

#WRITE REPORT OUT TO SCREEN AND FILE HERE

echo outside

The idea is to use Ctrl-C to break the loop. This signal (SIGINT) is caught by the trap, which breaks the loop and lets the rest of the script follow.
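A variation on the same idea, in case calling break from a trap does not behave the same way in your shell: have the trap set a flag and let the loop condition check it. This is a sketch, not the script above. The names run_loop, stop and interrupt_at are made up, and the kill line exists only so the example can interrupt itself instead of waiting for a real Ctrl-C.

```shell
#!/bin/bash
# run_loop COUNT [INTERRUPT_AT]
# Hypothetical stand-in for the data-gathering loop. INTERRUPT_AT exists only
# so that the sketch can send itself SIGINT at a known iteration.
run_loop() {
    local count=$1 interrupt_at=${2:--1} c
    stop=0
    trap 'stop=1' INT              # record the interrupt instead of dying
    for (( c = 0; c < count && !stop; c++ )); do
        echo "loop $c"
        if (( c == interrupt_at )); then
            kill -INT "$BASHPID"   # simulate Ctrl-C
        fi
    done
    trap - INT                     # restore the default SIGINT behaviour
    echo outside
}

run_loop 5 1
# prints:
#   loop 0
#   loop 1
#   outside
```

With a flag, the loop ends at the next check of its condition rather than immediately, so anything left in the current iteration still runs before the report is written.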

Example:

$ ./s 3 1
loop 0 before sleep
loop 0 after sleep
loop 1 before sleep
^Coutside

Let me know if you have any problems with this.