Bash – How to deal with optional input in shell script

bash parameter shell shell-script

My assignment is to print the n longest lines of a text file. The output must show each of those lines with its line number, in the same order as in the original file. For example, given this file:

09876543
kbjkbkbbnbnmbnmnmmnbmnbmjbjkb
asjdsakdbakjsdbasbkj
asjdsakdbakjsdbasbkj
asjdsakdbakjsdbasbkj
sa
aaaa
njkasn
k
ppûunsdj
tieutuvi
eee
sdbhsdbjhdsvfdsvfgj
avavdvas
dfsdf
ffdsdfggdgdgdfgdfgdf112233
qwertyuiopsdfghjklxcvbnm,fghjk

If n is 10, the output should be

2 kbjkbkbbnbnmbnmnmmnbmnbmjbjkb
3 asjdsakdbakjsdbasbkj
4 asjdsakdbakjsdbasbkj
5 asjdsakdbakjsdbasbkj
10 ppûunsdj
11 tieutuvi
13 sdbhsdbjhdsvfdsvfgj
14 avavdvas
16 ffdsdfggdgdgdfgdfgdf112233
17 qwertyuiopsdfghjklxcvbnm,fghjk

If n is not given, it defaults to 5. If there are two or more input files, each list of lines is preceded by the corresponding file name. How can I do that? How should I handle the parameter n?
For example, if n must always be given, this code works:

awk '{ print length(), NR, $0 | "sort -rn" }' unix1.txt | head -n 10 | sed 's/[^ ]* //' | sort -n

But if n is optional, the following doesn't work. I also don't know how to handle several files.

awk '{ print length(), NR, $0 | "sort -rn" }' unix1.txt | head -n ${$1:-5} | sed 's/[^ ]* //' | sort -n >> temp.txt

Best Answer

I'm assuming your pipeline is part of a script that takes a single argument (the number of lines to return).

The parameter expansion ${$1:-5} is not valid and should be written as ${1:-5} to make it expand to 5 if $1 is empty or not set.
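
As a minimal sketch of how that expansion behaves, the snippet below wraps it in a small function so that it can be tried with different arguments. The function name show_count is hypothetical and exists only for this illustration; inside a function, $1 refers to the function's own first argument.

```shell
#!/bin/sh

# show_count is a hypothetical helper used only to demonstrate ${1:-5}.
show_count() {
    # Expands to the first argument, or to 5 if it is unset or empty.
    printf '%s\n' "${1:-5}"
}

show_count        # no argument given
show_count ''     # empty argument given
show_count 10     # explicit value given
```

The first two calls fall back to the default and the third uses the value it was given. Without the colon, ${1-5} would substitute the default only when $1 is unset, not when it is set but empty.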

Also, your awk code invokes sort itself, which is a bit obfuscated, especially since sort could just as easily be its own stage in the pipeline:

awk '{ print length(), NR, $0 }' unix1.txt | 
sort -rn |
head -n "${1:-5}" |
sed 's/[^ ]* //' |
sort -n

To simplify it a bit (replace the sed with cut) and to make the output a bit more "tabular", we can output the intermediate result from awk with tabs as the output delimiter:

awk -v OFS="\t" '{ print length(), NR, $0 }' unix1.txt | 
sort -rn |
head -n "${1:-5}" |
cut -f 2- |
sort -n

For the given data, this would output the following for the default number of output lines:

2       kbjkbkbbnbnmbnmnmmnbmnbmjbjkb
4       asjdsakdbakjsdbasbkj
5       asjdsakdbakjsdbasbkj
16      ffdsdfggdgdgdfgdfgdf112233
17      qwertyuiopsdfghjklxcvbnm,fghjk

To handle multiple files in your script, I would suggest looping over the given filenames. Maybe something like

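#!/bin/sh

# Number of lines to print per file (defaults to 5 if $1 is empty or unset).
n=${1:-5}
shift

# Loop over the remaining arguments (the file names).
for name do
    # Print a header only when more than one file was given.
    if [ "$#" -gt 1 ]; then
        printf 'File: %s\n' "$name"
    fi

    # Prefix each line with its length and line number, keep the n longest,
    # drop the length column, and restore the original line order.
    awk -v OFS="\t" '{ print length(), NR, $0 }' "$name" |
    sort -rn |
    head -n "$n" |
    cut -f 2- |
    sort -n
done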

This script would be invoked as

./script.sh 10 file1 file2 file3 etc

Note that this requires the first argument to always be a number. There is nothing bash-specific in the script, which is why I use /bin/sh as the interpreter. To use proper command line options for giving the number to the script, e.g. as

./script.sh -n 10 file1 file2

you will have to look into using getopts to do command line parsing. There are plenty of examples of that on this site (you could start by looking at the tag).
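
A rough sketch of what that could look like is below. It is only one possible arrangement, not a definitive implementation: the function name longest_lines, the variable nfiles and the usage message are my own choices for illustration. The option -n takes the count and falls back to 5 when it is omitted, and everything left after option parsing is treated as a file name.

```shell
#!/bin/sh

# longest_lines is a hypothetical wrapper around the pipeline shown earlier.
longest_lines() {
    n=5          # default number of lines per file
    OPTIND=1     # reset so that getopts starts from the first argument

    # Parse the optional "-n count" option.
    while getopts 'n:' opt; do
        case $opt in
            n) n=$OPTARG ;;
            *) echo 'usage: script.sh [-n count] file ...' >&2
               return 1 ;;
        esac
    done
    # Drop the parsed options; only file names remain.
    shift "$((OPTIND - 1))"

    nfiles=$#
    for name do
        # Print a header only when more than one file was given.
        if [ "$nfiles" -gt 1 ]; then
            printf 'File: %s\n' "$name"
        fi

        awk -v OFS='\t' '{ print length(), NR, $0 }' "$name" |
        sort -rn |
        head -n "$n" |
        cut -f 2- |
        sort -n
    done
}

longest_lines "$@"
```

Saved as script.sh, this could be run as ./script.sh -n 10 file1 file2, or as ./script.sh file1 file2 to get the default of five lines per file.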
