Shell – Is piping, shifting, or parameter expansion more efficient

cutperformancepipeshell-script

I'm trying to find the most efficient way to iterate through certain values that are a consistent number of values away from each other in a space separated list of words(I don't want to use an array). For example,

list="1 ant bat 5 cat dingo 6 emu fish 9 gecko hare 15 i j"

So I want to be able to just iterate through list and only access 1,5,6,9 and 15.

EDIT: I should have made it clear that the values I'm trying to get from the list don't have to be different in format from the rest of the list. What makes them special is solely their position in the list(In this case, position 1,4,7…). So the list could be1 2 3 5 9 8 6 90 84 9 3 2 15 75 55 but I'd still want the same numbers. And also, I want to be able to do it assuming I don't know the length of the list.

The methods I've thought of so far are:

Method 1

set $list
found=false
find=9
count=1
while [ $count -lt $# ]; do
    if [ "${@:count:1}" -eq $find ]; then
    found=true
    break
    fi
    count=`expr $count + 3`
done

Method 2

set list
found=false
find=9
while [ $# ne 0 ]; do
    if [ $1 -eq $find ]; then
    found=true
    break
    fi
    shift 3
done

Method 3
I'm pretty sure piping makes this the worst option, but I was trying to find a method that doesn't use set, out of curiosity.

found=false
find=9
count=1
num=`echo $list | cut -d ' ' -f$count`
while [ -n "$num" ]; do
    if [ $num -eq $find ]; then
    found=true
    break
    fi
    count=`expr $count + 3`
    num=`echo $list | cut -d ' ' -f$count`
done

So what would be most efficient, or am I missing a simpler method?

Best Answer

Pretty simple with awk. This will get you the value of every fourth field for input of any length:

$ awk -F' ' '{for( i=1;i<=NF;i+=3) { printf( "%s%s", $i, OFS ) }; printf( "\n" ) }' <<< $list
1 5 6 9 15

This works be leveraging built-in awk variables such as NF (the number of fields in the record), and doing some simple for looping to iterate along the fields to give you the ones you want without needing to know ahead of time how many there will be.

Or, if you do indeed just want those specific fields as specified in your example:

$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< $list
1 5 6 9 15

As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time to show how long it takes; you could also use tools like strace to see how the system calls flow. Usage of time looks like:

$ time ./script.sh

real    0m0.025s
user    0m0.004s
sys     0m0.008s

You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.

Related Question