Shell – Awk – output the second line of a number of .dat files to one file

awk, command line, io-redirection, shell-script, text processing

I have multiple files, something like this (in reality I have 80):

file1.dat

file2.dat

I want to end up with a file containing the second line of each file, i.e.

output.dat

6 9

8 4

What I have so far loops through the file names, but each iteration overwrites the output of the one before it. E.g. the output for the above files would just be

8 4

My shell script looks like this:

post.sh

TEND=80
TINDX=0
while [ "$TINDX" -lt "$TEND" ]; do
    awk 'NR==2' "input-$TINDX.dat" > output.dat
    TINDX=$((TINDX+1))
done

Best Answer

Remove the while loop and use shell brace expansion together with FNR, a built-in awk variable that counts lines within the current file:

awk 'FNR==2{print $0 > "output.dat"}' file{1..80}.dat
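As a rough illustration, the sketch below runs the one-liner on two small sample files. Only the second lines (6 9 and 8 4) come from the question; the other lines, the temporary directory and the name loop-output.dat are made up for the example. It also shows one way to keep the loop, by redirecting the whole loop once rather than once per iteration. The file names are listed explicitly because brace expansion such as file{1..80}.dat is a feature of bash, zsh and similar shells, not of plain POSIX sh.

```shell
# Hypothetical sample data; only the second lines (6 9 and 8 4) come from the question.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '1 2\n6 9\n3 5\n' > file1.dat
printf '7 7\n8 4\n2 0\n' > file2.dat

# One awk process: FNR restarts at 1 for each input file, and awk truncates
# output.dat only once, when it first opens it.
awk 'FNR==2{print $0 > "output.dat"}' file1.dat file2.dat

# Loop variant: redirect the whole loop once instead of once per iteration.
i=1
while [ "$i" -le 2 ]; do
    awk 'NR==2' "file$i.dat"
    i=$((i+1))
done > loop-output.dat

cat output.dat        # prints "6 9" then "8 4"
cmp output.dat loop-output.dat && echo "same result"
```

Using >> inside the loop would also stop the overwriting, but then output.dat has to be emptied before each run.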

Related Solutions

Merging 2 files based on field match

$ awk 'FNR==NR{a[$1]=$2;next} ($1 in a) {print $1,a[$1],$2}' file2 file1
aa 45 32
bb 31 15
cc 50 78

Explanation:

awk implicitly loops through each file, one line at a time. Since we gave it file2 as the first argument, it is read first. file1 is read second.

FNR==NR{a[$1]=$2;next}

NR is the total number of lines that awk has read so far, across all files, and FNR is the number of lines read so far from the current file. Thus, if FNR==NR, we are still reading the first named file: file2. For every line in file2, we assign a[$1]=$2.

Here, a is an associative array and a[$1]=$2 means saving file2's second column, denoted $2, as a value in array a using file2's first column, $1, as the key.

next tells awk to skip the rest of the commands and start over with the next line.

($1 in a) {print $1,a[$1],$2}

If we get here, that means that we are reading the second file: file1. If the first field of the current line was seen in file2, as determined by the keys of array a, then we print the key followed by the second field from file2 and the second field from file1.
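To try the two-file idiom end to end, here is a minimal sketch. The input files are not shown above, so their contents here are assumptions chosen to reproduce the output aa 45 32, bb 31 15, cc 50 78; the extra dd line and the file name merged are also made up, to show that unmatched keys are dropped.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Assumed inputs, chosen to reproduce the output shown above.
printf 'aa 45\nbb 31\ncc 50\n' > file2
printf 'aa 32\nbb 15\ndd 99\ncc 78\n' > file1   # dd has no match in file2

# First pass (FNR==NR): store file2's second field under its first field.
# Second pass: print only the file1 lines whose key was stored.
awk 'FNR==NR{a[$1]=$2;next} ($1 in a) {print $1,a[$1],$2}' file2 file1 > merged
cat merged
```

The dd 99 line does not appear in merged because dd was never stored as a key. One caveat with this idiom: if the first file is empty, FNR==NR also holds while the second file is being read, so the second block never runs.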

Awk – Print Number of Consonant Occurrences for Each File

With GNU awk for ENDFILE and IGNORECASE:

$ awk -v IGNORECASE=1 '
    { cnt += ( gsub(/[[:alpha:]]/,"&") - gsub(/[aeiou]/,"&") )}
    ENDFILE { print FILENAME, cnt+0; cnt=0 }
' file1 file2
file1 12
file2 7

or with any POSIX awk:

$ awk '
    { lc=tolower($0); cnt[FILENAME] += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { for (i=1; i<ARGC; i++) print ARGV[i], cnt[ARGV[i]]+0 }
' file1 file2
file1 12
file2 7

If you only want to count the specific characters b, c, d, etc. instead of all alphabetic characters that aren't aeiou, then just change ( gsub(/[[:alpha:]]/,"&") - gsub(/[aeiou]/,"&") ) above to gsub(/[bcdfghjklmnpqrstvwxyz]/,"&"), keeping lc as the third argument in the POSIX version.

Note that, unlike any approach that prints results in an FNR==1 clause, both of the above scripts will handle empty files correctly by printing the file name and 0 as the count.
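The empty-file behaviour can be checked with a small sketch like the one below, which runs the POSIX version unchanged. The file contents are hypothetical, picked so that file1 and file2 give the counts 12 and 7 shown above; the zero-length file named empty and the output file counts are additions for the example.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical inputs chosen to reproduce the counts shown above.
printf 'Unix shell scripts\n' > file1    # 16 letters, 4 vowels
printf 'Hello World\n' > file2           # 10 letters, 3 vowels
: > empty                                # zero-length file

# Per-file consonant count = letters minus vowels, summed over every line.
awk '
    { lc=tolower($0); cnt[FILENAME] += (gsub(/[[:alpha:]]/,"&",lc) - gsub(/[aeiou]/,"&",lc)) }
    END { for (i=1; i<ARGC; i++) print ARGV[i], cnt[ARGV[i]]+0 }
' file1 empty file2 > counts
cat counts
```

The middle line of counts reads empty 0: awk never executes the main block for that file, but the END loop walks ARGV, so the name is still printed.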

Also note the cnt+0 in the first script - the +0 ensures that the value printed will be a numeric 0 rather than a null string if the first file is empty.

If the same file name can appear multiple times in the input then add FNR==1{cnt[FILENAME]=0} to the start of the script if you want it output multiple times or add if (!seen[ARGV[i]]++) { ... } around the print in the END section if you only want it output once.

See https://unix.stackexchange.com/a/642372/133219 for an answer to the followup question of also counting vowels.
