Bash – Loop over lines in file and subtract previous line from current line

bashscriptingshell-script

I have a file that contains some numbers

$ cat file.dat
0.092593
0.048631
0.027957
0.030699
0.026250
0.038156
0.011823
0.013284
0.024529
0.022498
0.013217
0.007105
0.018916
0.014079

I want to make a new file that contains the difference of the current line with the previous line. Expected output should be

$ cat newfile.dat
-0.043962
-0.020674
0.002742
-0.004449
0.011906
-0.026333
0.001461
0.011245
-0.002031
-0.009281
-0.006112
0.011811
-0.004837

Thinking this was trivial, I started with this piece of code

f="myfile.dat"    
while read line; do
    curr=$line
    prev=

    bc <<< "$line - $prev" >> newfile.dat
done < $f

but I realized quickly that I have no idea how to access the previous line in the file. I guess I also need to account for that no subtraction should take place when reading the first line. Any guidance on how to proceed is appreciated!

Best Answer

$ awk 'NR > 1 { print $0 - prev } { prev = $0 }' <file.dat
-0.043962
-0.020674
0.002742
-0.004449
0.011906
-0.026333
0.001461
0.011245
-0.002031
-0.009281
-0.006112
0.011811
-0.004837

Doing this in a shell loop calling bc is cumbersome. The above uses a simple awk script that reads the values off of the file one by one and for any line past the first one, it prints the difference as you describe.

The first block, NR > 1 { print $0 - prev }, conditionally prints the difference between this and the previous line if we've reached line two or further (NR is the number of records read so far, and a "record" is by default a line).

The second block, { prev = $0 }, unconditionally sets prev to the value on the current line.

Redirect the output to newfile.dat to save the result there:

$ awk 'NR > 1 { print $0 - prev } { prev = $0 }' <file.dat >newfile.dat

Related:


There was some mentioning of the slowness of calling bc in a loop. The following is a way of using a single invocation of bc to do the arithmetics while still reading the data in a shell loop (I would not actually recommend solving this problem in this way, and I'm only showing it here for people interested in co-processes in bash):

#!/bin/bash

coproc bc

{
    read prev

    while read number; do
        printf '%f - %f\n' "$number" "$prev" >&"${COPROC[1]}"
        prev=$number

        read -u "${COPROC[0]}" result
        printf '%f\n' "$result"
    done
} <file.dat >newfile.dat

kill "$COPROC_PID"

The value in ${COPROC[1]} is the standard input file descriptor of bc while ${COPROC[0]} is the standard output file descriptor of bc.

Related Question