Linux – Diff: How to only compare the first n characters in each line

Tags: bash, command line, diff, linux

I have two log files generated from decoded binary data. The decoders are slightly different, and I am trying to isolate the differences in their output. To do this, I am diffing the two log files, which works pretty well except that the time stamps differ on each line. The differences in the time stamps are not relevant, so I want diff to ignore them.

Because the log files follow a specific format, I can simply exclude the last ~40 characters from each line to ignore the time stamps. For example:

Line A:

[T9] | ENTRY NAME                       varA             = 0000012B  varB             = 00000000 | 000015.508.107.113s | file.cpp              :738

Line B:

[T9] | ENTRY NAME                       varA             = 0000012B  varB             = 00000000 | 000015.508.107.163s | file.cpp              :738

These lines should be treated as identical in my case.

How can I tell diff to only include the first n characters from each line, or exclude the last m characters from each line?

Best Answer

In bash, you can use process substitution.

To remove the last 40 characters from each line before comparing, you can use

diff <(sed 's/.\{40\}$//' file1) \
     <(sed 's/.\{40\}$//' file2)
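
If the number of characters to drop varies, the same idea can be wrapped in a small bash function. This is only a sketch: the name diff_drop_last and its argument order are made up for illustration and are not part of the original answer. Note that the pattern needs at least m characters to match, so shorter lines pass through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and argument order are assumptions).
# Usage: diff_drop_last M FILE1 FILE2
diff_drop_last() {
    local m=$1
    # .\{M\}$ matches exactly M characters anchored at the end of the line;
    # lines shorter than M characters do not match and are left as they are.
    diff <(sed "s/.\{$m\}\$//" "$2") \
         <(sed "s/.\{$m\}\$//" "$3")
}
```

For example, diff_drop_last 40 file1 file2 should behave like the command above. Because sed keeps one output line per input line, the line numbers that diff reports still correspond to the original files.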

To keep only the first 40 characters of each line instead, you can use

cut -c1-40 file
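
The cut command can be plugged into the same process substitution pattern to compare only the leading part of each line. The following is a minimal sketch, assuming bash; the function name diff_first is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and argument order are assumptions).
# Usage: diff_first N FILE1 FILE2
diff_first() {
    local n=$1
    # cut -c 1-N keeps characters 1 through N of every line;
    # lines shorter than N are passed through whole.
    diff <(cut -c "1-$n" "$2") \
         <(cut -c "1-$n" "$3")
}
```

Calling diff_first 40 file1 file2 compares the two files on their first 40 characters per line. This variant suits logs where the irrelevant part starts at a fixed column, while the sed variant suits logs where it has a fixed width at the end of the line.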