Diffing two files with switched lines reports the same line as missing twice

diff()

I am trying to understand the Linux diff command on two files whose lines are just permutations of each other, but I cannot grok the output it generates. Consider the three commands below:

[myPrompt]$ cat file1
apples
oranges
[myPrompt]$ cat file2 
oranges
apples
[myPrompt]$ diff file1 file2
1d0
< apples
2a2
> apples

Can someone explain the above cryptic output from diff?

  1. Why is there no mention of "oranges" at all in the output?
  2. What do 1d0 and 2a2 mean?

I understand from this answer that:

"<" means the line is missing in file2 and ">" means the line is missing in
file1

But that doesn't explain why oranges is missing from the output.

Best Answer

To understand the report, remember that diff is prescriptive, describing what changes need to be made to the first file (file1) to make it the same as the second file (file2).

Specifically, the d in 1d0 means delete and the a in 2a2 means add.

Thus:

  • 1d0 means line 1 of file1 (apples) must be deleted. The 0 in 1d0 is the line of file2 after which the deleted line would have appeared had it not been deleted. Read backwards (changing file2 into file1), it means: append line 1 of file1 after line 0 of file2, that is, at the very top.
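  • 2a2 means that after line 2 of file1 (oranges), line 2 of file2 (apples) must be appended. The number before the letter always refers to the original line numbering of file1, not to the numbering after earlier commands have been applied.

oranges is not mentioned because diff reports only the lines that differ. It treats oranges as the line common to both files and leaves it in place; moving apples from before it to after it is then described as one deletion plus one addition, which is why apples shows up twice.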
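To make the "prescriptive" reading concrete, here is a minimal Python sketch that applies a normal-format diff to the lines of file1 and produces the lines of file2. The function name `apply_normal_diff` is hypothetical (it is not part of any library), and the sketch assumes only the basic a/d/c commands as printed by diff, with none of the edge cases (such as the "\ No newline at end of file" marker) handled specially.

```python
import re

# Header lines look like "1d0", "2a2" or "3,4c3,5".
HEADER = re.compile(r"^(\d+)(?:,(\d+))?([adc])(\d+)(?:,(\d+))?$")


def apply_normal_diff(old_lines, diff_text):
    """Apply a normal-format diff to old_lines and return the new lines.

    Line numbers to the left of the command letter always refer to the
    ORIGINAL numbering of the first file.
    """
    result = []
    pos = 0  # how many lines of old_lines have been consumed so far
    lines = diff_text.splitlines()
    i = 0
    while i < len(lines):
        m = HEADER.match(lines[i])
        if m is None:
            raise ValueError(f"expected a diff command, got {lines[i]!r}")
        i += 1
        start = int(m.group(1))
        end = int(m.group(2) or start)
        cmd = m.group(3)
        if cmd == "a":
            # Add AFTER line `start`: copy everything up to and including it.
            result.extend(old_lines[pos:start])
            pos = start
        else:
            # Delete or change lines start..end: copy what precedes them,
            # then skip over them.
            result.extend(old_lines[pos:start - 1])
            pos = end
        # Body of the command: "< " lines are the removed text (ignored here),
        # "---" separates the halves of a change, "> " lines are added text.
        while i < len(lines) and not HEADER.match(lines[i]):
            if lines[i].startswith("> "):
                result.append(lines[i][2:])
            i += 1
    # Copy the untouched tail of the first file.
    result.extend(old_lines[pos:])
    return result


file1 = ["apples", "oranges"]
diff_output = "1d0\n< apples\n2a2\n> apples\n"
print(apply_normal_diff(file1, diff_output))  # ['oranges', 'apples']
```

Note that oranges is never named in the diff, yet it ends up in the result: it is copied through as an unchanged line when the 2a2 command is processed.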