Linux command to remove the duplicate lines but keep the first occurrence

command line, linux, string manipulation, Ubuntu

I have a text file. Each line contains a string, and some strings are repeated. I want to remove the repetitions but keep the first occurrence of each string. For example:

line1
line1
line2
line3
line4
line3
line5

Should be

line1
line2
line3
line4
line5

I tried: sort file1 | uniq -u > file2 but this did not help. It removed every string that was repeated, whereas I want the first occurrence of each to remain. I do not need the output sorted. I just want to remove exact repetitions of a line while keeping everything else as it is.

Best Answer

If sorting is acceptable anyway, this will work:

sort file1 | uniq > file2

-u was the source of your trouble, because (from man 1 uniq):

-u, --unique
only print unique lines

while by default:

With no options, matching lines are merged to the first occurrence.
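
The difference between the two uniq modes can be checked on the sample from the question. The sketch below also shows an order-preserving alternative with awk, which is not part of the answer above and is offered only as a commonly used idiom for the case where the original line order has to be kept. The file names file1 and file2 are taken from the question; the scratch directory is an assumption made so that no real files are overwritten.

```shell
# Work in a scratch directory (assumption: avoids clobbering real files).
cd "$(mktemp -d)"

# Recreate the sample input from the question.
printf '%s\n' line1 line1 line2 line3 line4 line3 line5 > file1

# uniq -u prints only lines that are never repeated, so line1 and line3
# disappear entirely.
sort file1 | uniq -u

# Plain uniq collapses each run of identical adjacent lines to one copy.
sort file1 | uniq

# Without sorting: print a line only the first time it is seen.
# seen[$0]++ is 0 (false) on first sight, so !seen[$0]++ is true only then.
awk '!seen[$0]++' file1 > file2
cat file2
```

Note that uniq only compares adjacent lines, which is why the input is sorted first. The awk variant keeps every line seen so far in memory, so it may be less suitable for very large files with many distinct lines.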
