Command Line – Sort and Count Line Occurrences

command line, sort

I have an Apache log file, access.log. How can I count the number of times each line occurs in that file? For example, the result of cut -f 7 -d ' ' | cut -d '?' -f 1 | tr '[:upper:]' '[:lower:]' is

a.php
b.php
a.php
c.php
d.php
b.php
a.php

the result that I want is:

3 a.php
2 b.php
1 d.php # order doesn't matter
1 c.php 

Best Answer

| sort | uniq -c

As stated in the comments.
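Appended to the pipeline from the question, the whole thing might look like the sketch below. The function name count_paths is hypothetical, the final sort -rn is an optional extra that lists the most frequent entries first, and field 7 being the request path assumes the usual common/combined Apache log format.

```shell
# Hypothetical helper: count how often each request path occurs in an
# Apache access log given as the first argument.
# Assumes the request path is the 7th space-separated field.
count_paths() {
    cut -f 7 -d ' ' "$1" |            # request path, query string included
        cut -d '?' -f 1 |             # drop the query string
        tr '[:upper:]' '[:lower:]' |  # fold to lower case
        sort |                        # bring identical lines together
        uniq -c |                     # prefix each distinct line with its count
        sort -rn                      # optional: highest count first
}

# Usage: count_paths access.log
```

Since the question says order doesn't matter, the last sort -rn can simply be dropped.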

Piping the output into sort arranges the lines in sorted order, so identical lines end up next to each other.

This is required because uniq only collapses repeated lines that are adjacent. For example, given:

a
b
a

If you use uniq on this text file, it will return the following:

a
b
a

This is because the two a's are separated by the b; they are not consecutive lines. However, if you first sort the data into alphabetical order, like this:

a
a
b

Then uniq will remove the repeated lines. The -c option of uniq prefixes each remaining line with the number of times it occurred, giving output of the form:

2 a
1 b
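The behaviour described above can be checked with a short sketch. The variable names are only for illustration, and the sed call is an addition that strips the leading padding uniq -c normally puts in front of each count.

```shell
# The same three lines as in the example: a, b, a.
input=$(printf 'a\nb\na\n')

# Without sorting, the two "a" lines are not adjacent, so uniq merges nothing.
unsorted=$(printf '%s\n' "$input" | uniq)

# After sorting, the duplicates are adjacent and uniq -c can count them.
# sed removes the leading spaces that pad the count column.
counted=$(printf '%s\n' "$input" | sort | uniq -c | sed 's/^ *//')

printf '%s\n' "$unsorted"   # a, b, a on separate lines
printf '%s\n' "$counted"    # "2 a" then "1 b"
```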
