Parsing log files for frequent IPs

Tags: ip, logs, text-processing

So, I hacked this together while under a DDoS attack to pull naughty IPs out of my logs. Does anyone have any improvements or other suggestions to make it better?

Here's the general idea:

  1. pull IPs only out of the log file
  2. sort them
  3. uniq and count them
  4. sort them again

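The steps above can be sketched as one pipeline. This is a minimal sketch: the log path is a placeholder, and it assumes a common/combined log format where the client IP is the first space-delimited field (hence `awk '{print $1}'` instead of `cut`):

```shell
#!/bin/sh
# Placeholder path for illustration; substitute your own access log.
LOG=/var/log/apache_access

awk '{print $1}' "$LOG" |  # 1. pull IPs only out of the log file
  sort |                   # 2. sort them so duplicates become adjacent
  uniq -c |                # 3. collapse duplicates and count them
  sort -rn                 # 4. sort again, busiest IPs first
```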
And the string o'pipes:

cut -d " " -f7 /var/log/apache_access | sort | uniq -c | sort -rn > sorted-ips.txt

Best Answer

I've always used this:

tail -1000 /var/log/apache_access | awk '{print $1}' | sort -nk1 | uniq -c | sort -nk1

With tail I'm able to set the limit of how far back I really want to go - good if you don't use log rotation (for whatever reason). Second, I'm making use of awk - since most logs are space-delimited, I've left myself the ability to pull additional information out (possibly what URLs they were hitting, statuses, browsers, etc.) by adding the appropriate $ variable. Lastly, a quirk of uniq: it only collapses touching pairs - i.e.:

A
A
A
A
B
A
A

Will produce:

4 A
1 B
2 A

Not the desired output. So we sort the first column (in this case the IPs, but we could sort other columns), then uniq them, and finally sort the count ascending so the highest offenders appear last, right above my prompt.
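The adjacency behavior is easy to verify with the sample above (the temp-file path here is just for illustration):

```shell
#!/bin/sh
# uniq -c only collapses runs of identical *adjacent* lines,
# which is why a sort must come first.
printf 'A\nA\nA\nA\nB\nA\nA\n' > /tmp/sample.txt

# Without a prior sort, the trailing As are counted separately:
uniq -c /tmp/sample.txt
#   4 A
#   1 B
#   2 A

# Sorting first makes all duplicates adjacent, so the counts are right:
sort /tmp/sample.txt | uniq -c
#   6 A
#   1 B
```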
