Using grep to show entries that match a pattern and are present at least X times

grep, text processing

I have a file that has entries like this, among other lines:

Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>

Every time a line like this is found, I want to extract the IP from the rip part, but only IPs that show up at least 3 times.

I am trying to use grep to do this.

This is the grep I have

more /var/log/maillog-20130217 | grep "auth failed" | grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

This grep shows all IPs that are in the matching lines.

How do I limit it to show each IP only once, and only if at least 3 matching lines contain that IP?

For example, suppose my log contains this:

Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=20.20.20.20, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>

I would like to use grep and obtain 200.250.9.210, because 3 lines contain this IP, but not 20.20.20.20, which appears only once.

What I get when I run my grep is this:

200.250.9.210
200.250.9.210
20.20.20.20
200.250.9.210

In other words, it lists every IP from the matching lines, duplicates included.

Thanks.

Best Answer

sed < mail.log -n 's/.*auth failed.*rip=\([^,]*\).*/\1/p' |
  sort |
  uniq -c |
  awk '$1 >= 3' |
  sort -rn

This prints each IP address that appears in at least 3 "auth failed" lines, prefixed by its number of occurrences and sorted from most to least frequent.
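If only the bare IPs are wanted, as in the question, the same pipeline can be wrapped in a small function that takes the threshold as an argument. This is a minimal sketch, not part of the original answer: the function names frequent_failed_ips and sample_log are made up for illustration, and the sample input is just the four log lines from the question.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): read a mail log on stdin and
# print every rip= address seen in at least $1 "auth failed" lines (default 3).
frequent_failed_ips() {
  min="${1:-3}"
  sed -n 's/.*auth failed.*rip=\([^,]*\).*/\1/p' |  # keep only the rip= value
    sort |                                          # group identical IPs
    uniq -c |                                       # prefix each IP with its count
    awk -v min="$min" '$1 >= min { print $2 }'      # drop rare IPs, print IP only
}

# Hypothetical sample input: the four log lines quoted in the question.
sample_log() {
  cat <<'EOF'
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=20.20.20.20, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
Feb 16 17:30:18 ns1 dovecot: pop3-login: Disconnected (auth failed, 1 attempts in 17 secs): user=<accountin@myserver.com>, method=PLAIN, rip=200.250.9.210, lip=10.10.10.10, session=<Sed519rVnADI+gnS>
EOF
}

sample_log | frequent_failed_ips 3
```

On the sample lines this prints 200.250.9.210 and skips 20.20.20.20. Against a real log, the call would look like frequent_failed_ips 3 < /var/log/maillog-20130217.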
