Parsing Log Files with sed -e. Need to count unique class names

I have a file, let's call it filename.log, which contains lines like these:

(2014-11-18 14:09:21,766), , xxxxxx.local, EventSystem, DEBUG FtpsFile delay secs is 5 [pool-3-thread-7] 
(2014-11-18 14:09:21,781), , xxxxxx.local, EventSystem, DEBUG FtpsFile disconnected from ftp server [pool-3-thread-7] 
(2014-11-18 14:09:21,798), , xxxxxx.local, EventSystem, DEBUG FtpsFile FTP File  Process@serverStatus on exit  - 113 [pool-3-thread-7] 
(2014-11-18 14:09:21,798), , xxxxxx.local, EventSystem, DEBUG FtpsFile FTP File  Process@serverStatus on exit  - 114 [pool-3-thread-7] 
(2014-11-18 14:09:21,799), , xxxxxx.local, EventSystem, DEBUG JobQueue $_Runnable Finally of consume() :: [pool-3-thread-7]

I am trying to find the classes that produce the most frequent DEBUG messages.

In this example you can see FtpsFile and JobQueue are two of the classes producing a message.

I have this

cat filename.log | sed -n -e 's/^.*\(DEBUG \)/\1/p' | sort | uniq -c | sort -rn | head -10

This is meant to extract the class name and show me the 10 most frequent classes.

The problem is that this does not give me a count of 4 for the class FtpsFile. It counts each distinct FtpsFile log message as a separate entry.

How do I change the command above so that it grabs only the first word after DEBUG and ignores the rest of the line when counting?

Ideally I should get:
4 FtpsFile
1 JobQueue
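
One possible way to get that output, offered as a sketch rather than a definitive answer: have sed capture only the run of non-space characters after "DEBUG " and discard the rest of the line before counting. The function name top_debug_classes and the temporary sample file are made up for illustration. The sketch assumes the class name is always the first space-delimited word after the last "DEBUG " on the line.

```shell
#!/bin/sh
# Hypothetical helper: print the 10 most frequent class names that follow
# "DEBUG " in the log file given as $1, one "count name" pair per line.
top_debug_classes() {
    # \([^ ]*\) captures the first word after "DEBUG "; the trailing .*$
    # consumes the rest of the line so only the class name is printed.
    sed -n -e 's/^.*DEBUG \([^ ]*\).*$/\1/p' "$1" |
        sort | uniq -c | sort -rn | head -10 |
        sed 's/^ *//'   # strip the left padding that uniq -c adds
}

# Demo on the sample lines from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
(2014-11-18 14:09:21,766), , xxxxxx.local, EventSystem, DEBUG FtpsFile delay secs is 5 [pool-3-thread-7]
(2014-11-18 14:09:21,781), , xxxxxx.local, EventSystem, DEBUG FtpsFile disconnected from ftp server [pool-3-thread-7]
(2014-11-18 14:09:21,798), , xxxxxx.local, EventSystem, DEBUG FtpsFile FTP File  Process@serverStatus on exit  - 113 [pool-3-thread-7]
(2014-11-18 14:09:21,798), , xxxxxx.local, EventSystem, DEBUG FtpsFile FTP File  Process@serverStatus on exit  - 114 [pool-3-thread-7]
(2014-11-18 14:09:21,799), , xxxxxx.local, EventSystem, DEBUG JobQueue $_Runnable Finally of consume() :: [pool-3-thread-7]
EOF
top_debug_classes "$sample"
rm -f "$sample"
```

Because the leading .* is greedy, the match anchors on the last "DEBUG " in a line. If a message body can itself contain "DEBUG ", the pattern would need to be tightened.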
