Filtering multi-lines from a log

awk, filter, grep, perl, sed

Should this question be moved to stackoverflow instead?

I often need to read log files generated by java applications using log4j. Usually, a logged message (let's call it a log entry) spans over multiple lines. Example:

INFO  10:57:01.123 [Thread-1] [Logger1] This is a multi-line
text, two lines
DEBUG 10:57:01.234 [Thread-1] [Logger2] This entry takes 3 lines
line 2
line 3

Note that each log entry starts on a new line, and the first word of that line is TRACE, DEBUG, INFO or ERROR, followed by at least one space.
Here there are two log entries: the first at millisecond 123, the other at millisecond 234.

I would like a fast command (some combination of sed/grep/awk/etc.) to filter whole log entries (grep alone only filters individual lines), e.g. remove all log entries containing the text 'Logger2'.

I considered doing the following transformations:

1) join the lines belonging to the same log entry with a special sequence of characters (e.g. ##), so that every log entry takes exactly one line

INFO  10:57:01.123 [Thread-1] [Logger1] This is a multi-line##text, two lines
DEBUG 10:57:01.234 [Thread-1] [Logger2] This entry takes 3 lines##line 2##line 3

2) grep
3) split the lines back (i.e. replace ## with \n)
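The three steps above can be sketched as a pipeline. This is only a sketch under a few assumptions: the file name log.entry is hypothetical, entry headers start with one of the four level words followed by a space, and the \n in the sed replacement requires GNU sed:

```shell
# Sample log in the shape described above (hypothetical file name)
cat > log.entry <<'EOF'
INFO  10:57:01.123 [Thread-1] [Logger1] This is a multi-line
text, two lines
DEBUG 10:57:01.234 [Thread-1] [Logger2] This entry takes 3 lines
line 2
line 3
EOF

# 1) join each entry's continuation lines with '##'
# 2) drop the one-line entries mentioning Logger2
# 3) split the surviving entries back into their original lines
awk '/^(TRACE|DEBUG|INFO|ERROR) /{if (buf != "") print buf; buf = $0; next}
     {buf = buf "##" $0}
     END{if (buf != "") print buf}' log.entry |
grep -v 'Logger2' |
sed 's/##/\n/g'
```

Here awk does the joining (step 1) instead of sed, which sidesteps the tricky part: awk can buffer an entry until the next header line arrives.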

I got stuck at step 1 – I don't have enough experience with sed.

Perhaps the 3 steps above are not required; maybe sed can do all the work.

Best Answer

There is no need to combine several tools; the task can be done with sed alone:

sed '/^INFO\|^DEBUG\|^TRACE\|^ERROR/{
    /Logger2/{
        :1
        N
        /\nINFO\|\nDEBUG\|\nTRACE\|\nERROR/!s/\n//
        $!t1
        D
    }
}' log.entry
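For comparison, the same filter can also be written as a short awk program that sets a keep/drop flag at each entry header and prints lines while the flag is set. This is a sketch, not part of the answer above; it reuses the hypothetical file name log.entry:

```shell
# Sample log in the shape described in the question
cat > log.entry <<'EOF'
INFO  10:57:01.123 [Thread-1] [Logger1] This is a multi-line
text, two lines
DEBUG 10:57:01.234 [Thread-1] [Logger2] This entry takes 3 lines
line 2
line 3
EOF

# At each entry header, keep the entry unless it mentions Logger2;
# the bare pattern 'keep' prints every line while the flag is true,
# so continuation lines inherit the decision made at the header.
awk '/^(TRACE|DEBUG|INFO|ERROR) /{keep = !/Logger2/}
     keep' log.entry
```

Unlike the join/grep/split pipeline, this never rewrites the lines, so there is no risk of a ## marker colliding with text that already appears in a message.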