Linux – Can grep show context, but not a full line


I have a file with several very long lines. I want to grep for a string which may occur several times in the file, possibly more than once on the same line.

$ cat 2014-11-03.json | grep 218

This produces unreadable output. There's just too much of it.

$ cat 2014-11-03.json | grep -o 218

This cuts down too much. It shows only the matched pattern without any context.

Basically, I want output like

... <category_id>218</category_id> ...

(Yes, this is XML, but I don’t want to parse XML. I just want to output the matched string with a few characters either side of it. Just a few characters, not the whole line.)

Grep seems to have options to show only the matched string, or the matched string in the context of its full line (the default behaviour), or the matched string in the context of a few lines before and after, but I cannot find an option to show the matched string in the context of a few characters before and after.

$ cat 2014-11-03.json | tr ' ' '\n' | grep 218 

That’s not ideal: it works so long as the file in question has spaces in roughly the right places. It worked for me this time, but there’s no guarantee it would again.

Best Answer

This question is old, but since I stumbled on it while looking for a way to grep only part of a line, here goes:

A workaround is to enable the -o (--only-matching) option and then widen the regular expression so that it matches a bit more than your text:

grep -o ".\{0,50\}WHAT_I_M_SEARCHING.\{0,50\}" ./filepath

Of course, if you use color highlighting, you can always pipe the result through grep again so that only the real match is colored:

grep -o ".\{0,50\}WHAT_I_M_SEARCHING.\{0,50\}"  ./filepath | grep "WHAT_I_M_SEARCHING"
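To see what the windowed pattern prints, here is a minimal sketch on a made-up sample. The temporary file, its one-line XML content, and the 15-character window are assumptions chosen for illustration; they are not taken from the question's data.

```shell
# Hypothetical sample data: one long line that contains the target once.
sample=$(mktemp)
printf '%s\n' '<item><name>widget</name><category_id>218</category_id><price>9.99</price></item>' > "$sample"

# Keep at most 15 characters on either side of the match.
context=$(grep -o '.\{0,15\}218.\{0,15\}' "$sample")
echo "$context"
# prints: e><category_id>218</category_id><

rm -f "$sample"
```

The window is counted in characters, so it cuts through tags wherever the limit happens to fall. Adjust the two numbers until the output is readable for your data.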

Note:

  • This might not return all expected results if a line contains several matches: the trailing .\{0,50\} can swallow part or all of the next occurrence, and since grep -o reports only non-overlapping matches, that occurrence is not printed on a line of its own.
  • This regex is slow. Very slow. (See comments for possible solution.)
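
The first caveat is easy to reproduce. The sketch below uses a made-up one-line sample and a 5-character window (both are assumptions for illustration) and compares the number of bare matches with the number of windowed matches.

```shell
# Hypothetical one-line sample: two occurrences of 218, three characters apart.
sample=$(mktemp)
printf '%s\n' 'aaa218bbb218ccc' > "$sample"

# Counting bare matches finds both occurrences.
bare_count=$(grep -o '218' "$sample" | wc -l | tr -d ' ')

# With a 5-character window, the trailing context of the first match
# consumes the start of the second occurrence, so only one line is printed.
windowed=$(grep -o '.\{0,5\}218.\{0,5\}' "$sample")
windowed_count=$(printf '%s\n' "$windowed" | wc -l | tr -d ' ')

echo "bare matches:     $bare_count"
echo "windowed matches: $windowed_count ($windowed)"

rm -f "$sample"
```

If the two counts differ for your file, some occurrences were absorbed into the context of a neighbouring match. A narrower window makes this less likely but does not rule it out.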