I have to grep through some JSON files in which the line lengths exceed a few thousand characters. How can I limit grep to display context up to N characters to the left and right of the match? Any tool other than grep would be fine as well, so long as it is available in common Linux packages.
This would be example output, for the imaginary grep switch Ф:
$ grep -r foo *
hello.txt: Once upon a time a big foo came out of the woods.
$ grep -Ф 10 -r foo *
hello.txt: ime a big foo came out
Best Answer
With GNU grep:
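A minimal sketch of a GNU grep command matching the explanation that follows. The context width `N=10`, the pattern `foo`, and the temporary demo file are assumptions taken from the question's example, not values the answer prescribes:

```shell
# Hypothetical demo file, using the sample line from the question.
dir=$(mktemp -d)
printf '%s\n' 'Once upon a time a big foo came out of the woods.' > "$dir/hello.txt"

N=10   # context characters on each side of the match (illustrative value)

# -r: recurse into directories
# -o: print only the matched part of each line
# -P: Perl-compatible regular expressions (GNU grep only)
out=$(cd "$dir" && grep -roP ".{0,$N}foo.{0,$N}" .)
printf '%s\n' "$out"

rm -rf "$dir"
```

On the sample line this should print `./hello.txt:ime a big foo came out ` (with a trailing space), since at most 10 characters are kept on each side of `foo`.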
Explanation:
-o => Print only what you matched
-P => Use Perl-style regular expressions
The regex says: match 0 to $N characters, followed by foo, followed by 0 to $N characters.

If you don't have GNU grep:
Explanation:
Since we can no longer rely on grep being GNU grep, we make use of find to search for files recursively (the -r action of GNU grep). For each file found, we execute the Perl snippet.

Perl switches:
-n => Read the file line by line
-l => Remove the newline at the end of each line and put it back when printing
-e => Treat the following string as code

The Perl snippet is doing essentially the same thing as grep. It starts by setting a variable $N to the number of context characters you want. The BEGIN{} means this is executed only once at the start of execution, not once for every line in every file.

The statement executed for each line is to print the line if the regex substitution works.
The regex:
Match any old thing lazily¹ at the start of the line (^.*?), followed by .{0,$N} as in the grep case, followed by foo, followed by another .{0,$N}, and finally match any old thing lazily till the end of the line (.*?$).
We substitute this with $ARGV:$1. $ARGV is a magical variable that holds the name of the current file being read. $1 is what the parens matched: the context in this case.
The lazy match at the start is required because a greedy match would eat all the characters before foo without failing to match (since .{0,$N} is allowed to match zero times).

¹ That is, prefer not to match anything unless this would cause the overall match to fail. In short, match as few characters as possible.