Bash – How to do a grep on remote machine and print out the line which contains those words

bash grep linux shell ssh

I have a few log files on machineB under the directory /opt/ptd/Logs/, as shown below. The log files are fairly large.

david@machineB:/opt/ptd/Logs$ ls -lt
-rw-r--r-- 1 david david  49651720 Oct 11 16:23 ptd.log
-rw-r--r-- 1 david david 104857728 Oct 10 07:55 ptd.log.1
-rw-r--r-- 1 david david 104857726 Oct 10 07:50 ptd.log.2

I am trying to write a generic shell script that searches all of the log files on machineB for particular patterns and prints the lines that match. I will run the script from machineA, which already has the ssh keys set up, so I need to grep the log files on machineB remotely from machineA.

#!/bin/bash

wordsToInclude="hello,animal,atttribute,metadata"
wordsToExclude="timeout,runner"

# now grep on the various log file for above words and print out the lines accordingly

That is, the wordsToInclude variable holds a comma-separated list of words. If a log line contains the word hello, print that line; likewise, print any line that contains animal, attribute, or metadata.

Similarly, the wordsToExclude variable holds a comma-separated list of words. If a line contains any of those words, don't print it.

I am storing the words in this format for now, but any better format is fine with me. Both wordsToInclude and wordsToExclude can hold a long list of words, which is why I keep them in variables.

I know how to grep for a small set of words. If I were running grep from the command line directly on machineB, I would do it like this:

grep -E 'hello|animal|atttribute|metadata' ptd.log | grep -v 'timeout'

But I am not sure how to combine this in my shell script so that I can run the grep remotely on machineB over ssh from machineA.

Best Answer

If you are open to other formats, consider:

inc="hello|animal|atttribute|metadata"
exc="timeout|runner" 
ssh machineB "grep -E '$inc' path/ptd.log | grep -vE '$exc'"
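If you would rather keep the comma-separated variables from the question, the conversion to grep's alternation syntax can be done in the script itself. The following is only a sketch under some assumptions: the words contain no commas, single quotes, or regex metacharacters; the logs live in /opt/ptd/Logs/ as in the question; and the function names build_remote_cmd and remote_grep are made up for illustration. The -h option (supported by GNU and BSD grep) suppresses the file name prefix that grep adds when it searches several files.

```shell
#!/bin/bash

# Word lists in the question's comma-separated format.
wordsToInclude="hello,animal,atttribute,metadata"
wordsToExclude="timeout,runner"

# Turn "a,b,c" into the alternation "a|b|c" with bash parameter expansion.
# Assumption: no word contains a comma, a single quote, or a regex metacharacter.
inc=${wordsToInclude//,/|}
exc=${wordsToExclude//,/|}

# Build the command line that will run on the remote host.
# The glob ptd.log* is left unexpanded here so that the remote shell
# expands it to ptd.log, ptd.log.1, ptd.log.2, ...
build_remote_cmd() {
    printf "grep -hE '%s' /opt/ptd/Logs/ptd.log* | grep -vE '%s'" "$1" "$2"
}

# Hypothetical wrapper: run the search on machineB over ssh.
# Nothing is sent to the remote host until this function is called.
remote_grep() {
    ssh machineB "$(build_remote_cmd "$inc" "$exc")"
}
```

Calling remote_grep from machineA should then print the matching lines from all three log files. Printing the result of build_remote_cmd first is an easy way to check the quoting before anything runs on machineB.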

Faster Alternative

If your log files are large and you are grepping for fixed words, as opposed to fancy regular expressions, you may want to consider this approach:

inc='hello
animal
atttribute
metadata'

exc='timeout
runner'

ssh machineB "grep -F '$inc' path/ptd.log | grep -vF '$exc'"

By putting each word on a separate line, we can use grep's -F option, which treats each line of the pattern as a fixed string rather than a regular expression. Skipping regex processing can make the search faster on large files.
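The newline-separated lists do not have to be typed by hand; they can be derived from the comma-separated variables in the question. The sketch below shows one way to do that with tr and checks the include/exclude logic locally on a throwaway file. The sample log lines are invented for the demonstration, and it assumes that no word contains a comma.

```shell
#!/bin/bash

wordsToInclude="hello,animal,atttribute,metadata"
wordsToExclude="timeout,runner"

# Convert commas to newlines: grep -F expects one fixed string per line.
# printf '%s' avoids a trailing newline, which would otherwise add an
# empty pattern that matches every line.
inc=$(printf '%s' "$wordsToInclude" | tr ',' '\n')
exc=$(printf '%s' "$wordsToExclude" | tr ',' '\n')

# Local demonstration on a temporary file with made-up log lines.
sample=$(mktemp)
printf '%s\n' \
    'hello world' \
    'animal found but timeout reached' \
    'metadata loaded' \
    'nothing to see' > "$sample"

# Keep lines with any include word, then drop lines with any exclude word.
result=$(grep -F "$inc" "$sample" | grep -vF "$exc")
printf '%s\n' "$result"

rm -f "$sample"
```

This prints the lines "hello world" and "metadata loaded"; the "animal" line is dropped because it also contains "timeout". The same $inc and $exc values can be placed inside the single quotes of the ssh command, since the remote shell preserves newlines inside single quotes.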
