The script below recursively searches (text) files in a given directory for occurrences of a given string, no matter whether it is in upper or lower case, or any combination of the two.
It prints a list of found matches: the path to each file, combined with the filename, and the actual occurrences of the string in that file, looking like:
/path/to/file1 ['numlock', 'numlocK']
/longer/path/to/file2 ['NuMlOck']
etc.
To limit the search time, I would look for matches in specific directories, so not in 2 TB of files ;).
To use it:
1] Copy the script below and paste it into an empty text file (e.g. in gedit).
2] Edit the two lines in the head section to define the string to look for and the directory to search.
3] Save it as searchfor.py.
4] To run it: open a terminal, type python3 followed by a space, then drag the script onto the terminal window and press Return.
The list of found matches will appear in the terminal window. If any files could not be read, the script will report them at the end.
#!/usr/bin/python3
import os

#-----------------------------------------------------
# give the searched word here in lowercase(!):
searchfor = "string_to_look_for"
# give the aimed directory here:
searchdir = "/path/to/search"
#-----------------------------------------------------

wordsize = len(searchfor)
unreadable = []

print("\nFound matches:")
for root, dirs, files in os.walk(searchdir, topdown=True):
    for name in files:
        file_subject = os.path.join(root, name)
        try:
            with open(file_subject) as check_file:
                words = check_file.read()
            words_lower = words.lower()
            # start positions of all case-insensitive occurrences
            found_matches_list = [i for i in range(len(words_lower))
                                  if words_lower.startswith(searchfor, i)]
            # the occurrences as they actually appear in the file
            found_matches = [words[index:index + wordsize]
                             for index in found_matches_list]
            if found_matches:
                print(file_subject, found_matches)
        except Exception:
            unreadable.append(file_subject)

if unreadable:
    print("\ncould not read the following files:")
    for item in unreadable:
        print("unreadable:", item)
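The core technique the script relies on (lowercasing the text, collecting every start position with str.startswith, then slicing out the original spelling) can be tried on its own. The sample text below is just an illustration:

```python
text = "NumLock was on; press numlocK to toggle Numlock."
searchfor = "numlock"

lower = text.lower()
# every index where the lowercased text starts with the search word
positions = [i for i in range(len(lower)) if lower.startswith(searchfor, i)]
# slice the original text so the matches keep their original case
matches = [text[i:i + len(searchfor)] for i in positions]
print(matches)  # ['NumLock', 'numlocK', 'Numlock']
```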
First of all, your grep -Po '"\K[^"]*' file idea fails because grep sees both "One" and ". the second is here" as being inside quotes. Personally, I'd probably just do
$ grep -oP '"[^"]+"' file | tr -d '"'
One
Two
Three
Four
But that is two commands. To do it with a single command, you could use one of:
Perl
$ perl -lne '@F=/"\s*([^"]+)\s*"/g; print for @F' file
One
Two
Three
Four
Here, the @F array holds all matches of the regex (a quote, followed by as many non-" characters as possible, up to the next "). The print for @F just means "print each element of @F".
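For comparison, the same regex idea can be sketched in Python; the sample text below is assumed to match the file used in the examples:

```python
import re

# sample input; assumed to match the example file used above
text = '''first matched is "One". the second is here "Two"
and here are in second line "Three" "Four".'''

# a quote, optional whitespace, the shortest run of non-quote
# characters, optional whitespace, then the closing quote
for match in re.findall(r'"\s*([^"]+?)\s*"', text):
    print(match)
```

Because re.findall consumes each closing quote as part of a match, the text between two quoted strings is never mistaken for a quoted string itself.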
Perl
$ perl -F'"' -lne 'for($i=1;$i<=$#F;$i+=2){print $F[$i]}' file
One
Two
Three
Four
To remove leading/trailing spaces from each match, use this:
perl -F'"' -lne 'for($i=1;$i<=$#F;$i+=2){$F[$i]=~s/^\s+|\s+$//g; print $F[$i]}' file
Here, Perl is behaving like awk. The -a switch (which -F implies) causes it to automatically split input lines into fields on the character given by -F. Since I have given it ", the fields are:
$ perl -F'"' -lne 'for($i=0;$i<=$#F;$i++){print "Field $i: $F[$i]"}' file
Field 0: first matched is
Field 1: One
Field 2: . the second is here
Field 3: Two
Field 0: and here are in second line
Field 1: Three
Field 2:
Field 3: Four
Field 4: .
Because we are looking for text between two consecutive field separators, we know we want every second field. So, for($i=1;$i<=$#F;$i+=2){print $F[$i]}
will print the ones we care about.
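The every-second-field trick translates directly to Python's str.split, again assuming the same sample lines:

```python
# the same sample lines as above (assumed)
lines = ['first matched is "One". the second is here "Two"',
         'and here are in second line "Three" "Four".']

for line in lines:
    fields = line.split('"')
    # text between quote pairs lands in the odd-numbered fields,
    # just as with perl -F'"' or awk -F'"'
    for quoted in fields[1::2]:
        print(quoted.strip())
```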
The same idea, but in awk:
$ awk -F'"' '{for(i=2;i<=NF;i+=2){print $(i)}}' file
One
Two
Three
Four
Best Answer
With GNU grep, that's the -o option, possibly combined with -h to suppress the filenames.
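In Python terms, -o behaves like printing every match of the pattern on its own line; the pattern and input below are just an illustration:

```python
import re

# illustrative pattern and input; grep -oi would behave similarly
pattern = re.compile("numlock", re.IGNORECASE)
lines = ["NumLock is on", "press numlocK twice"]
for line in lines:
    for match in pattern.findall(line):
        print(match)  # each match on its own line, as grep -o does
```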