Shell – Text files containing their own name

csh find shell-script

I need to find, in a certain directory and all its subdirectories, all text files that contain their own name. How do I do this? (Preferably without the awk command.)

Best Answer

To be maximally portable and work when filenames may include spaces or other characters from IFS, and on systems where find does not handle {} inside quotes, use find -exec in combination with sh and ordinary parameter passing:

find DIRECTORY -type f -exec sh -c 'grep -lFe "$(basename "$1")" "$1"' sh {} ';'

find DIRECTORY -type f enumerates all the ordinary files under DIRECTORY recursively. -exec ... ';' runs a command for each file it finds. The meat of it is the sh command through to {}, which:

  • Runs the standard shell with a script command string grep -lFe "$(basename "$1")" "$1" for every file matched, which:
    • Finds the base filename of the matched file ($1) using command substitution and basename - both the command substitution and the variable are quoted to account for the cases where they may contain blanks or wildcards; and
    • greps the matched file (still "$1") for that basename, using -l to print only the name of the matched file. "$1" is quoted again to account for blanks or wildcards.
  • And passes in two arguments to sh: "sh" and {}. sh is meant for the inline script's $0 (the name of the running program, which the shell uses for error reporting for instance). find will replace an argument consisting exactly of {} with the matched filename, so the sh script itself gets a single argument with the full file path available as $1.
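As a rough way to try the command out, the sketch below builds a throwaway directory and runs it there. The file names and contents are made up for illustration, and mktemp -d is assumed to be available (it is common but not specified by POSIX).

```shell
#!/bin/sh
# Throwaway test tree; all names below are hypothetical examples.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"

# Contains its own name.
printf 'this file is called notes.txt\n' > "$demo/notes.txt"
# Does not contain its own name.
printf 'nothing relevant here\n' > "$demo/other.txt"
# Name with a space, inside a directory with a space; contains its own name.
printf 'see my file.txt for details\n' > "$demo/sub dir/my file.txt"

# The command from the answer, with DIRECTORY replaced by the test tree.
matches=$(find "$demo" -type f -exec sh -c 'grep -lFe "$(basename "$1")" "$1"' sh {} ';')
printf '%s\n' "$matches"

rm -rf "$demo"
```

Only notes.txt and "sub dir/my file.txt" should be listed, each with its full path under the temporary directory.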

This will work on any POSIX-compatible system. Using {} inside a quoted string with find -exec has implementation-defined behaviour, so it isn't portable (though it is supported by GNU find and most of the BSDs), and it is dangerous because the file name is interpreted as shell code.
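To illustrate why passing the file name as an argument is the safe choice, here is a small sketch with a deliberately hostile, made-up file name that looks like a command substitution. It assumes mktemp -d is available. Because the name reaches the inline script as "$1", it is treated as data and never evaluated.

```shell
#!/bin/sh
start=$PWD
demo=$(mktemp -d)
cd "$demo" || exit 1

# Hypothetical hostile file name: literally the text $(touch INJECTED).
printf 'harmless content\n' > './$(touch INJECTED)'

# Safe form from the answer: the name arrives as $1 and is only ever data.
find . -type f -exec sh -c 'grep -lFe "$(basename "$1")" "$1"' sh {} ';'

# If the name had been run as shell code, a file called INJECTED would exist.
if [ -e INJECTED ]; then injected=yes; else injected=no; fi
printf 'injected: %s\n' "$injected"

cd "$start" || exit 1
rm -rf "$demo"
```

With the argument-passing form this reports "injected: no". Writing {} directly inside the quoted script on an implementation that substitutes it there would instead hand the name to the shell as code.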

Arbitrary filenames need quoting whenever they may contain whitespace or wildcard characters, or when the value of the IFS (input field separator) variable is unknown; otherwise they may be split into multiple words or expanded as glob patterns, and the program will behave in unexpected ways.
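A minimal sketch of the word-splitting problem, using a made-up file name and assuming the default IFS: the positional parameters are set once from an unquoted expansion and once from a quoted one, and then counted.

```shell
#!/bin/sh
name='my file.txt'   # hypothetical file name containing a space

# Unquoted: the shell splits the value on IFS characters.
set -- $name
unquoted=$#

# Quoted: the value stays a single word.
set -- "$name"
quoted=$#

printf 'unquoted: %s words, quoted: %s word\n' "$unquoted" "$quoted"
# prints: unquoted: 2 words, quoted: 1 word
```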


More efficiently, you can use find -exec with + to run the minimum number of shells possible:

find DIRECTORY -type f -exec sh -c 'for x in "$@" ; do grep -lFe "$(basename "$x")" "$x" ; done' sh {} '+'

This is still portable, but provides many file paths to each execution of sh, so it doesn't start as many shells. Inside the shell program is a for loop that looks at each of the files. If this is something that's going to run frequently, this is a better option than using ";" to terminate the command.
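One way to see the difference in the number of shells is to have each shell print how many arguments it received. The sketch below uses three made-up empty files and assumes mktemp -d is available; each output line corresponds to one invocation of sh.

```shell
#!/bin/sh
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"   # hypothetical file names

# ';' starts one shell per file, so there is one output line per file.
per_file=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} ';' | wc -l | tr -d ' ')

# '+' hands as many paths as fit to each shell; three short paths fit in one.
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')

printf 'shells started with ";": %s\n' "$per_file"
printf 'shells started with "+": %s\n' "$batched"

rm -rf "$demo"
```

For a tree this small the batched form starts a single shell; on very large trees find may still split the paths across several invocations to stay within the argument-length limit.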
