How to use regex inside exec with find

find, regular expression

Is it possible to use regular expressions based on the result (file name) inside the exec argument with find? I want to be able to "exec" based on parts of the argument, like:

find . -name pattern -regex "foo (regex1) bar (Regex2)" -exec something $1 $2 \;

Best Answer

You can't use capture groups from the regexp in the command to execute. If you use find -regex to restrict matches, you'll have to do some extra matching in the command. You can do that by invoking a shell and using its own pattern matching constructs. For example, if foo and bar are constant strings and regex1 can't match bar:

find … -exec sh -c '
  x=${0#foo}
  y=${x#*bar}
  x=${x%%bar*}
  something "$x" "$y"
' {} \;

Invoking a shell has a little overhead. You can improve performance a bit by invoking the shell in batches.

find … -exec sh -c '
  for item do
    item=${item#foo}
    y=${item#*bar}
    x=${item%%bar*}
    something "$x" "$y"
  done
' sh {} +
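To make the parameter expansions above concrete, here is a minimal sketch that applies them to a single made-up name. The function name split_name and the sample string foo123bar456 are hypothetical, chosen only for illustration; printf stands in for something.

```shell
# Split a name of the form foo<part1>bar<part2> using POSIX parameter expansion.
# split_name and the sample input are illustrative, not part of the original answer.
split_name() {
  item=$1
  item=${item#foo}     # drop the leading "foo"
  y=${item#*bar}       # shortest prefix ending in "bar" removed: what follows "bar"
  x=${item%%bar*}      # longest suffix starting at "bar" removed: what precedes "bar"
  printf '%s %s\n' "$x" "$y"
}

split_name 'foo123bar456'
```

Because %% removes the longest matching suffix, x stops at the first occurrence of bar, which is why the answer requires that regex1 cannot match bar.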

Since you've already done some filtering, you may be able to get away with shell patterns that match more than regex1 and regex2, but, for paths of that particular form, match the same part. If foo and bar can't be expressed with ordinary shell patterns, you can invoke ksh or bash, which support extra patterns that are as powerful as regular expressions: @(alter|native), *(zero-or-more), +(one-or-more), ?(optional), and !(negated). In bash, these patterns need to be enabled with shopt -s extglob. In ksh, they are available natively.
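As a rough illustration of the extended patterns, the sketch below classifies a few made-up names with a bash case statement. The wrapper name classify_names, the pattern foo+([0-9])bar@(x|y) and the sample names are all assumptions for the example. shopt -s extglob sits on its own line so that it takes effect before the pattern is parsed.

```shell
# Classify names with bash extended globs; bash is invoked explicitly,
# as it would be from find -exec bash -c.
classify_names() {
  bash -c '
    shopt -s extglob
    for name do
      case $name in
        # "foo", one or more digits, "bar", then either "x" or "y"
        foo+([0-9])bar@(x|y)) echo "match: $name" ;;
        *) echo "no match: $name" ;;
      esac
    done
  ' bash "$@"
}

classify_names foo12barx foobarx foo7barz
```

Only foo12barx matches here: foobarx has no digits for +([0-9]), and foo7barz ends in neither x nor y.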

In bash, there is a regular expression matching construct which you can use in conditionals: [[ $STRING =~ REGEXP ]]. The regexp is an ERE (like find -regextype posix-egrep). (Zsh has a similar one; ksh has =~ but doesn't expose capture groups.) Capture groups are available via the BASH_REMATCH array.

find … -exec bash -c '
  for item do
    [[ $item =~ foo(regex1)bar(regex2) ]]
    something "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  done
' bash {} +
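For a concrete feel of BASH_REMATCH, here is a small sketch that runs the same loop on literal arguments instead of find results. The function name demo_rematch, the regex foo([0-9]+)bar([a-z]+) and the sample strings are hypothetical; printf stands in for something. Unlike the snippet above, it only acts when the match succeeds, which matters if the names were not already filtered by find -regex.

```shell
# Extract two capture groups from each argument with bash's =~ operator.
demo_rematch() {
  bash -c '
    re="foo([0-9]+)bar([a-z]+)"   # hypothetical stand-ins for regex1 and regex2
    for item do
      if [[ $item =~ $re ]]; then
        printf "%s %s\n" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
      fi
    done
  ' bash "$@"
}

demo_rematch foo42barxyz nomatch
```

Keeping the regex in a variable and expanding it unquoted on the right-hand side of =~ avoids quoting surprises inside the single-quoted script.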

An alternative approach is to print out the result and filter it, then call xargs to invoke the program. Arrange to have the first and second argument as successive items and run xargs -n 2. Use null bytes as separators to avoid xargs's strange quoting format, or use -d '\n' to use strict line-by-line parsing. Recent GNU tools such as sed can work with null bytes instead of newlines to separate records.

find … -print0 |
sed -z 's/^foo\(regex1\)bar\(regex2\)$/\1\x00\2/' |
xargs -n2 -0 something

Another approach is to ditch find and use the recursive globbing feature of ksh93, bash or zsh: **/ matches subdirectories recursively. This isn't possible for complex find expressions involving boolean connectors, but it's enough for most cases. For example, in bash (note that this recurses into symbolic links to directories, like find -L):

shopt -s extglob globstar
for x in **/*bar*; do
  if [[ $x =~ foo(regex1)bar(regex2) ]]; then
    something "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  fi
done
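The globstar loop can be tried out on a throwaway directory tree. The sketch below is only an illustration: the function name demo_globstar, the file names and the regex are made up, and it assumes bash 4 or later (globstar does not exist in older versions).

```shell
# Build a small temporary tree, then walk it with **/ instead of find.
demo_globstar() {
  dir=$(mktemp -d) || return 1
  mkdir -p "$dir/sub"
  touch "$dir/sub/foo12barab" "$dir/other"
  (
    cd "$dir" && bash -c '
      shopt -s globstar
      re="foo([0-9]+)bar([a-z]+)\$"   # hypothetical regex1 and regex2
      for x in **/*bar*; do
        if [[ $x =~ $re ]]; then
          printf "%s %s\n" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
        fi
      done
    '
  )
  rm -rf "$dir"
}

demo_globstar
```

The glob yields paths such as sub/foo12barab, so the regex is left unanchored at the start and only anchored at the end of the path.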

In zsh:

for x in **/*bar*; do
  if [[ $x =~ foo(regex1)bar(regex2) ]]; then
    something $match[1] $match[2]
  fi
done

Related Solutions

Bash Find Command – How to Use Two Bash Commands in -exec

As for the find command, you can also just add more -exec commands in a row:

find . -name "*" -exec chgrp -v new_group '{}' \; -exec chmod -v 770 '{}' \;

Note that the result of this command is equivalent to running

chgrp -v new_group file && chmod -v 770 file

on each file.

All of find's parameters, such as -name, -exec and -size, are actually tests: find evaluates them one by one, as long as the entire chain so far has evaluated to true. So each consecutive -exec command is executed only if the previous ones returned true (i.e. the commands exited with status 0). But find also understands logical operators such as or (-o) and not (!). Therefore, to run a chain of -exec commands regardless of the previous results, one would need to use something like this:

find . -name "*" \( -exec chgrp -v new_group {} \; -o -true \) -exec chmod -v 770 {} \;
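The difference between the plain chain and the grouped one can be checked with a command that always fails. In this sketch, false stands in for a failing chgrp and echo for chmod; the function name demo_chain and the temporary file are hypothetical. Note that -true is an extension (GNU and BSD find support it), not POSIX.

```shell
# Compare a plain -exec chain with one whose first command is made non-fatal.
demo_chain() {
  dir=$(mktemp -d) || return 1
  touch "$dir/a.txt"
  # Plain chain: the first -exec fails, so the second one never runs.
  find "$dir" -type f -exec false \; -exec echo "plain chain ran" \;
  # Grouped chain: "-o -true" makes the group true even when the command fails.
  find "$dir" -type f \( -exec false \; -o -true \) -exec echo "grouped chain ran" \;
  rm -rf "$dir"
}

demo_chain
```

Only the grouped variant prints anything, since its parenthesized group always evaluates to true.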

Bash – can I do : find … -exec this && that

-exec is a predicate that runs a command (not a shell) and evaluates to true or false based on the outcome of the command (zero or non-zero exit status).

So:

find . -iname '*.csv' -exec grep foo {} \; -print

would print the file path if grep finds foo in the file. Instead of -print you can use another -exec predicate or any other predicate:

find . -iname '*.csv' -exec grep foo {} \; -exec echo {} \;

See also the ! and -o find operators for negation and or.
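As a small sketch of negation, the example below lists the files that do not contain foo by putting ! in front of the -exec predicate. The function name demo_negate and the two sample files are made up for the illustration; grep -q is used so that only the file names are printed.

```shell
# Print the names of CSV files in which grep does NOT find "foo".
demo_negate() {
  dir=$(mktemp -d) || return 1
  printf 'foo\n' > "$dir/has.csv"
  printf 'bar\n' > "$dir/no.csv"
  # "!" inverts the truth value of the -exec that follows it.
  find "$dir" -iname '*.csv' ! -exec grep -q foo {} \; -exec basename {} \;
  rm -rf "$dir"
}

demo_negate
```

Here has.csv makes grep succeed, so the negated predicate is false and the chain stops; no.csv makes grep fail, so basename runs for it.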

Alternatively, you can start a shell as:

find . -iname '*.csv' -exec sh -c '
   grep foo "$1" && echo "$1"' sh {} \;

Or to avoid having to start a shell for every file:

find . -iname '*.csv' -exec sh -c '
  for i do
    grep foo "$i" && echo "$i"
  done' sh {} +
