Bash – Remove files by regular expression

bash, filenames, regular expression, shell, wildcards

I want to keep the files whose names match the regex [0-9A-Z]{1,2}_\d{4}_\w+?\.dat (for example, A1_2001_pm23aD.dat and K_1998_12.dat) and remove the rest.

However, the ls and rm commands do not support such regexes. How can I do this?

Best Answer

Using extended globs:

shopt -s extglob
printf '%s\n' !([[:digit:][:upper:]]?([[:digit:][:upper:]])_[[:digit:]][[:digit:]][[:digit:]][[:digit:]]_+([[:alnum:]]).dat)

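This prints every file and directory name in the current directory that does not (!) match the pattern: one [[:digit:][:upper:]] character, optionally followed by a second one, then four [[:digit:]] between underscores, then one or more [[:alnum:]] before the .dat extension. Note that [[:alnum:]] does not include the underscore that \w would match, and that hidden (dot) files are skipped by default. Once the list looks right, replace printf '%s\n' with rm -- to delete those names; without -r, rm refuses to remove directories.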
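To make the removal step concrete, here is a minimal sketch that wraps the same extended glob in a function and deletes only regular files. The function name remove_unmatched and its directory argument are hypothetical, introduced here for illustration; the pattern is the one from the answer.

```shell
#!/usr/bin/env bash
# extglob has to be enabled before bash parses a line containing !( ... ),
# so it is set at top level, ahead of the function definition.
shopt -s extglob

# Hypothetical helper: delete regular files in directory $1 whose names
# do NOT match the keep pattern. Directories are left alone.
remove_unmatched() {
    local dir=$1 f
    for f in "$dir"/!([[:digit:][:upper:]]?([[:digit:][:upper:]])_[[:digit:]][[:digit:]][[:digit:]][[:digit:]]_+([[:alnum:]]).dat); do
        # Skips directories, and also the literal pattern left behind
        # when the glob matches nothing.
        [[ -f $f ]] || continue
        rm -- "$f"
    done
}
```

Calling remove_unmatched . would act on the current directory. It is worth running the printf version first, since the deletion cannot be undone.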
If you want to search recursively:

shopt -s globstar
shopt -s extglob
printf '%s\n' **/!([[:digit:][:upper:]]?([[:digit:][:upper:]])_[[:digit:]][[:digit:]][[:digit:]][[:digit:]]_+([[:alnum:]]).dat)
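One caveat with the recursive form is that the negated pattern also matches directory names, so the output mixes files and directories. The sketch below, which assumes bash 4 or later for globstar, lists only the regular files that would be candidates for removal. The function name list_unmatched_recursive is hypothetical.

```shell
#!/usr/bin/env bash
# Both options are enabled at top level so the !( ... ) syntax is
# accepted when the function below is parsed.
shopt -s extglob globstar

# Hypothetical helper: print regular files under $1 (at any depth)
# whose names do NOT match the keep pattern. Nothing is deleted.
list_unmatched_recursive() {
    local root=$1 f
    for f in "$root"/**/!([[:digit:][:upper:]]?([[:digit:][:upper:]])_[[:digit:]][[:digit:]][[:digit:]][[:digit:]]_+([[:alnum:]]).dat); do
        if [[ -f $f ]]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Used as a dry run, list_unmatched_recursive . shows what a later removal would touch.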

Alternatively, with GNU find, you can use a regex:

find . -regextype egrep ! -regex '.*/[[:digit:][:upper:]]{1,2}_[[:digit:]]{4}_[[:alnum:]]+\.dat$'
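The find command above only prints, and its output includes directories (including . itself). A possible variant that actually deletes is sketched below, assuming GNU find: -type f restricts the match to regular files, and -print -delete reports each path as it is removed. The function name remove_unmatched_recursive is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recursively delete regular files under $1 whose
# names do NOT match the keep pattern. Assumes GNU find (-regextype).
remove_unmatched_recursive() {
    # -regex is matched against the whole path, hence the leading '.*/'.
    find "$1" -regextype egrep -type f \
        ! -regex '.*/[[:digit:][:upper:]]{1,2}_[[:digit:]]{4}_[[:alnum:]]+\.dat' \
        -print -delete
}
```

Dropping -delete turns this back into a dry run, which is a sensible first step before removing anything.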