summary:
- A given system has lots of text files with names ~= [type of file].[8-digit date].
- To search these files, I like (and wanna keep) using this idiom:
find /path/ -name 'file.nnnn*' -print | xargs -e fgrep -nH -e 'text I seek'
(where nnnn == 4-digit year)
- … and in the past decade I also made find glob across years like
find /path/ -name 'file.201[89]*' -print | xargs ...
- … but now I can't make find glob across 2019 and 2020 with
find /path/ -name 'file.20{19,20}*' -print | xargs ...
- … although that "curly-brace globbing" (correct term?) works fine with ls!
Is there a {concise, elegant} way to tell find what I want, without instead doing post-find cleanup (i.e., what I'm doing now) à la
find /path/ -name 'file.*' -print | grep -e '\.2019\|\.2020' | xargs ...
? FWIW, I'd prefer a solution that works with xargs.
details:
I work on a system with lotsa conventions which long precede me and which I cannot change. One of those is, it has lotsa text files with names ~= [type of file].[8-digit date], e.g., woohoo_log.20191230. When searching within these files for some given text, I typically (as in, almost always) use the find ... grep idiom (often using Emacs' M-x find-grep). (FWIW, this is a Linux system with
$ find --version
find (GNU findutils) 4.4.2
...
$ bash --version
GNU bash, version 4.3.30(1)-release (x86_64-pc-linux-gnu)
and I currently lack status to change either of those, if I wanted to.) I often kinda know the year range of the matter-at-hand, and so will try to constrain what find returns (to speed processing), with (e.g.)
find /path/ -type f -name 'file.nnnn*' -print | xargs -e fgrep -nH -e 'text I seek'
where nnnn == 4-digit year. This WFM, and I like (and wanna keep) using the above idiom … especially since I can also use it to search across years like
find /path/ -type f -name 'file.201[89]*' -print | xargs ...
But this new decade seems to be breaking that idiom, and (to me at least) most oddly. (I wasn't here when the last decade changed.) Suppose I choose text that I know is in a file from 2019 && a file from 2020 (as in, I can open the files and see the text). If I currently do
find /path/ -name 'file.20{19,20}*' -print | xargs ...
grep unexpectedly/annoyingly finishes with no matches found, because
$ find /path/ -name 'file.20{19,20}*' -print | wc -l
0
But if I do
find /path/ -type f -name 'file.*' -print | grep -e '\.2019\|\.2020' | xargs ...
grep returns the expected results. Which is nice, but … ummm … that's just ugly, esp since this "curly-brace glob" (please correct me if this usage is incorrect or otherwise deprecated) works from ls! I.e., this shows me the files in the relevant year range (i.e., 2019..2020)
ls -al /path/file.20{19,20}*
Hence I'd like to know:
- Am I just not giving find the right glob for this usecase? What do I need to tell find to make it do what ls is capably/correctly doing?
- Is this a problem with xargs? If so, I can live with a find ... -exec solution, but … my brain works better with xargs, so I'd prefer to stay with that if possible. (Call me feebleminded, but -exec's syntax makes my brain hurt.)
Best Answer
With zsh, you could use recursive globbing and its <x-y> glob operator, which matches on ranges of decimal numbers, in a pattern along the lines of /path/**/file.<2019-2020>*(D-.)

(the (D) is to also look into hidden (Dot) dirs as find would; presumably you can omit it if you don't want them. The -. is to restrict to regular files (.) identified after symlink resolution (-)).

Note that it would also match on file.00002020 (as that's a decimal number between 2019 and 2020) and, like in your approach, on file.20201234, as it's file.2020, which matches file.<2019-2020>, followed by 1234, which matches *.

The standard (POSIX sh and utilities) way to do it would be with find alone, using -exec ... {} + to run grep (where adding /dev/null to grep's arguments gets you the same effect as GNU grep's -H, forcing the file name to be displayed).

Note that the output of find -print is not compatible with the expected input format of xargs. With GNU utilities, you can use find -print0 and xargs -r0, but that's not needed as find -exec ... {} + has the same behaviour, is shorter and more portable.
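
A minimal sketch of the find-only approach, run against a throwaway directory so it can be tried safely. The file names and layout are hypothetical, and the exact find expression is an assumption about what is intended here: one -name test per year, joined with -o. It also shows why the brace pattern finds nothing: in bash, an unquoted file.20{19,20}* is rewritten by the shell into file.2019* file.2020* before ls ever runs, whereas the quoted pattern reaches find unchanged, and find's -name matching has no brace syntax.

```shell
#!/bin/sh
# Scratch directory with hypothetical log files; nothing here touches /path/.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir "$tmp/sub"
printf 'text I seek\n' > "$tmp/file.20191230"
printf 'text I seek\n' > "$tmp/sub/file.20200102"
printf 'text I seek\n' > "$tmp/file.20181231"   # outside the wanted range

# The pattern is quoted, so the shell leaves the braces alone; find then
# looks for a literal "{" in the file name and matches nothing.
brace_count=$(find "$tmp" -type f -name 'file.20{19,20}*' -print | wc -l)

# One -name test per year, joined with -o (OR). The parentheses are quoted
# so that the shell passes them to find instead of interpreting them.
or_count=$(find "$tmp" -type f '(' -name 'file.2019*' -o -name 'file.2020*' ')' -print | wc -l)

# Search with -exec ... {} +, which batches file names much as xargs does.
# /dev/null ensures grep always gets more than one file, so it prints names.
exec_hits=$(find "$tmp" -type f '(' -name 'file.2019*' -o -name 'file.2020*' ')' \
    -exec grep -Fn -e 'text I seek' /dev/null {} + | sort)

# GNU-specific variant that keeps xargs in the pipeline, using
# NUL-delimited file names (-print0 / -0) and skipping empty input (-r).
xargs_hits=$(find "$tmp" -type f '(' -name 'file.2019*' -o -name 'file.2020*' ')' -print0 |
    xargs -r0 grep -FnH -e 'text I seek' | sort)

echo "brace pattern matched:  $brace_count file(s)"
echo "OR-ed patterns matched: $or_count file(s)"
printf '%s\n' "$exec_hits"
```

With the hypothetical files above, the brace pattern matches 0 files and the OR-ed tests match 2 (the 2018 file is skipped); the -exec and xargs variants report the same two matching lines. grep -F is used in place of fgrep, which does the same thing.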