Given the following files:
data/A/a.txt
data/B/b.pdf
...
data/P/whatever.log
...
data/Z/z.jpg
I would like to delete all files in the data/A/, data/B/, …, data/Z/ directories except those files that are situated under one of the directories listed in the file data/dont_clean.txt. For example, if we have data/P listed in data/dont_clean.txt, then nothing should be touched under data/P/, etc.
Something like:
find data/ -mindepth 2 -maxdepth 2 -type f -not -path {listed in data/dont_clean} -delete
Of course it is not a valid command.
I have also tried variants of
find data/ -mindepth 2 -maxdepth 2 -type f -exec grep data/dont_clean.txt '{}' \;
but I either ended up with an invalid command or could not understand the output I got.
I am using bash on Ubuntu 12.10.
Best Answer
This code is only roughly tested, but it might lay out an approach for you to take. Assume you have a file, ignore.txt, like this:
Sample data
And I had sample directories with files in them like this:
Resulting in this:
Example run
Now if we run this command against this tree:
We can see that we're only getting back the files that are in directories not listed in ignore.txt.
So we can add rm to the end to remove the non-excluded files.
Checking, we can see that it worked:
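As a rough sketch of the whole sequence (list, delete, check), the pipeline below uses the layout from the question rather than the sample tree above. It assumes GNU find, grep and xargs, file names without embedded newlines, and a throwaway directory held in the hypothetical variable demo; none of these names come from the answer itself.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree in a temporary directory; names follow the question.
demo=$(mktemp -d)
mkdir -p "$demo/data/A" "$demo/data/B" "$demo/data/P"
touch "$demo/data/A/a.txt" "$demo/data/B/b.pdf" "$demo/data/P/whatever.log"
printf '%s\n' 'data/P' > "$demo/data/dont_clean.txt"

cd "$demo" || exit 1

# Step 1: list files exactly one level below data/ and drop every path that
# contains a line of dont_clean.txt as a fixed substring (-F, -f, -v).
# Only paths under data/A and data/B are printed.
find data/ -mindepth 2 -maxdepth 2 -type f | grep -vFf data/dont_clean.txt

# Step 2: same pipeline, now handing the surviving paths to rm -f.
# -r skips rm when the list is empty; -d '\n' splits on newlines only
# (both are GNU xargs extensions).
find data/ -mindepth 2 -maxdepth 2 -type f \
    | grep -vFf data/dont_clean.txt \
    | xargs -r -d '\n' rm -f

# Step 3: check what is left.
find data/ -type f
```

Running step 1 on its own first is a cheap dry run: nothing is deleted until xargs rm -f is appended.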
Problems to be worked out
One big problem with this approach is that the strings in the ignore.txt file might match other portions of the directory structure, so some care needs to be taken to make sure that the strings in this file are unique in the way that you expect.
Anchors could be put around the strings so that they only match at the beginning or the end of the path, which protects against accidental matches.
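One way to sidestep the substring problem, sketched here as an alternative rather than as the approach above, is to compare each directory name against the list as a whole line. The sketch assumes the question's layout, where data/dont_clean.txt holds entries such as data/P with no trailing slash; the demo variable and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical demo tree: data/P is protected, data/P2 is not, although
# "data/P" is a substring of "data/P2".
demo=$(mktemp -d)
mkdir -p "$demo/data/P" "$demo/data/P2"
touch "$demo/data/P/keep.log" "$demo/data/P2/remove.log"
printf '%s\n' 'data/P' > "$demo/data/dont_clean.txt"

cd "$demo" || exit 1

for dir in data/*/; do
    dir=${dir%/}    # strip the trailing slash, e.g. data/P
    # -F: fixed string, -x: must match the whole line, -q: no output.
    # The test succeeds only when the directory is listed exactly.
    if grep -Fxq -- "$dir" data/dont_clean.txt; then
        continue
    fi
    # Delete regular files directly inside the unprotected directory.
    find "$dir" -mindepth 1 -maxdepth 1 -type f -delete
done
```

With a plain substring filter, data/P2/remove.log would have been spared because its path contains data/P; the whole-line comparison removes it and leaves data/P alone.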
Details
The above commands are doing the following:
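find all the files under dirs
filter out every path that matches an entry in ignore.txt
pass each remaining file through xargs to the rm -f command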