Debian – Deleting old files is slow and ‘kills’ IO performance

debian ext4 linux

I'm using find to prune old files, and there are a lot of them. It takes anywhere from minutes to hours to run, and other server processes suffer I/O performance problems while it does.

find -mtime +100 -delete -print

I tried ionice but it didn't appear to help.

ionice -c 3 

What can one do to 1. speed up the find operation and 2. avoid impacting other processes?
The FS is ext4. Is ext4 just bad at this kind of workload?
Kernel is 3.16
Storage is 2x 1TB 7200rpm HDDs in RAID 1.
There's 93GB in 610228 files now, so 152KB/file on average.

Maybe I just shouldn't store so many files in a single directory?

Best Answer

The find command you posted removes each file individually as soon as it finds it, which is not ideal in terms of performance.

To improve this, you can use find's -exec option to hand the matching files to an rm command:

find -mtime +100 -exec rm {} +

It is very important to use the + terminator rather than the alternative \;. With +, find passes as many file names as fit on one command line to each rm invocation, so it starts as few rm processes as possible. With \;, find runs a separate rm for every file, so you would have the same problem.
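To see the difference between the two terminators without deleting anything real, a small sketch like the following substitutes echo for rm and counts how many times the command is started. The scratch directory and the file names a to e are made up for illustration.

```shell
# Scratch directory with five dummy files (names are illustrative only).
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c" "$demo_dir/d" "$demo_dir/e"

# echo stands in for rm; every invocation prints exactly one line,
# so the line count equals the number of processes find started.
batched=$(find "$demo_dir" -type f -exec echo {} + | wc -l | tr -d ' ')
per_file=$(find "$demo_dir" -type f -exec echo {} \; | wc -l | tr -d ' ')

printf 'invocations with +: %s\n' "$batched"
printf 'invocations with ;: %s\n' "$per_file"

# Clean up the scratch directory.
rm -r "$demo_dir"
```

With five files this reports one invocation for + and five for \;. With hundreds of thousands of files the gap grows accordingly.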

For better results, combine it with the ionice command you mentioned. If that does not noticeably improve system performance, the job is most likely consuming some other resource more heavily than I/O, such as CPU. In that case, use the renice command to lower the CPU priority of the process.

I would use the following:

ionice -c 3 find -mtime +100 -exec rm {} +

Now, in another shell, you need to find the PID of the find command: ps -ef | grep find

And finally run the renice command: renice +19 -p <PID_find_command>
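As an alternative to looking up the PID afterwards, the CPU priority can be lowered at launch by wrapping the command in nice. The following is only a sketch: prune_old is a hypothetical helper name, the directory and age are passed as arguments, and -type f is an addition not present in the commands above, so that rm is never handed a directory. It falls back to nice alone if ionice is missing or refuses the idle class.

```shell
# Hypothetical helper: prune_old DIR DAYS
# Deletes regular files under DIR that were last modified more than DAYS days ago.
prune_old() {
    # Probe whether ionice exists and accepts the idle class on this system.
    if ionice -c 3 true 2>/dev/null; then
        # Lowest CPU priority plus idle I/O class from the very start.
        nice -n 19 ionice -c 3 find "$1" -type f -mtime +"$2" -exec rm {} +
    else
        # Fallback: lowest CPU priority only.
        nice -n 19 find "$1" -type f -mtime +"$2" -exec rm {} +
    fi
}
```

It would be called as, for example, prune_old /path/to/files 100, where /path/to/files is a placeholder for the directory being pruned.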
