I want to monitor a directory (not recursively, just the one directory) for newly created files and append those files to a single big file as they are written.
The number of files being written is huge and could reach as many as 50,000.
Using inotifywait, I am monitoring the directory like this:
inotifywait -m -e create ~/folder | awk '($2=="CREATE"){print $3}' > ~/output.file
So I am storing the names of newly created files in ~/output.file and then using a for loop:
for FILE in `cat ~/output.file`
do
cat $FILE >> ~/test.out
done
It works fine if files are written (created) in ~/folder at a rate of about 1 file per second.
But the requirement is much larger: files are created at a very high rate, around 500 files per minute (or even more).
I checked the number of files in ~/folder after the process is complete, but it does not match the inotifywait output. The difference varies, but is around 10–15 files.
Also, the loop
for FILE in `cat ~/output.file`
do
done
doesn't process all the files in ~/output.file as they are being written.
Can anyone please suggest an elegant solution to this problem?
Best Answer
Is there a particular reason you are piping the output through awk and redirecting it to a file instead of using inotifywait options like --format and --outfile?
If I run:
then open another tab, cd to ~/folder and run:
(so I get much more than 500 files per minute) everything works fine and output.file contains all the 50000 file names that I just created.
Once the process has finished writing the files to disk you can append them to your test.out (assuming you are always in ~/folder):
Or use read if you want to process files as they are created. So, while in ~/folder you could run:
Note that in inotifywait stable, -m and -t cannot be used together. Support for using both switches has recently been added, so if you build inotify-tools from git you should be able to use monitor with timeout (to specify how long it has to wait for an appropriate event to occur before exiting). I've tested the git version on my system (exit if no create events occur within 2 seconds) and it works fine: