I have a generator producing files, where each file's name sorts alphabetically after the previous one. At first I looped with for file in /path/to/files*; do ..., but I soon realized that the glob expands only once, before the loop starts, so any new files created while the loop is running are never processed.
My current way of doing this is quite ugly:
while :; do
    doneFileCount=$(wc -l < /tmp/results.csv)   # lines already written out
    i=0
    for file in *; do
        if (( i < doneFileCount )); then
            i=$((i+1))                          # already processed on an earlier pass; skip
            continue
        else
            process-file "$file"                # prints a single line to stdout
            i=$((i+1))
        fi
    done | tee -a /tmp/results.csv
done
Is there any simple way to loop over an ever-growing list of files, without the hack described above?
Best Answer
I think the usual way would be to have new files appear in one directory and to rename or move them to another directory after processing, so that they don't match the same glob again. Something like this:
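A minimal sketch of that approach. The directory names, the sample input files, and the process_file stub (standing in for the questioner's process-file command) are assumptions for illustration:

```shell
# Process everything in incoming/, then move each file to done/ so the
# same glob never matches it again on the next pass.
mkdir -p incoming done
printf 'a\n' > incoming/file1    # sample input for the demo
printf 'b\n' > incoming/file2

process_file() { printf 'processed %s\n' "$1"; }   # stand-in for process-file

for file in incoming/*; do
    [ -e "$file" ] || continue               # glob matched nothing; skip
    process_file "$file" >> results.csv      # one output line per file
    mv -- "$file" done/                      # won't match incoming/* again
done
```

To keep picking up newly generated files, wrap the for loop in something like while :; do ...; sleep 1; done; each pass only sees files that haven't been moved out yet.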
Or similarly with a changing file extension:
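A sketch of the rename-by-extension variant: processed files get a new suffix in place, so the original glob stops matching them. The directory, extensions, sample files, and process_file stub are assumptions for illustration:

```shell
# Rename work/*.csv to *.done after processing, so the *.csv glob only
# ever matches unprocessed files.
mkdir -p work
printf '1\n' > work/a.csv
printf '2\n' > work/b.csv

process_file() { printf 'processed %s\n' "$1"; }   # stand-in for process-file

for file in work/*.csv; do
    [ -e "$file" ] || continue               # glob matched nothing; skip
    process_file "$file" >> results.csv
    mv -- "$file" "${file%.csv}.done"        # work/a.csv -> work/a.done
done
```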
On Linux, you could also use inotifywait to get notified of new files.

In either case, watch out for files that are still being written. A large file created in place does not appear atomically, so your script might start processing it when it is only halfway written.
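A small inotifywait demo (inotifywait is part of the inotify-tools package; the directory name and the background writer are assumptions for the demo, and the script skips gracefully if the tool isn't installed):

```shell
# Wait for one file to be closed after writing, then report it.
command -v inotifywait >/dev/null 2>&1 || { echo 'inotifywait not installed'; exit 0; }

mkdir -p incoming
( sleep 1; printf 'data\n' > incoming/new.txt ) &   # simulate the file generator

# Block until a file in incoming/ is closed after writing (give up after 10 s).
# --format '%w%f' prints the watched directory plus the file name.
file=$(inotifywait -q -t 10 -e close_write --format '%w%f' incoming)
echo "finished writing: $file"
wait
```

For continuous processing you would instead run inotifywait -m -e close_write -e moved_to --format '%w%f' incoming | while IFS= read -r file; do process-file "$file"; done, which keeps emitting one line per event until interrupted.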
The inotify close_write event above fires when the writing process closes the file (but it also catches modified files), while the create event would see the file as soon as it is created (when it might still be open for writing). moved_to simply catches files that are moved into the watched directory.