Wget files by pattern only from specified directories recursively

wget

I need to download, on an hourly basis (sometimes more frequently), files that are written in 24-hour segments. The files I am interested in are in specific subdirectories, which I am trying to specify with the -I list option, but for some reason this doesn't work.

If I don't specify directories, the files I need download fine with the -A acclist option, but I end up with lots of empty directories, which wget creates because they exist on the host.

My current command line reads:

wget -np -nH --cut-dirs=X -c -N -r -l 0 \
     -I /dir1,/dir2,...,/some_dir -A acclist \
     http://hostname/X_sub_directories/

How do I download only the files I want and create only the directory hierarchy for those files?

Best Answer

You could add a post-processing command that removes the empty directories once wget has finished:

wget -np -nH --cut-dirs=X -c -N -r -l 0 \
     -I /dir1,/dir2,...,/some_dir -A acclist \
     http://hostname/X_sub_directories/    \
     &&  find . -depth -type d -empty -exec rmdir {} \;
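
To see what the cleanup step does without touching a real download, here is a minimal sketch run against a throwaway tree. The directory and file names (dir1, dir2, dir3, file.dat) are made up for illustration, and it assumes a find that supports -empty (GNU and BSD find do; it is not in POSIX) and a system that provides mktemp -d.

```shell
#!/bin/sh
# Hypothetical scratch tree standing in for a wget download directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/dir1/sub" "$workdir/dir2/empty/nested" "$workdir/dir3"
echo "data" > "$workdir/dir1/sub/file.dat"

# -depth visits a directory's children before the directory itself, so a
# chain of nested empty directories is removed bottom-up in one pass.
find "$workdir" -depth -type d -empty -exec rmdir {} \;

# List what is left: only directories that still hold a file survive.
find "$workdir" -type d
```

Because rmdir refuses to delete a directory that still has content, this cannot remove downloaded files; at worst it leaves a directory in place.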