Question about wget, subfolder, and index.html.
Let's say I am inside the "travels/" folder on "website.com": "website.com/travels/".
The "travels/" folder contains a lot of files and other (sub)folders: "website.com/travels/list.doc", "website.com/travels/cover.png", "website.com/travels/[1990] America/", "website.com/travels/[1994] Japan/", and so on…
How can I download only the ".mov" and ".jpg" files that reside in the subfolders? I don't want to pick up files from "travels/" itself (e.g. not "website.com/travels/list.doc").
I found a wget command (on Unix & Linux Stack Exchange; I don't remember which discussion) that downloads only the "index.html" of each subfolder, not the other contents. Why does it download only the index files?
Best Answer
This command will download only images and movies from a given website:
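A minimal sketch of such a command, assuming the URL from the question (website.com/travels/) as a placeholder. `fetch_by_ext` is a hypothetical helper name, not part of wget; the options themselves are standard wget flags, though the exact flags in the original answer may have differed.

```shell
# fetch_by_ext is a hypothetical helper name, not part of wget.
# Usage: fetch_by_ext EXTENSIONS URL   (EXTENSIONS is comma-separated, e.g. jpg,mov)
fetch_by_ext() {
    # --recursive       follow links found below URL
    # --no-parent       never ascend above URL
    # --no-directories  save every file into the current directory
    # --accept          keep only files whose names end in the listed suffixes
    # Setting WGET=echo prints the arguments instead of running wget.
    "${WGET:-wget}" --recursive --no-parent --no-directories --accept "$1" "$2"
}
```

It would be called as `fetch_by_ext jpg,mov "http://website.com/travels/"`. Note that `--accept` filters by file name only, so matching files that sit directly in "travels/" would be picked up as well.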
According to the wget man page:
If you would like to download from subfolders, use recursive retrieval (-r) together with the flag --no-parent, which keeps wget from ascending above the starting folder.

Regarding the index.html page: it will be excluded once the flag -A is included in the wget command, because this flag forces wget to keep only specific types of files. If html is not included in the list of accepted files (i.e. the -A flag), then it will not be kept: wget fetches it only to follow its links, then deletes it and reports in the terminal that it is removing the file because it should be rejected.

wget can download specific types of files (e.g. jpg, jpeg, png, mov, avi, mpeg, etc.) when those files exist under the URL provided to wget. For example, let's say we would like to download .zip and .chd files from this website.
In this link there are folders and .zip files (scroll to the end). Now, let's say we would like to run this command:
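A minimal sketch of such a command, assuming a placeholder URL (http://example.com/files/ stands in for the real site, which is not reproduced here). The options are standard wget flags, but the exact command may have differed. The script only prints the assembled command so that it can be reviewed first.

```shell
# Placeholder URL; substitute the address of the site being mirrored.
URL="http://example.com/files/"

# Recurse below URL, never ascend to the parent folder,
# and keep only files ending in .zip or .chd.
set -- wget --recursive --no-parent --accept 'zip,chd' "$URL"

# Print the assembled command for review.
printf '%s\n' "$*"
```

To actually start the download, replace the `printf` line with `"$@"`.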
This command will download the .zip files and, at the same time, create empty folders for the .chd files.
In order to download the .chd files, we would need to extract the names of the empty folders and convert those folder names to their actual URLs. Then put all the URLs of interest in a text file, file.txt, and finally feed this text file to wget (with its -i option), which will then find all the .chd files.
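The steps above can be sketched as follows. This is only an illustration under stated assumptions: `empty_dirs_to_urls` is a hypothetical helper name, the base URL is a placeholder, and the `-mindepth` and `-empty` tests are extensions available in GNU and BSD find.

```shell
# empty_dirs_to_urls is a hypothetical helper name.
# Usage: empty_dirs_to_urls BASE_URL ROOT_DIR
# Prints one URL per empty folder found below ROOT_DIR.
empty_dirs_to_urls() {
    base_url=$1
    root_dir=$2
    (
        cd "$root_dir" || exit 1
        # -mindepth 1 skips ROOT_DIR itself; -empty keeps folders with no content.
        find . -mindepth 1 -type d -empty | sort | while IFS= read -r dir; do
            # Strip the leading "./" and rebuild the folder's URL.
            printf '%s/%s/\n' "$base_url" "${dir#./}"
        done
    )
}

# Example (not run here; both paths are placeholders):
#   empty_dirs_to_urls "http://example.com/files" ./example.com/files > file.txt
#   wget --recursive --no-parent --accept chd --input-file=file.txt
```

Folder names containing spaces or brackets would still need to be URL-encoded before wget can use them; that step is left out of this sketch.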