Is there a way to mimic the "Save As" function in a browser with wget?
When I save a webpage in a browser, I get a folder with the assets (images, JS, CSS) and the index file, which has the page name:
Nov 28 reddit: the front page of the internet_files
Nov 28 reddit: the front page of the internet.html
But no matter what, when I use wget, I get something like this:
Nov 28 a.thumbs.redditmedia.com
Nov 28 b.thumbs.redditmedia.com
Nov 28 m.reddit.com
Nov 28 out.reddit.com
Nov 28 reddit.com
Nov 28 www.reddit.com
Nov 28 www.redditstatic.com
I tried using these:
wget -E -H -k -K -p https://reddit.com
wget -r -x --mirror https://reddit.com
and came up with this:
wget -E -H -k -p -e robots=off https://www.reddit.com
but all of them either created several folders or didn't download everything needed to view the page offline.
How would I set this up?
Best Answer
You can't, at least not with wget alone. It can download all linked resources in a single run, but because it crawls pages rather than interpreting them (and isn't bound to HTTP either), it saves each host's files into its own folder.
Your framing is also narrower than it needs to be: there are web browsers that can save a page as a single MHT file/archive, which is even a standard; see https://en.wikipedia.org/wiki/MHTML
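That said, you can get wget reasonably close to a single-folder save by flattening its output. A sketch, assuming GNU wget; the target directory name `reddit_files` is purely illustrative:

```shell
# Fetch the page plus its requisites into one flat folder,
# roughly approximating a browser's "Save As":
#   -p            page requisites (images, CSS, JS)
#   -k            rewrite links for offline viewing
#   -E            append .html to HTML files where needed
#   -H            span hosts (assets often live on CDNs)
#   -nd           no per-host directory tree; put everything in one folder
#   -P            prefix (target) directory
#   -e robots=off ignore robots.txt, as in the question's attempt
wget -E -H -k -p -nd -P reddit_files -e robots=off https://www.reddit.com
```

This still won't reproduce the browser's exact "page.html plus page_files/" layout, and with `-nd` files from different hosts can collide on the same name, but it avoids the per-host folder sprawl shown in the question.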