I want to archive a message board. I do that by using wget with the parameters --page-requisites, --span-hosts, --convert-links and --no-clobber.
The problem is that using --convert-links disables --no-clobber: for every thread page, wget re-downloads the site skins, scripts and icons (for the purpose of keeping them updated).

Is there a way to prevent wget from re-downloading files that already exist locally, pointing links to their local copies, and only downloading files that aren't already in the filesystem?
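For reference, the invocation described above might look like the following. This is a sketch, not the exact command from the question: the board URL is a placeholder, and --recursive is assumed since archiving a whole board implies a recursive crawl.

```shell
# Recursively mirror the board (hypothetical URL), pulling in page
# requisites from other hosts and rewriting links for local viewing.
wget --recursive \
     --page-requisites \
     --span-hosts \
     --convert-links \
     --no-clobber \
     http://forum.example.com/
```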
Best Answer
I believe if you include the switch -N, it will force wget to make use of timestamps. With this switch, wget will only download files that it does not already have locally (or whose copy on the server is newer than the local one).
Download a file, robots.txt, that doesn't already exist locally: wget retrieves it. Trying it a second time with robots.txt present locally, wget does not retrieve the file again, because the server's copy is no newer than the local one.
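The two runs can be sketched as follows, assuming a hypothetical host serving robots.txt; the exact output wording varies by wget version:

```shell
# First run: robots.txt does not exist locally, so wget downloads it
# and sets the local file's timestamp to match the server's.
wget -N http://www.example.com/robots.txt

# Second run: the server file is no newer than the local copy,
# so wget reports that it is not retrieving the file.
wget -N http://www.example.com/robots.txt
```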