I am trying to fetch a specific webpage with wget, using this command in a bash script:
wget --no-cookies --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" -O $2/content.html $1
The result is that I get the site's bot-check page, I think because wget is reusing the existing connection. The command worked until I ran it repeatedly while testing; now my server gets redirected to the site's bot test, so the downloaded page is unusable.
--2017-12-12 19:16:42-- https://www.kayak.co.uk/h/bots/human-redirect.vtl?url=%2Fflights%2FDUB-LAX%2F2018-06-04%2F2018-06-25%2F2adults%3Fsort%3Dbestflight_a
Reusing existing connection to [www.kayak.co.uk]:443.
HTTP request sent, awaiting response... 200 OK
My question is: is there any way to stop wget from reusing the existing connection and make it reconnect to the site for each download?
Best Answer
I know this is an old issue, but perhaps this will help others who come across it as I have.
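To disable the "keep-alive" feature, use the --no-http-keep-alive argument. According to the man page, it turns off the "keep-alive" feature for HTTP downloads; normally wget asks the server to keep the connection open so that several documents from the same server are transferred over the same TCP connection.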
Using this argument is typically needed in cases where a new, clean request is necessary. Although not strictly related, the --no-cache and --no-cookies arguments might also be relevant in cases where the --no-http-keep-alive argument is used.
So the OP's command would probably be: