When I download https://www.wired.com/category/security/ using either wget or curl, the result is gibberish/encrypted.
Is it possible (and if so what is the correct way) to save that web page (unencrypted / plain HTML) from the command line?
curl wget
Best Answer
Executive summary:
It seems the downloaded file is compressed, so you need to decompress it before reading it.
Detailed answer
Running wget on the URL above results in a downloaded index.html file.
Executing the file command on the downloaded file shows that it is compressed data rather than HTML.
Renaming the file and decompressing it turns it into an HTML document.
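The check-rename-decompress steps can be sketched roughly as follows. This is a minimal sketch, not the exact commands from the answer: it assumes the compression is gzip, and the helper name decompress_if_gzip is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): if the downloaded page is
# gzip data, rename it to *.gz and decompress it back to its original name.
decompress_if_gzip() {
    page="$1"
    # gzip -t tests integrity without writing output; it fails on non-gzip input.
    # Reading from stdin avoids any dependence on the file's suffix.
    if gzip -t < "$page" 2>/dev/null; then
        mv "$page" "$page.gz"   # gunzip expects a .gz suffix
        gunzip "$page.gz"       # recreates "$page" as plain text
        echo "decompressed $page"
    else
        echo "$page is not gzip-compressed; left unchanged"
    fi
}

# Example usage (assumes wget saved the page as index.html):
# decompress_if_gzip index.html
```

If the server used a different encoding (for example Brotli), gzip -t fails and the file is left untouched, so the output of the file command is still the first thing to check.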
Extra Info - why wget downloaded a compressed file?
As explained in How To Optimize Your Site With GZIP Compression:
Instead of transferring a large text file as-is, modern HTTP servers and clients use compressed HTTP responses, which reduce the size of the transferred files.
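Given that, one way to get plain HTML in a single step is to let the client decode the response itself. The sketch below uses curl's --compressed option, which requests a compressed response and decompresses it on arrival. The helper name fetch_page and the output filename are assumptions for illustration, and this has not been verified against the site in the question.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): save a page as plain HTML
# even when the server sends a compressed response.
fetch_page() {
    url="$1"
    out="$2"
    # --compressed: send Accept-Encoding and decode the response body
    # -sS: hide the progress meter but still report errors
    # -L: follow redirects
    curl --compressed -sS -L -o "$out" "$url"
}

# Example usage:
# fetch_page https://www.wired.com/category/security/ security.html
```

Running the file command on the saved output afterwards is a quick way to confirm it is now an HTML document.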