Download an article with cURL given a dynamic download link

curl, download, pdf

I'm trying to download this published journal article using cURL. The main page is open access, so there should be no problem for anyone viewing or downloading the article. I then extract the PDF URL, which keeps changing.

Then I try to download the PDF:

curl -L -o test.pdf "http://www.sciencedirect.com/science/article/pii/S0378426612000817/pdfft?md5=6a85f34def09dd5cfb1d1b8feded0d51&pid=1-s2.0-S0378426612000817-main.pdf"

but every time it redirects me to the main page, which is then saved as an HTML page named "test.pdf".

Best Answer

curl handles redirects differently from wget by default: it does not follow them unless told to. The direct download URL involves some redirects, and it also requires the HTTP Referer header to be set correctly after the first redirect (otherwise, you will get an HTML page).

First, you need to enable location redirects in curl with -L, and then enable curl's automatic handling of the referer header with --referer ";auto", that is,

curl -L --referer ";auto" -o test.pdf URL-for-direct-download
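Since a failed attempt still produces a file named test.pdf, it can help to check that the result really is a PDF. The following is only a rough sketch: `is_pdf` and `fetch_pdf` are hypothetical helper names, not part of curl, and the check assumes a valid PDF starts with the bytes `%PDF` and that `head -c` is available on your system.

```shell
#!/bin/sh

# Hypothetical helper: succeeds only if the file starts with the
# PDF magic bytes "%PDF" (an HTML landing page will not).
is_pdf() {
    [ "$(head -c 4 "$1" 2>/dev/null)" = "%PDF" ]
}

# Hypothetical wrapper around the curl command above.
# $1 = direct-download URL, $2 = output file name.
fetch_pdf() {
    # -L follows redirects; --referer ";auto" sets the Referer header
    # automatically on each redirect; -sS hides progress but shows errors.
    curl -sS -L --referer ";auto" -o "$2" "$1" || return 1
    if ! is_pdf "$2"; then
        echo "warning: $2 does not look like a PDF (probably an HTML page)" >&2
        return 1
    fi
}
```

With these defined, `fetch_pdf "URL-for-direct-download" test.pdf` returns a non-zero status when the server sent back something other than a PDF, so a script can retry with a freshly extracted link.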