Wget Directory Options

I have read the Wget manual, but unfortunately it does not seem to address my issue, so I would be most grateful if someone could offer me a bit of assistance.

We have a website, (say) website.com, which links directly to (say) website.com/1/, website.com/2/, … etc.

Now each page website.com/r/, where r is an integer, links to a number of PDF documents. Rather than being located at website.com/r/doc-i.pdf – which would be convenient – they are all located at website.com/files/doc-i.pdf.

Thus, when I run the command wget -r -l 2 -A pdf website.com, I end up with one big folder named "files" that contains all the PDF documents.

I would much prefer, however, that they be organised into different folders named 1, 2, …, n, that correspond to the page from which they were downloaded. Since I will be downloading in total around 10,000 pdf files, I would rather not have to do this manually.

So how do I tell Wget to organise the files, not by the website's directory structure, but by the route it took to reach each file?

I hope my explanation is clear, and that this is not too difficult to achieve.

Best Answer
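
As far as I know, Wget has no option that sorts downloads by the page that linked to them, so one possible workaround is to run Wget once per numbered page and give each run its own output folder. The sketch below assumes the pages are numbered 1 to n with no gaps; BASE_URL is the placeholder from the question, and LAST_PAGE, DRY_RUN and fetch_page are hypothetical names introduced here. It relies on the standard options -P (directory prefix) and -nd (do not recreate the site's directories). It deliberately omits -np, because the PDFs live under /files/ rather than under /r/.

```shell
#!/bin/sh
# Sketch only: one wget run per numbered page, each into its own folder.
# BASE_URL is the placeholder site from the question; LAST_PAGE is a
# hypothetical value for n. Both can be overridden from the environment.
BASE_URL="${BASE_URL:-http://website.com}"
LAST_PAGE="${LAST_PAGE:-3}"
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

fetch_page() {
    page="$1"
    # -r -l 1 : follow links one level deep from page r
    # -nd     : do not recreate the site's "files/" directory
    # -A pdf  : keep only files ending in .pdf
    # -P DIR  : save everything under the folder named after the page
    set -- wget -r -l 1 -nd -A pdf -P "$page" "$BASE_URL/$page/"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

r=1
while [ "$r" -le "$LAST_PAGE" ]; do
    fetch_page "$r"
    r=$((r + 1))
done
```

Run it as is to check the generated commands, then again with DRY_RUN=0 and LAST_PAGE set to the real n to download. Note that a PDF linked from several pages would be downloaded once into each of the corresponding folders.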
