I am trying to download some files from a service. The files are listed in an XML file, which can reference a single file or several files to download. However, I have a problem with my script: I do not know how to split the string returned by xmllint into an array so that I can download each file individually.
I need to split the string into several variables and then download each file in the URL string.
The 201701_1 file names do not repeat, so I can download them with curl without any problems. The coverage.zip files, however, share the same name and get overwritten by curl.
I do:
Then I use curl to download the individual files:
curl -O -b cookie $URL
At the moment, my script is as follows:
while read -r edition; do
    # XML document listing the files to download
    XML="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<download-area>
  <files>
    <file>
      <url>https://google.com/411/201701_01_01.zip</url>
    </file>
    <file>
      <url>https://google.com/411/201701_01_02.zip</url>
    </file>
  </files>
</download-area>"
    # Extract the text of every <url> element; the result is a single string
    URL=$(echo "$XML" | xmllint --xpath \
        "/*[name()='download-area']/*[name()='files']/*[name()='file']/*[name()='url']/text()" -)
    echo "URL:: " $URL
done < "$LATEST_EDITION"
LATEST_EDITION is simply a file with lines.
My questions are:
How can I split VAR_1 and VAR_2 into several URLs so that I can download them individually?
How can I prevent coverage.zip from being overwritten?
Best Answer
xmllint is pretty useless to extract information from XML documents. You may want to consider xmlstarlet or xml_grep (from perl's XML::Twig) or xml2.
With xmllint, you could still extract one string at a time with:
For values like here not containing newline characters, you can use bash's readarray as:
Or
Or:
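To make the readarray idea and the overwriting problem concrete, here is a minimal sketch rather than a definitive solution. It assumes the extraction step has already produced one URL per line; the URLs, the `outputs` array, the `seen` counter and the `download_all` function are all made up for illustration. Repeated base names such as coverage.zip get a numeric prefix so that curl writes each one to a different file.

```bash
#!/usr/bin/env bash
# Assumption: one URL per line, as an XML extractor might print them.
# These URLs are hypothetical. With xmlstarlet, something like this
# (untested here, file.xml is a placeholder) could feed readarray directly:
#   readarray -t urls < <(xmlstarlet sel -t -v '/download-area/files/file/url' -n file.xml)
urls_text='https://example.com/411/201701_01_01.zip
https://example.com/412/coverage.zip
https://example.com/413/coverage.zip'

# readarray -t stores each input line as one array element, newline stripped.
readarray -t urls <<< "$urls_text"

declare -A seen   # how many times each base name has been used so far
outputs=()        # local file name chosen for each URL, same index as urls

for url in "${urls[@]}"; do
    name=${url##*/}              # last path component, e.g. coverage.zip
    n=${seen[$name]:-0}
    seen[$name]=$((n + 1))
    if (( n > 0 )); then
        name="${n}_${name}"      # second coverage.zip becomes 1_coverage.zip
    fi
    outputs+=("$name")
done

# Show what would be downloaded and where it would be saved.
for i in "${!urls[@]}"; do
    printf '%s -> %s\n' "${urls[$i]}" "${outputs[$i]}"
done

# Hypothetical helper: performs the actual downloads when called.
# -b cookie reuses the cookie file from the question, -o sets the output name.
download_all() {
    local i
    for i in "${!urls[@]}"; do
        curl -b cookie -o "${outputs[$i]}" "${urls[$i]}"
    done
}
```

Calling `download_all` after checking the printed plan starts the downloads. Using `-o` with an explicit name instead of `-O` is what keeps the second coverage.zip from replacing the first.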