Sql-server – Is it possible to get SQL Server 2008 to download a file from a URL

sql-server-2008xml

I have a set of XML based services running on my network.

The data in these services needs to be mirrored into a Table on an SQL Server 2008 instance.

Getting the XML into a table is no problem I can already do that, but what I'm having to do at the moment is pass a huge long string of XML to a stored proc which then uses the OPENXML command to insert it into the table.

What I'm wanting to know, is instead of using a 3rd party program to get this data from the service then call the stored proc and insert it, is it in anyway possible for me to get SQL Server to just grab the XML directly from the service URL and process it like that.

All I can seem to find no matter how much I try is plenty of articles & posts on reading / writing files to / from the server file system and the several million or so on actually inserting the XML data, but I can't seem to find anything on getting the server to grab data directly from a URL.

Cheers

—–===== Update 25/8/2012 =====—–

After a little bit more research Iv'e found a third way to do this:

declare @xmlObject as int
declare @responseText as varchar(max)
declare @url as varchar(2048)

select @url = 'http://server/feed/data'

exec sp_OACreate 'MSXML2.XMLHTTP', @xmlObject OUT;
exec sp_OAMethod @xmlObject, 'open', NULL, 'get', @url, 'false'
exec sp_OAMethod @xmlObject, 'send'
exec sp_OAMethod @xmlObject, 'responsetext', @responseText OUTPUT
exec sp_OADestroy @xmlObject

select @responseText

exec sp_xml_preparedocument @idoc OUTPUT, @xmlData

Which seems to work, and work quite efficiently, however here's the weird thing.

If I use the full URL, which returns pure XML data with a mime type of 'text/xml' then response text contains nothing, it's null but if I strip it back to say 'http://server/feed/' or 'http://server/' so that the web server is just passing a 404 page or default html page, then I get the actual page content in.

At first I thought it might be the mime type, 'text/xml' vs 'text/html' but testing that made no difference, anything that returns valid XML seems to give null, but anything that returns broken XML seems to work!!

The code above however does work with correctly formatted XML from the internet, for example 'geonames' (which is what the original source of the above code was based on) works fine.

I suspect, that it's something to do with my feed server's config however, so need to do some work to resolve this, I thought I'd add the code here for others.

@Remus thank's for the suggestion, but that's how I currently do the task, I have a CLR binary that I wrote, that runs once a day to sync the feed to the DB, but it takes too long to sync the data. Using it to feed the XML is faster than doing a pure loop however, but the size of the XML can be quite variable (esp given that a binary would pass it to the stored proc using L2S) and Iv'e already overflowed the input to the SP a couple of times because there's been too much data, hence why I'm looking to get the SP to retrieve the data itself.

@Mr Browstone – SSIS seems to be the way to go, that's what other suggestions have been, I do however admit I don't know an awful lot about using SSIS other than to use the (Import / Export wizard – I'm more of a dev than a DBA 🙂 ) so any pointers on how I'd achieve this using SSIS would be helpful.

As for using CLR procs in the DB, I had thought of trying that, but Iv'e been bitten badly using these things before (And crashed a couple of servers too!!) so I'm a little wary of using them.

Best Answer

Use an external process that does the HTTP work and the inserts into the database. I explicitly advise against using SQLCLR for this. Hijacking precious SQL Server workers for the boring job of waiting for HTTP results will one day impact your server severely.

but the size of the XML can be quite variable (esp given that a binary would pass it to the stored proc using L2S) and Iv'e already overflowed the input to the SP a couple of times because there's been too much data

Use the techniques from DOWNLOAD AND UPLOAD IMAGES FROM SQL SERVER VIA ASP.NET MVC to stream the HTTP response into the database.