Sql-server – Bringing web service data into SQL server

importsql serverssisweb servicexml

I need to retrieve a data set daily from a hosted Web service. I can retrieve the XML "column names" in an SSIS package. But I need to retrieve all the data in the data set in csv format or bring directly into SQL Server. (Rows and columns, header data and dataset).

Is this feasible on SSIS? Or is there a better approach? Any input or advice is greatly appreciated.

Best Answer

Does this need to be done using SSIS? If so, you can do pretty much whatever you like in a Script Task (C# or VB.NET). That way you can bring the data directly into SSIS variables and use throughout the package.

Or, you can do this via SQLCLR, in which case you can call it directly from T-SQL. This lets you integrate it rather easily into Stored Procedures, SQL Agent Jobs, etc. You can even call it in a SQL Task in SSIS. BUT, you would need to be a bit more careful using SQLCLR as opposed to using a Script Task in SSIS. I have documented several things to consider across a few answers on Stack Overflow:

The most important points from those answers are:

  • Look at the following MSDN pages:

  • To escape URLs / Query Strings:
    • If you are using SQL Server 2005, 2008, or 2008 R2, then use Uri.EscapeDataString as it was available prior to .NET Framework v4.5
    • If you are using SQL Server 2012, 2014, or newer, then you can use either Uri.EscapeDataString or, if the server has been updated to at least .NET Framework v4.5, then you can alternatively use WebUtility.UrlEncode
  • If not already done, you will have to enable (one time) "CLR Integration" at the server level: Enabling CLR Integration
  • SQLCLR is "shared" code. There is a single App Domain for each DB + Owner combination. So all Sessions calling this code will be operating in the same App Domain. If multiple simultaneous requests are made to the same URI, all but the first 2 might be "on hold" since you are bound by the concurrent connection limit imposed by ServicePointManager ( ServicePointManager.DefaultConnectionLimit ). And the default limit is a whopping 2! Meaning, all additional requests to that URI, while there are already 2 active/open HttpWebRequests, will wait inline, patiently. You can increase this by setting the .ServicePoint.ConnectionLimit property of the HttpWebRequest object.

  • You will need to mark your Assembly as WITH PERMISSION_SET = EXTERNAL_ACCESS. Please don't be lazy and turn TRUSTWORTHY ON. That is an unnecessary security risk. The proper way to make this secure is to do the following:

    • Sign your Assembly
    • In the [master] database, create an Asymmetric Key from the DLL of your Assembly.
    • Also, in [master], create a Login from that Asymmetric Key
    • Grant your new Login the EXTERNAL ACCESS ASSEMBLY permission
    • Create your Assembly with a PERMISSION_SET of EXTERNAL_ACCESS, not UNSAFE

If you want to go down the SQLCLR path but don't want to mess with the coding, etc, then one option is to get a pre-built function. The SQL# SQLCLR library (that I wrote) has a function called INET_GetWebPages (not available in the Free version) that calls a URL (can pass in either GET or POST data) and passes back the response (and the response headers as a separate XML field).