Connecting Microsoft R Server to SQL Server with and without R Services

rsql serversql-server-2016

Let's assume I have 3 machines with me.
My first machine has SQL Server 2016 with R Services installed on it.
My second machine has SQL Server 2016 without R services enabled on it.
My third machine has a standalone Microsoft R Server installed on it.

I understand that since my first machine has SQL Server with R Services, I can store R script as stored procedures on SQL Server and call it later for analysis.

My question lies with how my standalone Microsoft R Server connects to either SQL Servers (with and without R Services enabled).
From what I understand, it uses ODBC to connect to SQL Server and uses the RevoscaleR functions to query or analyze the SQL data – in both cases.

If that's the case,

  • What benefit or limitation does my machine 3 (standalone MRS) have while connecting to SQL Server with R Services vs. SQL Server without R Services?
  • In either cases, can I invoke any stored procedures from my R IDE remotely?
  • Can I push the compute context to SQL server in both cases?
  • Is there anything special that SQL Server with R Services offers for a standalone Microsoft R Server when compared to a SQL Server without R Services.
  • Is there any difference in the database engine's ability to perform other tasks either way?

Imagine I'm reluctant to enable R Services on my SQL Server because it may take up the memory and slow down the performance of the database engine itself. All I want to do is get a standalone Microsoft R Server and connect it to SQL Server without R Services and get all the scalability and performance anyway.
What is it that I'm missing by doing this?

Best Answer

I can answer some of these questions hopefully:

  • The standalone machine has the benefit of most of the machine resources at its disposal. However as far as benefits "while connecting" go, there are none. The point of SQL Server with R services is to keep the analytics close to the data. This has little or no consequence for external clients connecting to the data, as I see it at least
  • you can invoke SQL Server stored procedures from R using the RODBC library, for example this worked for me:

    install.packages("RODBC")
    library(RODBC)
    dbhandle <- odbcDriverConnect('driver={SQL Server};server=.\\sql2014;database=tempdb;trusted_connection=true')
    data <-sqlQuery(dbhandle, "EXEC sp_who2")
    data
    odbcCloseAll()
    

Results: using RODBC library

  • push compute is a feature of the ScaleR library, eg function RxInSqlServer. See here for more details.
  • regarding "anything special that SQL Server with R Services offers for a standalone Microsoft R Server...", you could conceivably call the R stored procedure sp_execute_external_script from R using RODBC, but this seems a bit pointless. A dedicated SQL Server box (ie without R Services enabled) does not have to share its resources, and this kind of answers your last question I think; the point of having a dedicated box of either kind (MRS only or SQL Server only) would be so that neither has to share its resources. The point of having a SQL Server with R Services enabled would be so the box could perform dual roles, resources permitting, plus keeping the analytics close to the data. I hope that makes sense.

Here's a nice article on persisting machine learning models to a table using SQL Server with R Services. How would you do that in standalone R?

Maybe someone else can chime in because honestly this seems very similar to your other questions. You might guess not that many people are doing this so one of the best people to do this is you; try these things out, report back : )