Let's assume I have 3 machines with me.
My first machine has SQL Server 2016 with R Services installed on it.
My second machine has SQL Server 2016 without R services enabled on it.
My third machine has a standalone Microsoft R Server installed on it.
I understand that since my first machine has SQL Server with R Services, I can store R script as stored procedures on SQL Server and call it later for analysis.
My question lies with how my standalone Microsoft R Server connects to either SQL Servers (with and without R Services enabled).
From what I understand, it uses ODBC to connect to SQL Server and uses the RevoscaleR functions to query or analyze the SQL data – in both cases.
If that's the case,
- What benefit or limitation does my machine 3 (standalone MRS) have while connecting to SQL Server with R Services vs. SQL Server without R Services?
- In either cases, can I invoke any stored procedures from my R IDE remotely?
- Can I push the compute context to SQL server in both cases?
- Is there anything special that SQL Server with R Services offers for a standalone Microsoft R Server when compared to a SQL Server without R Services.
- Is there any difference in the database engine's ability to perform other tasks either way?
Imagine I'm reluctant to enable R Services on my SQL Server because it may take up the memory and slow down the performance of the database engine itself. All I want to do is get a standalone Microsoft R Server and connect it to SQL Server without R Services and get all the scalability and performance anyway.
What is it that I'm missing by doing this?
Best Answer
I can answer some of these questions hopefully:
you can invoke SQL Server stored procedures from R using the RODBC library, for example this worked for me:
Results:![using RODBC library](https://i.stack.imgur.com/zs42W.png)
ScaleR
library, eg functionRxInSqlServer
. See here for more details.sp_execute_external_script
from R using RODBC, but this seems a bit pointless. A dedicated SQL Server box (ie without R Services enabled) does not have to share its resources, and this kind of answers your last question I think; the point of having a dedicated box of either kind (MRS only or SQL Server only) would be so that neither has to share its resources. The point of having a SQL Server with R Services enabled would be so the box could perform dual roles, resources permitting, plus keeping the analytics close to the data. I hope that makes sense.Here's a nice article on persisting machine learning models to a table using SQL Server with R Services. How would you do that in standalone R?
Maybe someone else can chime in because honestly this seems very similar to your other questions. You might guess not that many people are doing this so one of the best people to do this is you; try these things out, report back : )