R Script — to — MS SQL Server 2016 R Services Migration Guide

George Vyshnya
SBC Group Blog
Published in
5 min readJul 24, 2017

Overview

This post provides the step-by-step guide on how to transform an existing standalone R script into T-SQL stored procedures and functions running inside MS SQL Server 2016, utilizing its promising R Services in-database. Although there is a plenty of resources explaining certain aspects of using R code from inside T-SQL and Microsoft SQL Server 2016 R Services (see some of them in the last section of the post below), there is a lack of an integral end-to-end guideline on how to port your legacy R code there.

Why Considering Migration to Microsoft SQL Server 2016 R Services?

The convergence of the open-source ecosystem of R with the proprietary beast like Microsoft SQL Server has certain rationale, whatever strange it could look like at a first glance. There are at least several reasons that may be decisive factors to you.

Sticking to MS SQL Server Platform

If your organization/team has already used MS SQL Server as RDBMS source of data for both the transactional systems and machine learning applications, this is the natural next step. You can simply boost your efficiency by upgrading to MS SQL Server 2016 as well as benefit from in-database R capabilities and gaining real-time insights from your data.

Struggling With R Performance Bottlenecks

If you try to get the best of R beyond the simple rapid prototyping and exploratory data analysis, you are almost sure to be sick of performance bottlenecks in R in real-world business applications. A number of researches show R solutions to be 10–17 times slower than the comparable implementations in Python or Java (using the same hardware and OS configuration). Using MS SQL Server 2016 and revolutionary parallelization inside its R Services gives you a chance to leverage the cutting edge performance gain on your R applications.

DevOps Overheads

If you run your machine learning and predictive analytics applications in a real-world business environment, you know about the overwhelmingly high DevOps overheads to it. Industry experts started talking about Analytical Ops as a bridge between the classical data science and business operational framework/infrastructure. Even more, Data Scientists are encouraged to obtain Dev Ops qualifications in order to sustain their jobs in business in the long run.

The bottom line is all those data downloads/uploads, ETL, pre-processing routines, along with the need to constantly re-deploy your analytical applications in R drain the time from the dream work of any Data Scientist (that is, exploring data, looking for dependencies, engineering new features, building and tuning new cool machine learning algorithms etc.). With MS SQL Server 2016 and its R Services in-database, a lot of those overheads will be reduced.

Moving Out of Lab

If you look for a solid way to move your machine learning innovations from labs into real business analytics world, Microsoft SQL Server 2016 is going to be one the possible good solutions. With everything mentioned above, it can help you to embed your innovative technology as the integral part of the framework of the continuous business operations.

Note: With this all said, Microsoft SQL Server 2016 R services migration has its Cons, too. If you are strong believer in open-source predictive analytics, then porting your R applications to Python stack could be a sound alternative to bring your machine learning innovations to the business-ready scale.

Important Pre-requisites

If your decision to migrate to MS SQL Server 2016 R Services is solidified, there is a right time to plan for the implementation. Prior to migrating an existing standalone R script to T-SQL and MS SQL Server 2016 R Services, several configuration changes on your MS SQL Server 2016 instance should be taken care of. You act as follows

1. Check if your instance of MS SQL Server 2016 has R Services enabled (see the last section of http://www.desertislesql.com/wordpress1/?p=1282 to review one of the possible methods to do such a check)

2. In case R Services are disabled, go through the instructions in http://www.desertislesql.com/wordpress1/?p=1282 to enable it

3. Install all additional R packages needed by your script into R run-time environment of MS SQL Server 2016 R Services

a. You follow instructions in https://docs.microsoft.com/en-us/sql/advanced-analytics/r/install-additional-r-packages-on-sql-server to accomplish it

Note: it has been noted that MS SQL Server 2016 R Services work in a less then stable manner if your physical (or equally virtual) server runs both MS SQL Server 2016 Database Services (the basic database engine) and MS SQL Server 2016 Analysis Services. There are certain instable bugs in running external R scripts in such an environment. Therefore it is strongly recommended to only use MS SQL Server 2016 Database Services on a server where you are supposed to run T-SQL code with external R scripts embedded.

R Script Transformation

The general strategy of R script code migration to run from inside MS SQL Server 2016 R Services is as follows

1. Replace all of auxiliary R code that manages you data (data retrieval, data saving into the database, ETL, pre-processing etc.) with pure T-SQL store procedures and functions (for performance and efficiency reasons)

2. Wrap R code that performs mission-critical data science/machine learning functions (statistical analysis, model training and forecasting, clustering etc.) into a T-SQL stored procedure (or script) that uses sp_execute_external_script system stored procedure (https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql) to invoke R code

Would you like to know more?

There is a bunch of good resources available for you to quickly jump start on MS SQL Server 2016 R Services. There are a few of them below, to begin your journey with

1. Overview

a. Microsoft Machine Learning Services Overview: https://docs.microsoft.com/ru-ru/sql/advanced-analytics/r/r-services

b. Using R in MS SQL Server 2016 (Mar 24, 2016): http://www.desertislesql.com/wordpress1/?p=1282

2. DevOps aspects (Installation and Upgrade, R Package Management)

a. Overview on R Package Management for MS SQL Server 2016, R Services: https://docs.microsoft.com/en-us/sql/advanced-analytics/r/r-package-management-for-sql-server-r-services

b. Guide on installing new R packages to MS SQL Server 2016 R Services: https://docs.microsoft.com/en-us/sql/advanced-analytics/r/install-additional-r-packages-on-sql-server

c. Upgrade and Installation FAQ (SQL Server R Services): https://docs.microsoft.com/en-us/sql/advanced-analytics/r/upgrade-and-installation-faq-sql-server-r-services

3. Developer tips and guidelines

a. https://tomaztsql.wordpress.com/2016/06/20/passing-t-sql-select-statement-to-sp_execute_external_script/

b. https://tomaztsql.wordpress.com/2017/01/09/using-r-sp_execute_external_script-with-json/

c. https://www.mssqltips.com/sqlservertip/4747/sql-server-spexecuteexternalscript-stored-procedure-examples/

d. https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-execute-external-script-transact-sql

e. https://www.blue-granite.com/tutorials/sql-server-r-services

4. Essays of practioners

a. “Dirty data, flogged cores: Yes, Microsoft SQL Server R Services has its positives” — https://www.theregister.co.uk/2017/02/16/r_sql_server_great_but_beware/

b. Collection of lessons learned the hard way: R-script Migration to SQL Server 2016 — https://www.codeproject.com/Tips/1184282/R-script-Migration-to-SQL-Server

About the Author

George is a seasoned Data Scientist / Software Developer with young spirit and 20+ years of blended industrial experience in hands-on software development, IT, DevOps, business processes, operational / project management, and C-level role playing. He currently enjoys the mission of CTO to SBC Sp. z o o (www.sbc-group.pl). You can connect to him on

· LinkedIn: https://www.linkedin.com/in/gvyshnya/

· Twitter:

o https://twitter.com/SystemBC

o https://twitter.com/gvyshnya

· GitHub: https://github.com/gvyshnya

--

--

George Vyshnya
SBC Group Blog

Seasoned Data Scientist / Software Developer with blended experience in software development, IT, DevOps, PM and C-level roles. CTO at http://sbc-group.pl