Using R in SQL Server 2016

One of the newest features available in SQL Server 2016 is the ability to use the R language. In this recent article b

One common scenario we saw with early adopter customers was as follows:

  1. The DBA team was in charge of installing SQL Server R Services. They went through the instructions from Books Online (sometimes skipping a step, as people do) and then handed the server off to the data scientist and/or statistician.
  2. The data scientist tried using sp_execute_external_script, or pushing the compute context from their IDE of choice to the server, and immediately encountered an error message.

The recommendation in this case was to run a “reality check” test against the SQL Server instance to confirm that SQL Server R Services is working properly. For example:

EXECUTE   sp_execute_external_script
@language = N'R'
, @script = N'OutputDataSet <- InputDataSet'
, @input_data_1 = N'SELECT 1 as Col'
WITH RESULT SETS ((col int not null));
GO

This should return a result set with the value “1”.  If instead an error is raised, one valuable troubleshooting resource is as follows:

Known Issues for SQL Server R Services (Books Online)

For more difficult issues, we also asked customers to provide the files from the Log\ExtensibilityLog directory for SQL Server R Services, which sits under the SQL Server instance error log directory (you can identify this path by executing SELECT SERVERPROPERTY('ErrorLogFileName')).
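Before collecting logs, it can be worth ruling out the most commonly skipped setup step. The following is a minimal sketch, assuming R Services (In-Database) was installed on the instance but the 'external scripts enabled' option may never have been turned on. It is a starting point rather than an exhaustive checklist, and the commented-out statements require sysadmin-level permissions.

```sql
-- Check whether external script execution is enabled on this instance.
-- value        = the configured setting
-- value_in_use = the setting currently in effect
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'external scripts enabled';

-- If value is 0, the option can be enabled (uncomment to run):
-- EXECUTE sp_configure 'external scripts enabled', 1;
-- RECONFIGURE WITH OVERRIDE;

-- Show the error log file path; the ExtensibilityLog folder
-- sits under the same Log directory.
SELECT SERVERPROPERTY('ErrorLogFileName') AS ErrorLogFileName;
```

If value is 1 but value_in_use is still 0, the change has likely not taken effect yet. On SQL Server 2016 this typically means the SQL Server service needs a restart, and the SQL Server Launchpad service also has to be running for external scripts to execute.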

What packages are installed?

While SQL Server R Services comes out-of-the-box with R base packages and the RevoScaleR package, we found that most customers needed other packages installed in order to fully port their legacy code.  So another common lesson learned was to determine what packages were installed for SQL Server R Services.

The following query shows a list of installed packages on the SQL Server instance:

EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'
packagematrix <- installed.packages();
NameOnly <- packagematrix[,1];
OutputDataSet <- as.data.frame(NameOnly);'
WITH RESULT SETS((PackageName NVARCHAR (250)));
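When porting legacy code, the package version and the library it was loaded from often matter as much as the name. As a hedged variation on the query above, the sketch below returns three of the columns that R's installed.packages() matrix exposes (Package, Version, LibPath). The result set column names and lengths are illustrative choices, not requirements.

```sql
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
# installed.packages() returns a character matrix, one row per package
pkgs <- installed.packages()
# Drop row names so the same package in two libraries cannot cause a clash
rownames(pkgs) <- NULL
OutputDataSet <- as.data.frame(
    pkgs[, c("Package", "Version", "LibPath"), drop = FALSE],
    stringsAsFactors = FALSE)'
WITH RESULT SETS ((
    PackageName    NVARCHAR(250),
    PackageVersion NVARCHAR(50),
    LibraryPath    NVARCHAR(4000)
));
```

Comparing this output with the packages the legacy scripts load is one way to build the list of what still needs to be installed.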

Alternatively, a data scientist can use an R IDE to check which packages are installed on the SQL Server instance (using that instance for the compute context):

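# Connection string for the SQL Server instance used as the compute context
con <- "Driver={SQL Server};Server=SQL1;Database=ContosoRetailDW;Trusted_Connection=true;"

# Define a compute context that runs R code on the SQL Server instance
sqlCompute <-
  RxInSqlServer(
    connectionString = con,
    autoCleanup = TRUE,
    numTasks = 8,
    consoleOutput = TRUE
  )

# List the packages installed on that instance
rxInstalledPackages(computeContext = sqlCompute)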

Recommended reading on this topic:

Installing and Managing R Packages

Note: We found that some packages could be removed and de-referenced later in the overall testing process by replacing the originally-referenced functions with T-SQL or RevoScaleR equivalents. 
