Data Science for developers: You have a Predictive Model… now what? Expose and Consume an R Model

David Salgado
4 min readSep 4, 2016

--

Now is when the fun begins… on the previous post we created a basic predictive model, at this stage we have a functional model that would help us predict an estimated number of rentals for a given date. Now… how do we use it from our application? How can we call the ‘predict’ method from our NodeJS / C# / Python / Java / <your language here> application?

Let’s see few options

Option 1 — Re-code the R algorithm into your preferred language

You can implement the math behind the R model on your preferred language — hehe… you’re fun… no way

Option 2 — Expose the algorithm as a service

ok… doable, we can easily do it on-premises (with R Server) or in Azure, we even have a browser friendly ‘Machine Learning Studio’ that we can use to build experiments and expose them as service endpoints. This is not the case for this demo though :)

Let’s imagine that either: A — we can’t upload this data to the cloud (for any reason) or B — It would be a PIA (pain in the…) to keep that data up to date once it’s in the cloud (ETL Hell). So… let’s jump to the option 3.

Option 3 — Use the SQL Server R services

et voilá! one of the new SQL Server 2016 features is ‘R Services’, we can have the R script in the database itself, so no need to move data around, no problems with managing data bigger than your memory, and no problems with the traditional limitations of a (non-enterprise) R server. We’re going to bring the algorithm to the data :)

So let’s do it!

btw… the scripts that I reference below are part of the demo application used at this Build 2016 session, you can find the full app on GitHub, thanks to the terrific PlainConcepts team (ibonilm, pablodoval, and more! ) in collaboration with Tara Shankar and Pablo Castro.

Get the tools

you’ll need to download SQL Server 2016 (Developer edition is ok, now it’s free but you’d need to create a free DevEssentials account) and also enable R Services (just check steps 1 and 2).

Once you have R services working you will need to create a database and a table with the sample data we used on the previous post.

Creating and hosting the model

Well… it’s as easy as having the R script we created inside an stored procedure :) That stored procedure will execute the script that generates the predictive model and serialize that binary model into a table. You just need to run that stored procedure weekly or monthly to make sure that you keep your model trained with the latest data.

Apart from creating the stored procedure, you will need to create an additional table ‘Models’ with one column ‘Serialized’ : VARBINARY(MAX)

CREATE PROCEDURE TrainRentalModel AS
BEGIN
DECLARE @q NVARCHAR(MAX) = N'SELECT * FROM RentalFeatures ORDER BY Year, Month, Day'
DECLARE @s NVARCHAR(MAX) = N'
rentals = InputDataSet
rentals$FHoliday = factor(rentals$Holiday)
rentals$FSnow = factor(rentals$Snow)
rentals$FWeekDay = factor(rentals$WeekDay)
model = rpart(RentalCount ~ Month + Day + FWeekDay + FHoliday + FSnow, rentals)
serialized = data.frame(model = as.raw(serialize(model, NULL)))
' BEGIN TRY
BEGIN TRANSACTION
DELETE FROM Models
INSERT INTO Models
EXEC sp_execute_external_script @language = N'R', @script = @s, @input_data_1 = @q, @output_data_1_name = N'serialized'
COMMIT
END TRY
BEGIN CATCH
PRINT 'Failed to recompute model, rolling back to previous trained model'
ROLLBACK
END CATCH
END

see?

Exposing the ‘prediction’ endpoint

…and yes! this is just another stored procedure! It will de-serialize and execute the model with the parameters fed from the application layer.

CREATE PROCEDURE PredictRentals @q NVARCHAR(MAX) AS
BEGIN
DECLARE @serialized VARBINARY(MAX) = (SELECT TOP 1 Serialized FROM Models)
DECLARE @s NVARCHAR(MAX) = N'
rentals = InputDataSet
rentals$FHoliday = factor(rentals$Holiday)
rentals$FSnow = factor(rentals$Snow)
rentals$FWeekDay = factor(rentals$WeekDay)
model =
unserialize(as.raw(m))
p = predict(model, rentals)
OutputDataSet = data.frame(p)
'

EXEC sp_execute_external_script @language = N'R', @script = @s, @input_data_1 = @q, @params = N'@m VARBINARY(MAX)', @m = @serialized
WITH RESULT SETS ((Predicted FLOAT))
END

This is what we see if we execute the stored procedure directly from the management studio or Visual Studio. It predicts 638.

Connecting the app to the database

Easy! now you just need to connect your app to the database and call a stored procedure, that should be straightforward. Here you can see how it’s done from the .NET application used for the session. It creates a SqlCommand to invoke the stored procedure and sends a string with a query as the parameter.

string sqlInput = 
string.Format("SELECT CONVERT(INT, {0}) AS Month, CONVERT(INT, {1}) AS Day, CONVERT(INT, {2}) AS WeekDay, CONVERT(BIT, {3}) AS Snow, CONVERT(BIT, {4}) AS Holiday",
date.Month,
date.Day,
1 + (int)date.DayOfWeek,
snowedDayBefore ? 1 : 0,
holiday ? 1 : 0);
SqlCommand cmd = new SqlCommand();
cmd.CommandText = "PredictRentals";
cmd.CommandType = System.Data.CommandType.StoredProcedure; cmd.Parameters.Add(new SqlParameter("@q", sqlInput)); object r = await ExecuteSqlServerCommandAsync(cmd);

done!

Now every time that the function is called you will be able to show an estimated prediction for your users, you would just need to invest some time on keeping the model up to date.

Happy Hacking!

.ds

--

--

David Salgado

Product guy in tech. DevPlatforms, GameTech, UX & DS. Made in Spain, with Eusko Label. I miss the sun & los pintxos.