The new Azure Machine Learning Services are Python-based. If you want to operationalize predictive models developed in R on Azure, there isn’t a straightforward way to do that.
Recently I read the following two articles:
- How to deploy a predictive service to Kubernetes with R and the AzureContainers package (R Bloggers)
- Real-time scoring of R machine learning models (Microsoft MSDN Blogs)
Both of the above articles describe the same architecture, but the solution is quite complex. It requires the following technologies:
- Plumber or Microsoft Machine Learning Server
- AzureRMR package
- Azure Container Registry
- Azure Kubernetes Service (AKS)
Even if this architecture guarantees real-time scoring, customers who know Azure Machine Learning Studio and don’t need real-time scoring often look for a simple way to “directly” operationalize the custom R models they previously developed in their preferred IDE, even if the resulting service is a little slower. So I was wondering whether there was a simple way to deploy a predictive function as a web service on Azure.
Since the deployment process to a web service in Azure Machine Learning Studio is really simple and is one of the platform’s strong points, I started to investigate whether I could use that feature.
As you probably already know, the Execute R Script module in Azure Machine Learning Studio allows you to import the contents of a zip file into its execution context through the Script Bundle input:
Azure Machine Learning will unzip the archive passed to the Script Bundle input, and the unzipped files will be available in the src folder of the module’s working directory.
That said, I found two ways (based on the same principle) to deploy a predictive function to an Azure Machine Learning Studio Web Service:
- The simplest way
- The structured way
The Simplest Way
First of all, let’s create a custom model in RStudio. I’ll use the same dataset and machine learning algorithm as the R Bloggers post mentioned above.
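To make that step concrete, here is a sketch of the model from the referenced post: a random forest on the Boston housing data, saved as an .rda file in the temp folder. The object and file names (bos_rf, bos_rf.rda) are the ones used later in this post; the seed and ntree values are my own illustrative choices.

```r
library(MASS)          # for the Boston dataset
library(randomForest)

# Train a random forest predicting the median home value (medv)
set.seed(123)          # illustrative seed
bos_rf <- randomForest(medv ~ ., data = Boston, ntree = 100)

# Save the trained model to the temp folder as bos_rf.rda
save(bos_rf, file = file.path(tempdir(), "bos_rf.rda"))
```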
Now you can find the Boston Housing random forest model saved as an rda file in your temp folder. Just zip it and upload it to Azure Machine Learning Studio using the “Upload a new dataset from local file” feature:
Then create the following experiment:
Here are the scripts I used for the two Execute R Script modules, the “Get the test dataframe” one and the “Predict from custom model” one respectively:
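The original code chunks aren’t reproduced here, so the following is a reconstruction of the two module scripts. maml.mapInputPort and maml.mapOutputPort are the standard Execute R Script port functions; the choice of the first 10 rows of Boston as test data is an assumption.

```r
## --- "Get the test dataframe" module ---
library(MASS)

# Assumed test data: the first 10 rows of Boston, without the target column
test_df <- Boston[1:10, names(Boston) != "medv"]
maml.mapOutputPort("test_df")

## --- "Predict from custom model" module ---
library(randomForest)
test_df <- maml.mapInputPort(1)

# The Script Bundle zip is extracted under the src folder
load("src/bos_rf.rda")       # restores the bos_rf object

scores <- data.frame(predicted_medv = predict(bos_rf, newdata = test_df))
maml.mapOutputPort("scores")
```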
The leftmost output of the “Predict from custom model” Execute R Script module gives us the 10 predictions derived from the custom R model:
The Structured Way
If you didn’t know, RStudio allows you to create R packages. They are an ideal way to package and distribute R code and data for re-use by others. The idea is to build an R package to use as a template for predictions from custom R models in Azure Machine Learning Studio Web Services.
Prepare A New R Package
First of all, make sure you have a LaTeX distribution installed on your machine. I’m using a Windows VM, so I installed MiKTeX from this link (if you’re using a Mac, you can install MacTeX). After installing and updating the LaTeX packages, make sure that pdflatex.exe can be found by RStudio by adding the MiKTeX bin folder to the PATH environment variable. You can do that directly in R (just check that the following path matches your environment; if you copy the code from here, also check the double-quote characters you paste into your R IDE):
Sys.setenv(PATH=paste(Sys.getenv("PATH"), "C:/Program Files/MiKTeX 2.9/miktex/bin/x64/", sep=";"))
Now let’s create a new “R package” project (File → New Project…):
Then I selected the folder in which to create the new package project and the name of the package (I chose “AzureMLBostonRF”):
Be aware that the folder in which the project directory will be created must not be in a VM shared folder; it must be a local one. During the build phase (we’ll see it in a while) I got an error like “Error 1 (Incorrect function) Features disabled: R source file indexing, Diagnostics” because I tried to store the project in a virtual machine shared folder.
After clicking the “Create Project” option, a new hello.R file is shown in RStudio. If you read the included comments, you’ll see that each new R script in this project is associated with a function in the package. You can close the hello.R file, since we won’t use it. Under the hood, a new “AzureMLBostonRF” folder is created in the chosen directory:
This folder is the working directory for the new project, and it represents the default working folder for the scripts we’ll include in the package. So it’s important to copy the bos_rf.rda file created in the previous section into the AzureMLBostonRF folder. At this point let’s create a new R script (the shortcut is CTRL + SHIFT + N) and just load the previously saved model into a variable:
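A minimal sketch of what that script can look like, assuming bos_rf.rda sits in the package root as copied above. Top-level code in a package’s R scripts runs at build time, so the restored object becomes part of the package:

```r
# get_model_rda.R -- load the previously saved model into a variable
load("bos_rf.rda")   # restores the bos_rf object saved earlier
model <- bos_rf      # variable the predictive function will use
```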
Just save this file (CTRL + S) with the name “get_model_rda” (the IDE will automatically append the R extension to the file name).
Now let’s create another R script that will contain the predictive function:
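A sketch of the predictive function; the exact signature and the data-frame return shape are assumptions on my part:

```r
# predict_from_model.R -- the predictive function exported by the package
predict_from_model <- function(test_df) {
  # `model` was loaded at build time by get_model_rda.R
  data.frame(predicted_medv = predict(model, newdata = test_df))
}
```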
Save this file with the name “predict_from_model”.
The above mentioned “AzureMLBostonRF” folder contains the “R” subfolder, which now contains three R script files:
The hello.R file has to be deleted; it is just a sample function file.
Now open the DESCRIPTION file from the Files tab in RStudio and modify it in this way:
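An illustrative DESCRIPTION along those lines; every field except Imports is a placeholder you can adapt, while the Imports line must stay as shown:

```
Package: AzureMLBostonRF
Type: Package
Title: Boston Housing Random Forest Predictions
Version: 0.1.0
Author: Your Name
Maintainer: Your Name <you@example.com>
Description: Wraps a pre-trained random forest model for the Boston
    housing dataset, for use in Azure Machine Learning Studio.
Imports: randomForest
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
```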
Feel free to change whatever you want according to this guide, but keep the “Imports” section as is, since the randomForest package is mandatory for the predictive function we want to deploy.
Now let’s go to the “man” folder and rename the “hello.Rd” file you’ll find there (related to the auto-generated file we have already deleted) to “predict_from_model.Rd”:
It is the documentation file for the function we want to deploy. You can get more details about Rd files here. Open it and modify it as follows:
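A minimal Rd skeleton for the function, with illustrative wording:

```
\name{predict_from_model}
\alias{predict_from_model}
\title{Predict from the Boston housing random forest model}
\usage{
predict_from_model(test_df)
}
\arguments{
  \item{test_df}{a data frame with the Boston housing predictor columns}
}
\description{
Returns the predictions of the embedded random forest model for the
rows of \code{test_df}.
}
```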
Now everything is ready to build the package with success. Click on the “Install and Restart” button in the “Build” tab on the top-right of RStudio.
If everything goes right, a “DONE” message will be found in the build log:
The R session will be automatically restarted and the new built package will be loaded:
Now open a new R Script and test the predictive function using this chunk of code:
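A sketch of such a test, assuming (as before) the first 10 rows of Boston as test data:

```r
library(MASS)
library(AzureMLBostonRF)   # the package we just built

# Assumed test data: the first 10 rows of Boston, without the target
test_df <- Boston[1:10, names(Boston) != "medv"]
predict_from_model(test_df)
```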
The result will be the following:
The package is ready to be used later on Azure:
The above-mentioned package folder has to be zipped as-is, without renaming the archive. Once the AzureMLBostonRF.zip file is ready, it has to be zipped again, producing a wrapper file to be uploaded to Azure Machine Learning Studio:
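You can do the double-zipping in the file explorer, or directly from R with utils::zip, assuming the current working directory is the parent of the package folder:

```r
# First zip: the package folder as-is
zip("AzureMLBostonRF.zip", files = "AzureMLBostonRF")

# Second zip: the wrapper file that will be uploaded to Azure ML Studio
zip("AzureMLBostonRF-AzureMLStudioWrapper.zip", files = "AzureMLBostonRF.zip")
```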
Use The R Package In Azure Machine Learning Studio
After being uploaded to Azure Machine Learning Studio, the AzureMLBostonRF-AzureMLStudioWrapper.zip file can be used as input to an Execute R Script module. Let’s create the following new experiment:
The code used in the “Get the test dataframe” module is the same as in the previous experiment at the beginning of this post. The code of the “Predict from custom model in a package” module is the following:
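A reconstruction of that module: since Azure ML Studio extracts the wrapper zip under the src folder, the inner AzureMLBostonRF.zip can be installed as a local package from there.

```r
# Install the package from the unzipped Script Bundle contents
install.packages("src/AzureMLBostonRF.zip",
                 lib = ".", repos = NULL, verbose = TRUE)
library(AzureMLBostonRF, lib.loc = ".")

test_df <- maml.mapInputPort(1)
scores <- predict_from_model(test_df)
maml.mapOutputPort("scores")
```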
Note that in this case we don’t need to load the randomForest library, since it is “embedded” into our custom AzureMLBostonRF library.
As you might expect, the output data frame of the “Predict from custom model in a package” module is the same as in fig. 15:
Publishing The Web Service
Taking the previous training experiment as the starting point (using the first one would give the same result), after clicking “Set up web service” and changing the source of the “web service output”, the predictive experiment looks like this:
Now run the experiment to validate the workflow and then click on “Deploy web service”. After that, just click on the “New Web Services Experience” link:
You’ll be redirected to a new web service home page, where you can test the endpoint you just configured:
At this point, after selecting “Enable test data”, you can test the newly published predictive web service using the default values in the text boxes and pressing the “Test Request-Response” button:
Measuring The Web Service Response Time
Azure Machine Learning Web Service performance may be adequate for small models and limited throughput, but this may not be true for more complex models. The Azure Machine Learning Studio environment is a managed service, so you are not in control of the physical resources behind it. That’s why you may need to check whether any performance issues occur.
First of all, the R code needed to consume the just deployed web service is already available in the Microsoft Azure Machine Learning Web Services home:
Using this chunk of code (slightly modified) in RStudio, let’s evaluate the average web service response time over 50 sequential requests:
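A sketch of that measurement, adapted from the structure of the auto-generated consumption code (which uses RCurl and rjson); api_key, url, and the request columns/values are placeholders you must replace with your own:

```r
library(RCurl)
library(rjson)

api_key <- "YOUR_API_KEY"      # placeholder: from the web service page
url     <- "YOUR_REQUEST_URL"  # placeholder: the Request-Response URL

# Hypothetical request body following the Request-Response JSON schema
req <- list(
  Inputs = list(
    input1 = list(
      ColumnNames = list("crim", "rm"),        # illustrative columns
      Values      = list(list("0.006", "6.5")) # illustrative values
    )
  ),
  GlobalParameters = setNames(list(), character(0))
)
body <- enc2utf8(toJSON(req))

headers <- c("Content-Type"  = "application/json",
             "Authorization" = paste("Bearer", api_key))

# Time 50 sequential calls and report the average in milliseconds
n <- 50
elapsed <- numeric(n)
for (i in seq_len(n)) {
  t0 <- Sys.time()
  invisible(getURL(url, httpheader = headers, postfields = body))
  elapsed[i] <- as.numeric(difftime(Sys.time(), t0, units = "secs"))
}
cat(sprintf("Average response time: %.0f ms\n", mean(elapsed) * 1000))
```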
Just keep in mind that you have to change a couple of strings in the code (api_key and url) based on your workspace and on the auto-generated code of your deployed web service. After that, here are the results you’ll get:
Our web service response time averages 678 ms. Not bad for this simple predictive model!
Keep in mind that performance issues may occur due to R model (and variable) initialization. If this is your case, you can initialize everything once (at the first web service call) instead of on every call. For more details about this implementation, please check this link.
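One common pattern (a sketch, not necessarily the exact implementation from the linked article) is to guard the expensive load with an exists() check, so the model is deserialized only on the first call of the R session:

```r
score <- function(test_df) {
  # Deserialize the model only on the first call of the session;
  # later calls reuse the bos_rf object cached in the global environment
  if (!exists("bos_rf", envir = globalenv())) {
    load("src/bos_rf.rda", envir = globalenv())
  }
  predict(bos_rf, newdata = test_df)
}
```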
If you need to manage a lot of transactions per month, different plans are available based on the number of transactions and the API compute hours. You can check all the details on this page.
A lot of customers adopting R as scripting language for Advanced Analytics don’t have a straightforward solution to easily deploy their custom predictive functions on the Microsoft cloud.
In order to avoid really complex (and expensive) architectures on Azure, I have shown two different ways to deploy custom R predictive models and functions using Azure Machine Learning Studio Web Services.
Don’t forget to measure the web service performance before using it in production, since complex models may take non-negligible time to be deserialized.