Rpy2 Made Easy
A consolidated guide to using Rpy2 for beginners
Two data scientists walked into a bar. Both of them wanted to use R from Python. Lo and behold — Rpy2.
Rpy2 (R embedded in Python) can be pretty intimidating to use, mainly because it is not as well fleshed out as R and Python are individually. It is also much easier to use R and Python by themselves, but certain software architectures require combining the two. Python’s intuitive data structures, visualization libraries and great IDEs, mixed with R’s trusted packages, make a solid toolkit for a data scientist.
How to Set Up
- Download and install Python 3 and R, if you haven’t already.
- Install Rpy2 from here
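Before moving on, it can help to confirm that both pieces are visible from Python. The sketch below is a minimal check that uses only the standard library; the helper name check_rpy2_prerequisites is made up for this article. It assumes R is launched through an executable named "R" on the PATH, which is common but not guaranteed, since R can also be located in other ways (for example through the R_HOME environment variable).

```python
import importlib.util
import shutil


def check_rpy2_prerequisites():
    """Report whether the rpy2 package and an R executable can be found."""
    return {
        # True when Python can locate an installed rpy2 package
        "rpy2_installed": importlib.util.find_spec("rpy2") is not None,
        # True when an executable named "R" is on the PATH (assumption: see above)
        "r_on_path": shutil.which("R") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_rpy2_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

If either entry comes back as missing, fix the installation first; the examples that follow need both.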
Using R packages and functions from Python
The following example shows how to call existing R functions from Python using the appropriate packages. The imports to be made are the following:
#Import necessary packages
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
#Must be activated
pandas2ri.activate()
The snippet below shows how to import R objects and packages into Python. The example uses R's time series object (ts) and the forecast package. Make sure these work in R on their own before calling them from Python.
time_series=robjects.r('ts')
forecast_package=importr('forecast')
Time to use functions from the forecast package in Python:
#Converting the training data into an R time series object
#(df is assumed to be a pandas DataFrame with an "Actuals" column)
r_times_series_data=time_series(df["Actuals"].values,frequency=12)
#Fit the time series into a model using auto_arima
fit_arima=forecast_package.auto_arima(r_times_series_data,seasonal=False)
#Getting the forecast value
forecasted_arima=forecast_package.forecast(fit_arima,h=10,level=95.0)
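Notice that R's auto.arima is called as auto_arima above. By default, importr exposes R names with the dots replaced by underscores, because a dot is not valid inside a Python attribute name. The sketch below mimics that renaming in plain Python so you can predict the attribute name for any R function. The helpers r_to_python_name and find_name_collisions are hypothetical names written for this article, not part of rpy2, and they only approximate the default behaviour.

```python
def r_to_python_name(r_name):
    """Approximate importr's default renaming: dots become underscores."""
    return r_name.replace(".", "_")


def find_name_collisions(r_names):
    """Group R names that would map to the same Python attribute name."""
    seen = {}
    for r_name in r_names:
        seen.setdefault(r_to_python_name(r_name), []).append(r_name)
    # Keep only the Python names claimed by more than one R name
    return {py: rs for py, rs in seen.items() if len(rs) > 1}


print(r_to_python_name("auto.arima"))
print(find_name_collisions(["auto.arima", "forecast"]))
```

When two R names in the same package collapse to one Python name, importr cannot pick for you; its robject_translations argument lets you supply your own mapping in that case.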
The variable “forecasted_arima” is a ListVector, an R data type. One can print its element names (forecasted_arima.names) to find the desired element. The following lines can be used to find the forecasted result:
#Index 3 (counting from 0) holds the forecast values for ARIMA.
#NOTE: THE INDEX MAY NOT BE SAME FOR ALL MODELS
arima_output=forecasted_arima[3]
Alternatively, one can access the elements by name as follows:
#Alternate way to find forecasted result
arima_output=forecasted_arima.rx2("mean")
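To see why access by name is safer than access by index, it helps to pair every element name with its value. The sketch below does that for any object that exposes a names attribute and can be iterated over, which is assumed here to include rpy2's ListVector. Since it should run without R, it uses a made-up stand-in class, FakeForecast, with invented names and values; it is not real forecast output.

```python
def named_elements(r_list):
    """Map each element name of an R-style list to its value.

    Works with any object that exposes `.names` and is iterable,
    which is assumed here to include rpy2's ListVector.
    """
    return dict(zip(r_list.names, r_list))


class FakeForecast(list):
    """Hypothetical stand-in for an R forecast object, for illustration only."""
    names = ["method", "model", "level", "mean"]


# Invented values: only the position of "mean" matters for this demo
demo = FakeForecast(["some method", None, 95.0, [101.2, 102.8]])

elements = named_elements(demo)
print(list(elements))      # element names, in list order
print(elements["mean"])    # the same element as demo[3]
```

With a real forecast object you would pass forecasted_arima instead of demo. A lookup by name keeps working even when a different model puts the forecast at a different position.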
Calling Existing R functions from Python
Suppose one has R code as follows:
#Saved in the file Forecast_r_function.r
library(forecast)
Forecast_r_function= function(actuals, freq){
  y <- ts(actuals,frequency = freq)
  fit <- auto.arima(y, seasonal=TRUE)
  forecasted = forecast(fit, h=5)
  return (forecasted)
}
To call the above forecast function from Python, the following import needs to be made:
#Import the SignatureTranslatedAnonymousPackage
from rpy2.robjects.packages import STAP
#Read the file with the R code snippet
with open('Forecast_r_function.r', 'r') as f:
    string = f.read()
#Parse using STAP
forecast_func_in_python= STAP(string, "Forecast_r_function")
The above step makes the R function available from python. It can be accessed by:
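#Calling R function with the actuals and the seasonal frequency
#(df is assumed to be a pandas DataFrame with an "Actuals" column)
forecasted_arima=forecast_func_in_python.Forecast_r_function(df["Actuals"].values, 12)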
#storing result
arima_output=forecasted_arima.rx2("mean")
Conclusion
There are many other ways Rpy2 can be used in a Python framework, including ways to write R code in Python itself. However, those are more prone to error than the ones shown here.
Data scientists regularly work with both R and Python, and an interface that embeds one in the other is truly powerful if used well. Despite not being very popular, rpy2 can give data scientists and developers alike a technical edge when designing and integrating a holistic system.
Please read the rpy2 documentation thoroughly here.