Machine Learning with R

Amazing ML libraries to use in R

Zoshua Colah
Nov 8, 2018

The no-nonsense guide to Machine Learning libraries to use in R

  • AnomalyDetection - AnomalyDetection R package from Twitter.
  • ahaz — Regularization for semiparametric additive hazards regression.
  • arules — Mining Association Rules and Frequent Itemsets
  • bigrf — Big Random Forests: Classification and Regression Forests for Large Data Sets
  • bigRR — Generalized Ridge Regression (with special advantage for p >> n cases)
  • bmrm — Bundle Methods for Regularized Risk Minimization Package
  • Boruta — A wrapper algorithm for all-relevant feature selection
  • BreakoutDetection - Breakout Detection via Robust E-Statistics from Twitter.
  • bst — Gradient Boosting
  • CausalImpact - Causal inference using Bayesian structural time-series models.
  • C50 — C5.0 Decision Trees and Rule-Based Models
  • caret - Classification and Regression Training
  • Clever Algorithms For Machine Learning
  • CORElearn — Classification, regression, feature evaluation and ordinal evaluation
  • CoxBoost — Cox models by likelihood based boosting for a single survival endpoint or competing risks
  • Cubist — Rule- and Instance-Based Regression Modeling
  • e1071 — Misc Functions of the Department of Statistics (e1071), TU Wien
  • earth — Multivariate Adaptive Regression Spline Models
  • elasticnet — Elastic-Net for Sparse Estimation and Sparse PCA
  • ElemStatLearn — Data sets, functions and examples from the book: “The Elements of Statistical Learning, Data Mining, Inference, and Prediction” by Trevor Hastie, Robert Tibshirani and Jerome Friedman
  • evtree — Evolutionary Learning of Globally Optimal Trees
  • forecast - Time series forecasting using ARIMA, ETS, STLM, TBATS, and neural network models
  • forecastHybrid — Automatic ensemble and cross validation of ARIMA, ETS, STLM, TBATS, and neural network models from the “forecast” package
  • prophet - Tool for producing high-quality forecasts for time series data with multiple seasonalities and linear or non-linear growth.
  • FSelector - A feature selection framework, based on subset-search or feature ranking approaches.
  • frbs — Fuzzy Rule-based Systems for Classification and Regression Tasks
  • GAMBoost — Generalized linear and additive models by likelihood based boosting
  • gamboostLSS — Boosting Methods for GAMLSS
  • gbm — Generalized Boosted Regression Models
  • glmnet - Lasso and elastic-net regularized generalized linear models
  • glmpath — L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model
  • GMMBoost — Likelihood-based Boosting for Generalized mixed models
  • grplasso — Fitting user specified models with Group Lasso penalty
  • grpreg — Regularization paths for regression models with grouped covariates
  • h2o - Deep learning, random forests, GBM, k-means, PCA, GLM
  • hda — Heteroscedastic Discriminant Analysis
  • ipred — Improved Predictors
  • kernlab — kernlab: Kernel-based Machine Learning Lab
  • klaR — Classification and visualization
  • kohonen — Supervised and Unsupervised Self-Organising Maps.
  • lars — Least Angle Regression, Lasso and Forward Stagewise
  • lasso2 — L1 constrained estimation aka ‘lasso’
  • LiblineaR — Linear Predictive Models Based On The Liblinear C/C++ Library
  • lme4 - Mixed-effects models
  • LogicReg — Logic Regression
  • maptree — Mapping, pruning, and graphing tree models
  • mboost — Model-Based Boosting
  • Machine Learning For Hackers
  • mlr - Extensible framework for classification, regression, survival analysis and clustering
  • mvpart — Multivariate partitioning
  • MXNet - MXNet brings flexible and efficient GPU computing and state-of-the-art deep learning to R.
  • ncvreg — Regularization paths for SCAD- and MCP-penalized regression models
  • nnet - Feed-forward Neural Networks and Multinomial Log-Linear Models
  • oblique.tree — Oblique Trees for Classification Data
  • pamr — Pam: prediction analysis for microarrays
  • party — A Laboratory for Recursive Partytioning
  • partykit — A Toolkit for Recursive Partytioning
  • penalized — L1 (lasso and fused lasso) and L2 (ridge) penalized estimation in GLMs and in the Cox model
  • penalizedLDA — Penalized classification using Fisher’s linear discriminant
  • penalizedSVM — Feature Selection SVM using penalty functions
  • quantregForest — quantregForest: Quantile Regression Forests
  • randomForest — randomForest: Breiman and Cutler’s random forests for classification and regression.
  • randomForestSRC — randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC).
  • ranger — A Fast Implementation of Random Forests.
  • rattle — Graphical user interface for data mining in R.
  • rda — Shrunken Centroids Regularized Discriminant Analysis
  • rdetools — Relevant Dimension Estimation (RDE) in Feature Spaces
  • REEMtree — Regression Trees with Random Effects for Longitudinal (Panel) Data
  • relaxo — Relaxed Lasso
  • rgenoud — R version of GENetic Optimization Using Derivatives
  • rgp — R genetic programming framework
  • Rmalschains — Continuous Optimization using Memetic Algorithms with Local Search Chains (MA-LS-Chains) in R
  • rminer — Simpler use of data mining methods (e.g. NN and SVM) in classification and regression
  • ROCR — Visualizing the performance of scoring classifiers
  • RoughSets — Data Analysis Using Rough Set and Fuzzy Rough Set Theories
  • rpart — Recursive Partitioning and Regression Trees
  • RPMM — Recursively Partitioned Mixture Model
  • RSNNS — Neural Networks in R using the Stuttgart Neural Network Simulator (SNNS)
  • Rsomoclu — Parallel implementation of self-organizing maps.
  • RWeka — R/Weka interface
  • RXshrink — RXshrink: Maximum Likelihood Shrinkage via Generalized Ridge or Least Angle Regression
  • sda — Shrinkage Discriminant Analysis and CAT Score Variable Selection
  • SDDA — Stepwise Diagonal Discriminant Analysis
  • SuperLearner and subsemble — Multi-algorithm ensemble learning packages.
  • svmpath — svmpath: the SVM Path algorithm
  • tgp — Bayesian treed Gaussian process models
  • tree — Classification and regression trees
  • varSelRF — Variable selection using random forests
  • xgboost - eXtreme Gradient Boosting Tree model, well known for its speed and performance.
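
To give a feel for how the packages in the list above are typically used, here is a minimal sketch with rpart, which ships with standard R installations as a recommended package. The built-in iris data set, the 100/50 train/test split, and the variable names (train_idx, fit, pred, accuracy) are assumptions chosen for illustration; they are not part of the original list.

```r
# rpart: Recursive Partitioning and Regression Trees (from the list above).
library(rpart)

set.seed(42)                                  # make the random split reproducible
train_idx <- sample(nrow(iris), size = 100)   # illustrative 100/50 split (an assumption)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit a classification tree that predicts Species from the four measurements.
fit <- rpart(Species ~ ., data = train, method = "class")

# Predict class labels for the held-out rows and compute the share predicted correctly.
pred <- predict(fit, newdata = test, type = "class")
accuracy <- mean(pred == test$Species)

print(table(predicted = pred, actual = test$Species))
cat("Held-out accuracy:", round(accuracy, 3), "\n")
```

Many of the other modelling packages listed follow the same broad pattern of a formula or matrix interface for fitting plus a predict() method, though argument names differ from package to package, so check each package's documentation before adapting this sketch.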

Thank you for reading. A big thank you to https://github.com/qinwf/awesome-R#2018

Data Science Library

