Azure Machine Learning Service
Data Science & Azure Machine Learning Service — An introduction
Background
I have been practicing data science, developing ML pipelines for the financial services, pharmaceutical, health care, and consulting domains. I have built both machine learning and deep learning models, and I recently cleared the DP-100 Microsoft Azure Data Scientist certification. As part of my passion for continuous learning, I am planning to write a series of articles on data science, covering machine learning and deep learning, from the point of view of the Microsoft Azure Machine Learning service.
Data Science — A generic overview
First, a brief synopsis of data science. In a nutshell, data science is all about data: it helps us understand data from a statistical point of view. It is an intensely quantitative field, requiring strong fundamentals in mathematics, statistics, computation, data structures, and programming. In today's data-driven world, data science broadly supports three interrelated areas: AI, machine learning, and deep learning (deep learning is a subset of machine learning, which in turn is a subset of AI). Their relationship is shown below:
With this background, and without further ado, let us focus on our topic of interest: how to leverage the feature-rich Azure Machine Learning platform to solve machine learning business cases.
Azure Machine Learning
So, what is Azure Machine Learning? It is a cloud-based platform for training, deploying, and managing ML models, and more generally for operating machine learning workloads in the cloud. Azure Machine Learning supports all the activities associated with a typical machine learning business case:
- Understanding data
- Data preprocessing, including featurization
- Descriptive statistics
- Model building
- Model tuning — Hyperparameters
- Model evaluation
- Model deployment (MLOps pipeline)
In addition, it supports other powerful and useful features such as data drift monitoring. Data drift is a change over time in the data a model receives, relative to the data it was trained on, which can degrade the trained model's performance. Taking timely action based on data drift metrics helps avoid many pitfalls that data scientists face in today's production setups.
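To make the idea of a drift metric concrete, here is a minimal sketch in plain Python. It uses the Population Stability Index (PSI), a common generic drift measure; this is an illustrative assumption and not necessarily the metric Azure ML computes. The function name, the bin count, and the synthetic samples are all hypothetical.

```python
import math
import random


def population_stability_index(baseline, current, bins=10):
    """Return the PSI between two numeric samples (larger means more drift)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the baseline range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    base = bin_fractions(baseline)
    curr = bin_fractions(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, curr))


# Synthetic data (made up for illustration), seeded for repeatability.
rng = random.Random(0)
training = [rng.gauss(50, 10) for _ in range(5000)]   # data the model was trained on
same_dist = [rng.gauss(50, 10) for _ in range(5000)]  # later data, no drift
shifted = [rng.gauss(60, 10) for _ in range(5000)]    # later data, mean has moved

psi_stable = population_stability_index(training, same_dist)
psi_drifted = population_stability_index(training, shifted)
print(f"PSI without drift: {psi_stable:.3f}")
print(f"PSI with drift:    {psi_drifted:.3f}")
```

The sample whose mean has moved should score much higher than the one drawn from the training distribution. In practice, a metric like this would be tracked per feature over time so that a rising value can trigger investigation or retraining.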
Azure ML Architecture
The following diagram shows Azure Machine Learning architecture:
As the diagram above shows, Azure Machine Learning (henceforth: Azure ML) draws its compute resources from the Azure cloud. This combination of cloud and data science makes it well suited to designing and training resource-intensive machine learning models. In particular, when a model is computationally intensive (for example, boosting or neural networks) and the data volume is large, cloud-based Azure ML can supply a powerful compute target to train it much faster.
Features supported by Azure Machine Learning
As shown in the diagram above, Azure ML supports the following business-critical features out of the box:
1. Scalable on-demand compute: Azure ML supports many kinds of compute (also known as compute targets), including its powerful compute cluster. Compute scales automatically on demand for machine learning workloads, which in business terms means you pay only for what you use.
2. Data storage and connectivity: Azure ML provides an easy-to-use yet powerful data storage layer. It is flexible enough that stored data can be easily loaded into a Python pandas DataFrame or a Spark DataFrame.
3. Metrics and monitoring: This feature is very helpful to data scientists and machine learning engineers. You can collect various metrics for a given business case and compare how the different metrics behave.
Other features include ML orchestration using pipelines and model registration.
Machine Learning features
The Azure Machine Learning platform provides the means to implement both machine learning and deep learning. Broadly, it supports the following types of machine learning task, covering both supervised and unsupervised learning:
1. Classification
2. Regression
3. Time series forecasting
4. Clustering
5. Recommenders
Two powerful Azure ML features are helpful both to seasoned data scientists and to those without the required mathematical background:
1. Designer: a drag-and-drop interface built from modules (see below for the definition of a module). This is essentially a no-code machine learning feature.
2. AutoML: automated ML. With this feature, you let Azure ML run a group of algorithms and pick the best one based on a metric. For example, the best classification algorithm could be chosen by the area under the ROC curve (ROC AUC). It is also flexible: the user can block a list of algorithms from being considered.
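The selection idea behind AutoML can be sketched in a few lines of plain Python. This is only a toy illustration of "try several candidates, score each on one metric, skip the blocked ones, keep the best"; it is not how Azure ML implements AutoML. The function names, the candidate "algorithms" (simple scoring rules), and the data are all hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A tie between a positive and a negative counts as half a correct pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pick_best(candidates, rows, labels, blocked=()):
    """Score every non-blocked candidate and return (best_name, all_scores)."""
    results = {}
    for name, scorer in candidates.items():
        if name in blocked:
            continue  # the user asked not to consider this candidate
        results[name] = roc_auc(labels, [scorer(r) for r in rows])
    best = max(results, key=results.get)
    return best, results


# Toy data: (age, income) rows with a binary label. All values are made up.
rows = [(25, 30), (40, 45), (47, 80), (51, 66), (38, 90), (29, 35), (60, 75), (44, 70)]
labels = [0, 0, 1, 1, 1, 0, 1, 0]

# Hypothetical stand-ins for real algorithms: each maps a row to a score.
candidates = {
    "age_only": lambda r: r[0],
    "income_only": lambda r: r[1],
    "age_plus_income": lambda r: r[0] + r[1],
}

best, scores = pick_best(candidates, rows, labels)
best_blocked, scores_blocked = pick_best(
    candidates, rows, labels, blocked={"age_plus_income"}
)
print("Best overall:", best, scores)
print("Best with age_plus_income blocked:", best_blocked)
```

A real automated search would also split the data for validation, tune hyperparameters, and apply preprocessing; the sketch keeps only the metric-driven selection and the block list described above.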
In Azure ML, a module (UI component) represents a set of code that can run independently and perform a machine learning task, given the required inputs. A module might contain a particular algorithm, or perform a task that is important in machine learning, such as missing value replacement (for data preprocessing), or statistical analysis.
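As a rough analogy, a module behaves like a self-contained function: it takes the required inputs, performs one task, and returns an output that the next step can consume. The sketch below imitates a missing-value-replacement step in plain Python. The function name, the choice of mean imputation, and the sample records are assumptions made for illustration; this is not the Designer module's actual code.

```python
def replace_missing_with_mean(rows, column):
    """Return a copy of rows where None in `column` is replaced by the column mean."""
    present = [row[column] for row in rows if row[column] is not None]
    if not present:
        raise ValueError(f"column {column!r} has no observed values")
    mean = sum(present) / len(present)
    # Build new dicts so the input rows are left unchanged.
    return [
        {**row, column: mean if row[column] is None else row[column]}
        for row in rows
    ]


# Made-up records with one missing age.
patients = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 50},
]

cleaned = replace_missing_with_mean(patients, "age")
print(cleaned[1])
```

Because the function depends only on its inputs and does not modify them, it can be run independently and chained with other steps, which is the property the module definition above describes.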
What Next?
This article summarized the benefits and features of the Azure Machine Learning service. In the next few articles, I will cover some Azure resource-level topics such as workspaces and data storage. Ultimately, my focus will be on Azure ML service features and their support for machine learning and deep learning. I personally feel that the Azure ML service, in its current form, either is missing some business-critical algorithms or does not cover them sufficiently. I may touch on these topics in future articles.
Please stay tuned for the next article in this series. In the meantime, I welcome readers' suggestions and feedback for improvement.