Published in Analytics Vidhya

Azure Machine Learning : What I learned

Azure Machine Learning studio is a great place for data science experiments. We can create experiments using a visual designer, with code from an IDE like Visual Studio Code, or in Jupyter notebooks with programming languages like Python and R. Let’s first look at the lifecycle of a data science project:

There are four major lifecycle phases in data science:

  • Data Acquisition and Understanding
  • Modeling
  • Deployment
  • Model Monitoring
Source : Course studies

Based on the process requirements, the options follow from two questions: what do we want to use, and how do we want to use it? The following picture gives a fairly clear idea of how we would build, develop, deploy and consume the model.

Source : Microsoft

Let’s look at some of the concepts of the Azure Machine Learning workspace. Once we create an Azure Machine Learning workspace in Azure, we can navigate through various options in the studio:

Source : Microsoft
  • Datastores : Datastores securely connect to sources such as file shares, databases or storage, and provide an abstraction layer over them so that datasets can consume the data.
  • Datasets : A dataset references a datastore for use in experiments.
  • Compute : Under Compute, we can create different compute targets for running ML experiments.
  • Assets : Assets are used to create, train, deploy, and monitor machine learning models.
  • Author : This section is used to create Notebooks and experiments with Automated ML and the visual designer.

Creating an Experiment:

Once a datastore is created and a dataset references it, the dataset can be imported into the workspace to create experiments.

While the designer offers no-code solutions, there are a number of significant advantages to working within a notebook. First, the process is reproducible. Other users can open your notebook, step through, modify or rerun any of the cells. They can copy the notebook or simply save and checkpoint a different version. This makes collaboration much easier than working with the designer. In addition, you can annotate your code with markdown cells. This allows you to add comments, reference other notebooks or websites and solicit feedback or recommendations from other users. And finally, you can share your work in a number of formats.

In the designer, we can manage the Azure Machine Learning data sources through the user interface and set up connections to files in Azure Blob Storage, a SQL database, or a website. We can also register and manage datasets through the user interface. We can then see all of the datasets in our pipeline and simply drag them onto the workspace.

Cleaning, Normalizing and Transforming Raw Data

In this process we take input data and transform it so that the data we use for our machine learning experiments is in the optimal form for generating our models. This process includes cleaning, normalizing and transforming our data.
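As a rough illustration of these three steps outside the designer, here is a minimal Python sketch. The function names and the sample values are hypothetical and are not part of Azure Machine Learning. Min-max scaling and a log transform are just two common choices, not the only ones.

```python
import math


def drop_missing(values):
    """Cleaning: remove None and NaN entries."""
    return [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]


def min_max_normalize(values):
    """Normalizing: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Transforming: compress a long right tail with log(1 + x); needs x >= 0."""
    return [math.log1p(v) for v in values]


# Hypothetical raw column with two missing entries.
raw = [2.0, None, 4.0, float("nan"), 10.0]
cleaned = drop_missing(raw)
scaled = min_max_normalize(cleaned)
transformed = log_transform(cleaned)
```

In the designer, the same kind of work is done by dragging data transformation modules onto the canvas instead of writing functions.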

Outliers are observations that fall outside of the expected range, for example, 999,000 millimeters of rain falling in an hour. The first question we must ask is whether the observation is a measurement error, a data error, or a true outlier.

We may consider an observation to be an outlier if it is:

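  • More than a chosen number of standard deviations away from the mean (three is a common choice)
  • or well outside the interquartile range, commonly more than 1.5 times the interquartile range below the first quartile or above the third quartile.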
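Both checks can be sketched in a few lines of Python using only the standard library. This is a minimal sketch and not the designer's own implementation. The thresholds (3 standard deviations, 1.5 times the interquartile range) are common conventions assumed here, and the rainfall readings are made up for illustration.

```python
from statistics import mean, quantiles, stdev


def zscore_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


def iqr_outliers(values, k=1.5):
    """Return values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical hourly rainfall readings in millimeters, with one bad entry.
rainfall_mm = [0, 2, 3, 5, 4, 1, 0, 6, 3, 2, 1, 4, 2, 0, 3, 5, 2, 1, 3, 999000]

flagged_by_zscore = zscore_outliers(rainfall_mm)
flagged_by_iqr = iqr_outliers(rainfall_mm)
```

Note that an extreme value inflates the mean and the standard deviation, so the standard deviation check can miss outliers in small samples. The interquartile range check is less sensitive to that effect. Either way, flagged values still need the judgment described above before they are removed.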

Training, Evaluating and Refining Machine Learning Models

Let’s look at the landscape of machine learning algorithms. Different types of machine learning algorithms are used to solve different types of problems.

The Azure Machine Learning Algorithm Cheat Sheet helps you choose the right algorithm from the designer for a predictive analytics model.

Azure Machine Learning has a large library of algorithms from the classification, recommender systems, clustering, anomaly detection, regression, and text analytics families. Each is designed to address a different type of machine learning problem.

Download the cheat sheet here: Machine Learning Algorithm Cheat Sheet

Deployment and Machine Learning Pipelines

There are a number of deployment scenarios with Azure Machine Learning studio. Machine learning models can be deployed to Azure Machine Learning compute instances, to Azure Kubernetes Service and to Azure Container Instances, and we can also deploy a GPU-enabled model.

Let’s open this project in the designer, scroll down to the Train Model module and click on Outputs and logs. I can see the trained model under Data outputs. The button on the right allows me to register this model.

There are four steps to preparing an Azure machine learning model for deployment.

  • First, define the inference environment. The inference environment defines how to set up the web service that will contain your model.
  • Next, define the scoring code. The scoring code, or entry script, receives the data submitted to the deployed web service and passes it to the model, here the one from the automated ML experiment.
  • The next step is to define the inference configuration. The inference configuration puts everything together. It defines the environment configuration, the entry script and any other components needed to run the model as a service.
  • The final step in preparing to deploy the model is to profile it. Profiling tests the service for the model and returns information such as the CPU usage, memory usage and response latency.
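To make the scoring code step more concrete, here is a minimal sketch of an entry script. Azure Machine Learning entry scripts conventionally expose an `init()` function, called once when the service starts, and a `run()` function, called for every request. Everything else here is an assumption for illustration: the `_StubModel` class stands in for a real registered model, and the `"data"` key in the JSON payload is just a common convention.

```python
import json

model = None  # populated once by init()


class _StubModel:
    """Hypothetical stand-in for a trained model loaded from the registered model files."""

    def predict(self, rows):
        # Placeholder logic: a real model would return its predictions here.
        return [sum(row) for row in rows]


def init():
    """Called once when the web service starts: load the model into memory."""
    global model
    model = _StubModel()


def run(raw_data):
    """Called per request: parse the JSON payload, score it and return the result."""
    try:
        rows = json.loads(raw_data)["data"]
        return {"result": model.predict(rows)}
    except Exception as exc:
        # Return the error to the caller instead of crashing the service.
        return {"error": str(exc)}
```

Calling `init()` and then `run('{"data": [[1, 2], [3, 4]]}')` locally is a quick way to check the script before it is referenced from the inference configuration.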

Deploying the model is made very easy with Azure Machine Learning studio: we select the latest experiment run, where we can select the model within the Best model summary. The Deploy option lets us deploy the model to an Azure Kubernetes Service cluster or an Azure Container Instance.

While deploying the model, we can set the checkbox to enable Application Insights diagnostics for data collection.

Once the model is deployed, it is available for consumption as a web service and can be consumed via the Azure Machine Learning test service or with API testing tools.
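The deployed web service can also be called from code. Below is a minimal sketch using only the Python standard library. The function names, the `"data"` payload shape and the placeholder URI are assumptions for illustration. The real scoring URI and key come from the endpoint's details in the studio, and the expected payload depends on the entry script.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows, api_key=None):
    """Build a POST request carrying the input rows as JSON."""
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Key-based authentication sends the key as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


def score(scoring_uri, rows, api_key=None):
    """Send the rows to the web service and return the parsed JSON response."""
    request = build_scoring_request(scoring_uri, rows, api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; the URI is a placeholder.
example_request = build_scoring_request(
    "http://localhost/score", [[1, 2]], api_key="my-key"
)
```

Separating `build_scoring_request` from `score` keeps the payload and headers easy to inspect before anything is sent to the endpoint.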

In this article, I have tried to keep the information on each Azure Machine Learning topic at a high level. In the following articles I will try to cover in-depth scenarios about Notebooks, AutoML and the internals of Machine Learning.

Till then, keep learning, keep sharing.



Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem

Milind Chavan

An Azurer, Web developer, Technologist, Writer, Poet, Runner. Opinions are my own.