Model Training Components in Azure Machine Learning: An Overview

Joanne Jons
Let’s Deploy Data.
3 min readAug 5, 2020

Machine learning is a data science technique used to extract patterns from data, allowing computers to identify related data and forecast future outcomes, behaviours and trends.

Simply put, machine learning involves functions to which data and answers are input and a set of rules are output.

A machine learning process mainly consists of the following steps:

Microsoft Azure Machine Learning Studio is a cloud-based environment that can be used to train, deploy, automate, manage and track machine learning models. It is a managed service that provides a comprehensive environment to implement model training processes and lays out a centralised place to work with all the artefacts involved in the process. The Azure Machine Learning Workspace provides tools which help in building, training and tracking highly accurate machine learning and deep-learning models using programming languages like Python/R or low-code options like the designer.

A typical workflow:

Source: Microsoft Azure Documentation

The ultimate goal of model training is to produce a model that predicts the value of some output feature y, given a set of input features X. Model training is the phase during which the model ‘learns’.

The first step is to select an appropriate algorithm depending on the type of problem at hand. There are mainly three methods of learning: supervised learning, unsupervised learning and reinforcement learning. Based on the algorithm chosen, hyperparameters are set before the training starts.

Once the dataset is analysed, cleaned and prepared, it is split into three parts: Training data, Validation data and Test data. Model training is an iterative process. It involves learning the model parameters and fine tuning the hyperparameters. Training data is used to learn the values for the model parameters while validation data is used to check the model’s performance and fine tune the hyperparameters. Since hyperparameters are set before the training, it’s best to tune them based on the performance of the model. Test data is used for evaluating the model in the evaluation phase.

Azure Machine Learning requires a computer target to be created and configured. A compute target is a cloud-based workstation that provides the user access to various deployment environments, such as Jupyter Notebooks.

Machine Learning training scripts can be developed in Python/R or using the visual designer. The visual designer in Azure Machine Learning will enable most of the things that can be done through code.

The tasks involved in a machine learning model training process are grouped into a container called the Experiment, which makes it easy to organise. One of the most important tasks in the process is the Run. The Run is a process that is delivered and executed in the compute resource. Training the model is an example of Run. Run will output a set of artefacts namely snapshot, output files, metrics and logs. The Model Registry is where all the models in the Azure Machine Learning workspace is kept track of.

Taxonomy of Azure Machine Learning Workspace (Source: Udacity Introduction to Machine Learning on Azure)

Content Reference:
Microsoft Scholarship Foundation Course (offered through Udacity)
Microsoft Azure Documentation

--

--