Azure Machine Learning Foundation

Preeti Sharma · Published in Simply Dev · Jul 19, 2020 · 10 min read

Part 1

Machine Learning is one of the most profound terms of the 20th century, and it has also become known as the most in-demand job of this era. But what is the reality behind it being the most in-demand technology?

The one-word answer: data.

For example, you have rice and you want to cook biryani. Getting good rice is our data here, and turning it into an output, say biryani, happens through a process. That processing, and the techniques and approaches behind it, is what machine learning is about.

Types of machine learning :

4. Semi-Supervised:

→ Supervised + Unsupervised.

→ A small amount of the data is labeled and a large amount is unlabeled.

→ Self-Training: Train a model on the labeled portion, then use it to label the rest of the data for further training.

→ Self-enabled Training: Training that depends on the chosen parameters.

Ok, now we have our data and we get our output, but how can we make it user-accessible, and how can we evaluate that output? Here Azure Machine Learning Studio comes into play.

Azure Machine Learning Studio

A low-code, goal-oriented platform.

Datastore: As the name suggests, a store that holds datasets. A dataset can be balanced or imbalanced. Since we work mostly through datastores, it helps to know which ones the Azure Machine Learning service supports (a registration sketch follows the list):

Blob Container

File Share

Data Lake

SQL Database

PostgreSQL

Databricks (Spark)
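To make this concrete, here is a minimal sketch of registering a Blob Container as a datastore with the Azure ML Python SDK (azureml-core). The storage account, container and key below are placeholders, not real values:

```python
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()  # assumes a config.json downloaded from the workspace

# Register an existing blob container as a datastore (names are illustrative)
blob_ds = Datastore.register_azure_blob_container(
    workspace=ws,
    datastore_name='demo_blob_store',
    container_name='training-data',
    account_name='mystorageaccount',
    account_key='<storage-account-key>',
)
print(blob_ds.name, blob_ds.datastore_type)
```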

Q. What’s the role of Azure services here?🤔

Azure dataset formats:

Tabular: created by parsing a number of files into a table.

File / Web URL: references a single file or multiple files (e.g. via web URLs) in your dataset.

Since we keep working on the dataset, we also have to track its version; for that we use Dataset Versioning.
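A minimal sketch of creating a Tabular dataset from a datastore and registering it with versioning enabled (the datastore name, path and dataset name are illustrative):

```python
from azureml.core import Workspace, Datastore, Dataset

ws = Workspace.from_config()
datastore = Datastore.get(ws, 'demo_blob_store')   # hypothetical datastore name

# Tabular dataset: parse delimited files into a table
tab_ds = Dataset.Tabular.from_delimited_files(path=(datastore, 'sales/*.csv'))

# Registering with create_new_version=True gives us dataset versioning
tab_ds = tab_ds.register(workspace=ws, name='sales-data', create_new_version=True)
print(tab_ds.name, tab_ds.version)

# Later, a specific version can be retrieved for reproducible training
old_ds = Dataset.get_by_name(ws, name='sales-data', version=1)
```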

Q. How is data processed?

New Data → Release → Data Preparation → Features

Dataset versioning means keeping track of the different versions of a dataset as it is updated within an organization, so that results stay reproducible.

If you have Numerical Data → Tabular Format

Translate text data if present → Numerical Dataset

Image Data → a matrix on top of the RGB channels; a 500*500-pixel image holds 3 values per pixel.🤔

Version Reference: Features or Instances

1. Feature Engineering: It means you can produce new features based on the values of existing features, and transform them into new inputs that will help us improve the performance.

→ One or more new features derived from the existing ones.

→ Sometimes a separate machine learning model is trained to generate the new features.

Tasks included in Feature Engineering are (a small sketch follows the list):

Aggregation: Mean, Median, Mode.

→ Part-of: e.g. extracting the month from a particular date.

→ Binning: grouping entities into bins and using them, for example bucketing customer ages into ranges and sending a happy, sad or angry emoji for a particular purchase.
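As a small illustration of these tasks, here is a pandas sketch on a made-up purchases table (the column names and values are invented for the example):

```python
import pandas as pd

df = pd.DataFrame({
    'customer_id': [1, 1, 2, 2, 3],
    'purchase_date': pd.to_datetime(
        ['2020-01-05', '2020-02-10', '2020-01-20', '2020-03-15', '2020-02-28']),
    'amount': [120.0, 80.0, 300.0, 150.0, 60.0],
    'age': [23, 23, 41, 41, 67],
})

# Aggregation: mean purchase amount per customer
df['mean_amount'] = df.groupby('customer_id')['amount'].transform('mean')

# Part-of: extract the month from the purchase date
df['purchase_month'] = df['purchase_date'].dt.month

# Binning: group customer age into ranges (e.g. to pick an emoji per range)
df['age_bin'] = pd.cut(df['age'], bins=[0, 30, 50, 120],
                       labels=['young', 'middle', 'senior'])
print(df)
```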

2. Feature Selection: Select the most relevant features.

Dimensionality Reduction

3. Dimensionality Reduction: when ML models cannot accommodate a large number of features, reduce them with a technique such as Principal Component Analysis (PCA), e.g. for customer behaviour data. 🤔

Approaches (a short sketch of PCA and t-SNE follows below):

Principal Component Analysis (Statistical Approach)

t-SNE (Probabilistic Approach)

Feature Embedding: Feature embedding is an emerging research area that intends to transform features from the original space into a new space to support effective learning.
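A short scikit-learn sketch of both approaches on the built-in digits dataset (64 features per image), just to show the shape of the transformed data:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)          # 1797 samples, 64 features

# PCA: statistical approach, keeps directions of maximum variance
X_pca = PCA(n_components=10).fit_transform(X)
print(X_pca.shape)                            # (1797, 10)

# t-SNE: probabilistic approach, mostly used for 2-D/3-D visualisation
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(X_tsne.shape)                           # (1797, 2)
```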

Terms and Concepts

  1. Flagging: Deriving Boolean values from the given dataset. For example, the problem statement is to calculate the average purchase amount; you would approach it by asking: does the person have a purchase in the last month? (A tiny sketch follows below.)

Frequency-based → Occurrences → Embeddings

🤔🙄

In simple terms: start with the problem statement, ask what main aim we are focused on, and then build new features from the existing features.
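A tiny pandas sketch of flagging: derive a Boolean feature answering "did the person make a purchase last month?" (the dates and the reference "now" are made up):

```python
import pandas as pd

df = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'last_purchase': pd.to_datetime(['2020-06-28', '2020-04-02', '2020-06-05']),
})

# Flagging: a Boolean column derived from the data ("now" is fixed for the example)
now = pd.Timestamp('2020-07-01')
df['purchased_last_month'] = df['last_purchase'] >= (now - pd.DateOffset(months=1))
print(df)
```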

Quiz⁉ :🙋🏼‍♀️

Azure pre-built Machine Learning models

Process :

Filter Based Feature Selection 📳: identifies the columns in the input dataset that have the greatest predictive power.

Permutation Feature Importance: determines the best features to use by computing feature importance scores (a quick scikit-learn sketch follows).
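As a rough scikit-learn equivalent of computing feature importance scores (this is not the Azure module itself, just a sketch of the idea with permutation_importance):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much the test score drops
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top = sorted(zip(X.columns, result.importances_mean), key=lambda p: -p[1])[:5]
for name, score in top:
    print(f'{name}: {score:.3f}')
```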

Quiz⁉🙋🏼‍♂️

Preemption

Preemption is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time. Such changes of the executed task are known as context switches. It is normally carried out by a privileged task or part of the system known as a preemptive scheduler, which has the power to preempt, or interrupt and later resume, other tasks in the system.

So basically a dataset is divided into 3 parts:

  1. Training Data: The data from which the model learns its parameter values.
  2. Validation Data: Used to evaluate model performance during tuning.
  3. Test Data: Used to verify that a given set of inputs to the model produces the expected results (a split sketch follows below).
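A minimal sketch of producing the three parts with scikit-learn (the 60/20/20 proportions are just an example):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First carve out the test set, then split the remainder into train and validation
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25,
                                                  random_state=42)
# Result: 60% training, 20% validation, 20% test
print(len(X_train), len(X_val), len(X_test))
```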

Now coming to Azure: the place where all the machine learning problems are dealt with is known as the Workspace.

The workspace is a container for working with the components of machine learning; it organizes the whole machine learning process around those components.

The service that provides snapshots and versioning of trained models is known as the Model Registry.

A cloud-based workstation that provides access to various development environments is known as a Compute Instance.
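A minimal sketch of connecting to a workspace and putting a trained model into the model registry with the Python SDK (the model file path and name are placeholders):

```python
from azureml.core import Workspace, Model

ws = Workspace.from_config()   # assumes config.json for an existing workspace

# Register a trained model file; the registry keeps a snapshot and bumps the version
model = Model.register(workspace=ws,
                       model_name='credit-risk-model',     # illustrative name
                       model_path='outputs/model.pkl')     # assumed local file
print(model.name, model.version)
```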

The threshold value (between 0 and 1, 0.5 by default) may need tuning to make the performance better.

🤔🤔🤔🤔

Solution: Rather than hand-training and comparing multiple models yourself, use ensemble learning or automated machine learning.

Classification

Multi-class vs multi-label classification

Multi-class Classification: A classification task with more than two classes. Multiclass classification makes the assumption that each sample is assigned to one and only one label.

Multi-label classification: assigns to each sample a set of target labels. This can be thought of as predicting properties of a data point that are not mutually exclusive, such as topics that are relevant for a document. A text might be about any of religion, politics, finance or education at the same time, or none of these.
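A small scikit-learn sketch contrasting the two (iris for multi-class, a synthetic dataset for multi-label):

```python
from sklearn.datasets import load_iris, make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Multi-class: each sample gets exactly one of more than two classes
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:3]))        # one class label per sample

# Multi-label: each sample may carry several labels at once (like document topics)
Xm, Ym = make_multilabel_classification(n_samples=200, n_classes=4, random_state=0)
multi = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(Xm, Ym)
print(multi.predict(Xm[:3]))     # a 0/1 vector of labels per sample
```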

Modeling

predictive modeling

Scoring

Scoring model: the trained model is scored against the test data; the score is the model's output on the train and test data.

The performance metric is then computed on this output, for example for a boosting algorithm evaluated on the training data.

→ Model accuracy drifts over time.

→ The model training process therefore does not finish after the first run; data drift has to be detected and the model retrained.

→ Monitor data drift alerts.

Ensemble Learning

🔡👉Boosting Algorithm :

→ Builds a strong learner out of weak learners.

→ Trains multiple models on the same input data, one after another.

→ Reduces bias.

→ Improves performance.

-Bias, Variance and test error relationship

🔡👉Bagging :

→ Reducing Overfitting

→ Random Sampling

→ Reduce Variance

→ Equally weighted average

🔡👉Stacking :

Combining the outputs of several machine learning models to obtain better overall predictions.
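A compact scikit-learn sketch of all three ensemble styles on the same dataset, only to show how they are wired up:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Boosting: weak learners trained one after another on the same data (reduces bias)
boosting = GradientBoostingClassifier(random_state=0)

# Bagging: random sampling with replacement, equally weighted average (reduces variance)
bagging = BaggingClassifier(random_state=0)

# Stacking: a meta-learner combines the outputs of several base models
stacking = StackingClassifier(
    estimators=[('rf', RandomForestClassifier(random_state=0)),
                ('gb', GradientBoostingClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))

for name, model in [('boosting', boosting), ('bagging', bagging), ('stacking', stacking)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```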

AutoML💪 :

Roadmap

→ Input criteria → Score each algorithm pipeline → Parameters (performance)

Input: common aspects between entities and the presence/absence of columns, used to produce results in unsupervised learning.

Parameters: inherited data, grouping of associates, Principal Component Analysis, clustering, feature extraction, anomaly detection, etc. (a configuration sketch follows below).
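A hedged sketch of what an automated ML run looks like with the azureml SDK. The dataset name, label column and compute target are assumptions for illustration, not values from this article:

```python
from azureml.core import Workspace, Dataset, Experiment
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
train_data = Dataset.get_by_name(ws, 'sales-data')           # assumed registered dataset

automl_config = AutoMLConfig(task='classification',
                             training_data=train_data,
                             label_column_name='churned',    # assumed label column
                             primary_metric='AUC_weighted',
                             iterations=20,
                             compute_target='cpu-cluster')   # assumed training cluster

run = Experiment(ws, 'automl-demo').submit(automl_config)
run.wait_for_completion(show_output=True)
best_run, fitted_model = run.get_output()                    # best pipeline found
```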

Clustering :

Applications of Clustering Algorithm :

Personalization and target marketing.

Document Classification

Fraud detection, House type prediction, Medical Image.

Types:

Difference between hierarchical and centroid-based clustering

👉Centroid-Based: Organizes the data into clusters based on how close each point is to a cluster centroid.

👉Hierarchical-Based: Clusters are organized in a tree-like structure.

👉Distribution-Based and Density-Based clustering

K-Means Clustering

→ Creates up to a target number (k) of clusters.

→ Minimizes the distance between each point and the centroid of its cluster.

→ A centroid-based type of clustering.

→ It comes under unsupervised machine learning.

Roadmap🗺

Initialize Centroids → Assign Points to Clusters → Move Centroids → Check for Convergence.

Points to keep in mind (sketched in code after this list):

Number of centroids

Initialization approach

Distance metric: Euclidean

Normalize features

Assign a cluster label to each point

Iteration
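Here is a scikit-learn sketch that touches each of these points (synthetic blob data; Euclidean distance is the KMeans default):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Normalize features so the Euclidean distance treats them equally
X_scaled = StandardScaler().fit_transform(X)

# Number of centroids, initialization approach and iteration limit are explicit
kmeans = KMeans(n_clusters=4, init='k-means++', n_init=10, max_iter=300, random_state=42)
labels = kmeans.fit_predict(X_scaled)     # assigns a cluster label to each point

print(kmeans.cluster_centers_)            # the moved centroids after convergence
print(kmeans.inertia_)                    # sum of squared distances being minimized
```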

Relationship between NN, DL, ML, and AI

Autoencoders

Train to reproduce inputs

Due to limited resources, the number of nodes in the hidden layer is kept small. The input values themselves act as the labels against which the autoencoder's output is compared.

It is a multi-layer representation.

Produces a feature vector.

The metric used to train the autoencoder is the root mean squared error. Therefore, in layman's terms, the highest acceptable value of this metric acts as the threshold.

📝Sometimes we use deep-learning encoders instead of one-hot encoding.
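A minimal Keras sketch of the idea: a bottleneck layer with fewer nodes, the inputs reused as the labels, and RMSE tracked during training (the data here is random, just to make the snippet runnable):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(1000, 20).astype('float32')       # toy data: 20 input features

autoencoder = keras.Sequential([
    layers.Dense(8, activation='relu', input_shape=(20,)),  # encoder / bottleneck
    layers.Dense(20, activation='linear'),                  # decoder reproduces inputs
])
autoencoder.compile(optimizer='adam',
                    loss='mse',
                    metrics=[keras.metrics.RootMeanSquaredError()])

# The inputs are also the labels: the network is trained to reproduce X from X
autoencoder.fit(X, X, epochs=5, batch_size=32, verbose=0)

# The bottleneck activations can then serve as a learned feature vector
encoder = keras.Model(autoencoder.inputs, autoencoder.layers[0].output)
features = encoder.predict(X)
print(features.shape)                                 # (1000, 8)
```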

Specialized cases of machine learning

👉Specialized case | Approach

Similarity Learning | Supervised

Text Classification | Supervised (Classification)

Feature Learning | Supervised (Classification), Unsupervised (Clustering)

Anomaly Detection | Supervised (Classification), Unsupervised (Clustering)

Forecasting (time-series, future prediction) | Supervised

Similarity Learning

Recommendation System: It means recommending something to the user according to his/her preferences.

1. Content-Based: Uses features of both users and items.

2. Collaborative Filtering: Uses only identifiers for users and items, not their properties. It is subdivided by how the rating information is obtained. Implicit: history of purchases. Explicit: the user gives a rating, e.g. 4/5.

Code for reference :
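For reference, a minimal sketch of item-based collaborative filtering with explicit ratings (the users, items and ratings below are made up):

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Explicit ratings: rows = users, columns = items, 0 = not rated yet
ratings = pd.DataFrame(
    [[5, 4, 0, 1],
     [4, 0, 0, 1],
     [1, 1, 0, 5],
     [0, 1, 5, 4]],
    index=['u1', 'u2', 'u3', 'u4'],
    columns=['item_a', 'item_b', 'item_c', 'item_d'])

# Item-item similarity computed only from the ratings, not from item properties
item_sim = pd.DataFrame(cosine_similarity(ratings.T),
                        index=ratings.columns, columns=ratings.columns)

def recommend(user):
    # Score each unseen item as a similarity-weighted sum of the user's ratings
    scores = ratings.loc[user] @ item_sim
    return scores[ratings.loc[user] == 0].sort_values(ascending=False)

print(recommend('u2'))
```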

Forecasting :

Time-series datasets are considered under this method. It deals with prediction in the context of an ordered dataset and is a form of multivariate regression.

Algorithm: ARIMA

Prophet: e.g. when a disease is not likely to occur in general but does occur in certain other scenarios.
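A minimal statsmodels sketch of ARIMA on a tiny monthly series (the numbers are illustrative; a real run would pull the registered time-series dataset):

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Toy monthly series; in practice this would come from the time-series dataset
series = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118],
                   index=pd.date_range('2019-01-01', periods=12, freq='MS'))

model = ARIMA(series, order=(1, 1, 1))   # (p, d, q): AR terms, differencing, MA terms
fitted = model.fit()
print(fitted.forecast(steps=3))          # predict the next three months
```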

Deployment

  1. Creating Clusters
  2. Inference Clusters
  3. Compute Instance: the modelling process and how these pieces interact when used together.

Operationalizing Models

Automation via end-to-end pipelines that train models on compute targets. Azure Machine Learning training clusters are used for multi-node training.

Training Clusters :

Multi-node clusters: used for training and for batch inferencing. They can autoscale when jobs are submitted, and support CPU and GPU. Such resources are required for batch scoring.

They are used to run machine learning Python code.

Methods:

1. Real-time inferencing / inference clusters: The model training process may be compute-intensive, with training times that can span many hours, days or even weeks. The trained model is then used to make decisions on new data, scoring it based on what was learned during training. Making decisions on new data on demand is known as real-time inferencing.

2. Batch Inferencing: Making predictions on large amounts of existing data, usually as a recurring job over the datastore, with the data to be used assigned at the start. Its advantages are high throughput and scalable compute targets.

3. Compute Targets: Azure Kubernetes Service, Azure ML compute cluster.

The trained model is packaged in a container:

Model deployment to AKS

🙄🤔…

👉We know that Model = Algorithm + Data + Hyperparameters

Custom Vision Architecture

Deploying a trained model

Deploying a credit risk model

👉Get the model file.

👉Create the scoring script

👉Optionally create a schema file describing the web service

👉Create a real-time scoring web service

👉Call the web service from your application

👉Repeat the process from the training module.
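The steps above can be scripted with the azureml SDK. The sketch below deploys the registered model to Azure Container Instances; for production-scale serving, an AksWebservice configuration is used instead. The model name, score.py and env.yml are assumptions carried over from the steps above, not files shipped with this article:

```python
from azureml.core import Environment, Model, Workspace
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = Model(ws, name='credit-risk-model')            # the registered model

# score.py is the scoring script with init() and run(raw_data); env.yml lists packages
env = Environment.from_conda_specification(name='deploy-env', file_path='env.yml')
inference_config = InferenceConfig(entry_script='score.py', environment=env)

deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, 'credit-risk-service', [model],
                       inference_config, deployment_config)
service.wait_for_deployment(show_output=True)

print(service.scoring_uri)    # call this real-time scoring endpoint from your app
```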

Advantages of deploying on Azure

  1. Fast Response Time
  2. Autoscaling
  3. FPGAs
  4. Azure Python SDK
  5. Azure ML environment.

🙋‍♀️⁉QUIZ

I swear that’s the end part😅….

Responsible AI

Modern AI: It tells us about the application

Some examples and use-cases

Increasing Inequality: features of a medical dataset biased against the poor.

Web optimization: Physical attacks by email.

Unwanted Bias: e.g. women being considered the weaker gender in comparison to men.

Adversarial Attack: e.g. altered road signs that fool self-driving cars.

Killer Drone: Harmful to humans.

Deep fake: Harmful to data.

Data Poisoning: Manipulating the training data.

Approaches

Direct Explainers: Use a specific explainer for a specific model explanation (a minimal SHAP sketch appears at the end of this section).

→ SHAP Tree Explainer: explains tree-based models.

→ SHAP Deep Explainer: explains deep learning models.

Meta Explainers: Tabular Explainer, Image Explainer, Text Explainer. They are used to create local and global explanations and visualizations of predictions.

Fault Approach: a state-of-the-art approach created by Microsoft; the analyses can be viewed through a dashboard.
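To show the direct-explainer idea outside Azure, here is a minimal SHAP sketch with the tree explainer on a scikit-learn model (the dataset and model are stand-ins):

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Direct explainer for tree-based models
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global view: which features matter most across the whole dataset
shap.summary_plot(shap_values, X)
```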

The end :

Feel free to shoot any queries:

https://www.linkedin.com/in/preeti-sharma-155a85181/
