Integrate Azure ML Studio with PowerApps and Power Automate: Part 1

Sahil Srivastava
Published in Analytics Vidhya
5 min read, Jul 9, 2020

This is a two-part series. In the first part, I will walk you through the process of creating a predictive model with Azure Machine Learning Studio. In the next part, I will integrate the Machine Learning web service with PowerApps using Power Automate. Together, the two parts cover an end-to-end machine learning model deployment using Azure ML Studio, PowerApps and Power Automate.

Introduction to Azure Machine Learning Studio

Microsoft Azure Machine Learning Studio (classic) is a collaborative, drag-and-drop tool you can use to build, test, and deploy predictive analytics solutions on your data. Azure Machine Learning Studio (classic) publishes models as web services that can easily be consumed by custom apps or BI tools such as Excel. Machine Learning Studio (classic) is where data science, predictive analytics, cloud resources, and your data meet.

Now without any delay let’s get started with our modelling.

At a high level, there are five steps to creating any machine learning model:

  1. Data Gathering
  2. Data Cleaning and Preprocessing
  3. Training Data Model
  4. Scoring Data Model
  5. Evaluating Data Model

Setting up Environment

  1. Sign in to your Microsoft account and go to https://studio.azureml.net.
  2. If you are a new user, sign up using the Free Workspace option.
  3. Create a New Experiment and you are all set to start creating your model.

As an example, I will take a sample dataset named “Adult Census Income Binary Classification dataset” and predict whether a given adult’s income is above or below $50K per year.

NOTE: In this post, every component in Studio is referred to as a method (Studio itself calls them modules), and each one is added by drag and drop. After adding a method, you need to connect it to the previous method and run it.

Data Gathering

  1. Select Saved Datasets from the navigation menu.
  2. Drag and drop “Adult Census Income Binary Classification dataset” from Samples onto the canvas.
  3. Right-click the dataset and select Visualize to inspect all of its columns and rows.

Head of Adult Census Income dataset

Data Cleaning

  1. First, we need to check for missing values and, if there are any, replace them with 0.

2. Add the Summarize Data method, connect it to the dataset and, after running it, visualize the output. It shows a statistical summary of the dataset.

3. As we can see, there are many missing values, so add the Clean Missing Data method, run it and visualize the result.

4. Now drag in Select Columns in Dataset to keep only the features that we need as training data.

5. Many of the categorical columns are stored as string features and need to be converted to the categorical type. Add the Edit Metadata method, select all the columns that need conversion and run it.

6. Now, if we visualize the dataset, and specifically the target feature income, we can see a major class imbalance problem.

Class imbalance means that the output labels are not evenly distributed. It can bias a machine learning algorithm towards the class with the most examples (in our case ≤50K), which makes the evaluation of the model misleading. We can reduce this problem by oversampling the minority class. We will use the SMOTE (Synthetic Minority Oversampling Technique) module to increase the number of underrepresented cases in the dataset. SMOTE is a better way of increasing the number of rare cases than simply duplicating existing cases, because it synthesizes new cases from the existing minority ones.

7. Select the SMOTE method and run it to oversample the minority class of the target.
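
To make the idea behind SMOTE a little more concrete, here is a minimal pure-Python sketch of its core step: picking a minority sample, picking one of its nearest minority neighbours, and creating a new point somewhere on the line between them. This is only an illustration and not the code that the Studio SMOTE module runs. The function name `smote`, the parameters `n_new` and `k`, and the toy points are my own assumptions.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from numeric minority-class samples.

    Each synthetic point is a random interpolation between one minority
    sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows with two numeric features each.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6)
```

Because every new point lies between two real minority samples, the synthetic cases stay inside the region the minority class already occupies, which is why this tends to work better than plain duplication.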

Training Data Model

  1. Before training, we need to split the data into a training set and a test set. Add the Split Data module, connect it to SMOTE and run it. The test size and the random seed can be set in its properties.
Split data method

2. Now we will train a model using a classification algorithm called Two-Class Boosted Decision Tree. In its properties, set Create trainer mode to Parameter Range, which exposes all the hyperparameters that need to be tuned.

3. Select Tune Model Hyperparameters from the navigation menu and connect one of its inputs to Two-Class Boosted Decision Tree and another to Split Data.

4. Now change its properties as shown below.

Properties of Tune Model Hyperparameters
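
For readers who want to see what the Split Data step in this section amounts to, here is a rough sketch in plain Python. It is not Studio's implementation. The function name `split_rows`, the 30% test fraction and the seed value 42 are assumptions I picked for the example.

```python
import random


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows with a fixed seed, then cut it into train and test."""
    rng = random.Random(seed)
    shuffled = list(rows)  # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for ten dataset rows
train, test = split_rows(rows, test_fraction=0.3, seed=42)
```

Fixing the seed is what makes the experiment repeatable: running the split again with the same seed returns exactly the same training and test rows, so two models can be compared on identical data.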

Scoring Data Model

Select Score Model from the navigation menu and connect one of its inputs to Tune Model Hyperparameters and the other to the test set output of Split Data.

Evaluating Data Model

Now select Evaluate Model, connect it to Score Model and run it.

Adult Census Income Predictive Model

Let’s Visualize our results

I will compare two models, one trained with SMOTE oversampling of the target and one without it, on the basis of the ROC curve and accuracy. These are classification metrics that indicate how well a model performs.

Accuracy is the fraction of predictions our model got right, that is, the ratio of the number of correct predictions to the total number of predictions. Accuracy is a reasonable metric for a balanced dataset, but with a class-imbalanced dataset, where there is a significant disparity between the number of positive and negative labels, it alone doesn’t tell the full story.
An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds. It plots the true positive rate against the false positive rate.
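
The two metrics are easy to compute by hand, and a small sketch shows why accuracy can mislead on imbalanced data. The labels and scores below are made-up toy values, not results from the Adult Census Income experiment, and the helper names `accuracy`, `roc_points` and `auc` are my own.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Toy imbalanced labels: eight negatives (0) and two positives (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A "model" that always predicts the majority class still scores 80% accuracy.
acc_baseline = accuracy(y_true, [0] * 10)

# Hypothetical predicted probabilities for the positive class.
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.35, 0.6, 0.7, 0.55]
roc = roc_points(y_true, scores)
area = auc(roc)
```

The always-negative baseline never finds a single positive case, yet its accuracy looks respectable. The ROC curve and the area under it look at how well the scores rank positives above negatives across all thresholds, so they are less easily fooled by the class ratio.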

Model A: Model with SMOTE oversampling (Blue curve)
Model B: Model without SMOTE oversampling (Red curve)
  • The accuracy of Model A is higher than that of Model B at a threshold of 0.5.
  • The ROC curve of Model A lies closer to the top-left corner than that of Model B.

Thus we can conclude that Model A, the model trained with SMOTE oversampling, is the more accurate model.

Conclusion

In this blog we covered how to create a predictive model with Azure ML Studio. In the next part of the series, we will create a web service of our model and integrate it with Power Apps and Power Automate.

Thanks for Reading!!!
