Azure DP 100 Prep: Hands-on with PyTorch and Azure ML SDK v2

Luis Monge
19 min read · Oct 13, 2023


Introduction

Within the intricate landscape of data science, certifications such as Azure DP 100 serve as more than mere distinctions. They act as structured pathways to up our game, particularly drawing attention to the extensive machine learning capabilities that Azure offers.

Now, I love a good tutorial as much as the next person, but nothing beats getting your hands dirty with actual work. That’s why for a while I have coupled learning new topics with a hands-on approach: you only know how much you’ve really learned once you put your knowledge into practice. To prepare for this certification I decided to take a simple image classification training script developed locally and make it work with the latest SDK, Azure Machine Learning Python SDK v2. If this were a real use case it would need much more work; however, for the purposes of the certification it served as a useful example.

Surprisingly, comprehensive guides on this new SDK seem scarce. Recognizing this gap, I’ve penned this article. Whether you’re a newcomer to Azure or transitioning to the new SDK, my aim is to provide clarity on the certification and offer practical code insights for your Azure Machine Learning ventures. It will not cover everything but I sure hope it will complement your preparation.

Topics covered

  • Tools and Setup
  • Creating and using Workspaces
  • Creating an Azure Data Lake Gen2 storage for your project data
  • Creating a datastore and data asset
  • Choosing and creating compute
  • Handling permissions between resources
  • Creating a Custom environment
  • Executing jobs through commands

The code for this project can be found in my GitHub repository. I’m also quite open to suggestions on how to improve, or to chat about these and other topics, so feel free to connect with me on my LinkedIn profile.

Note: this article will not focus on the actual training script but on putting the pieces needed together to make your PyTorch project work with Azure.

Tools and Setup

For my projects I always use VSCode as my IDE. It is highly customizable and includes many useful extensions for interacting with Azure. I’d advise installing at least the following extensions:

  • Azure Account — sign in and subscription management from IDE.
  • Azure Machine Learning — interact with the workspace and its resources.
  • Azure Machine Learning — Remote (used by the previous extension).
  • Azure Resources — viewing and managing resources.
  • Azure Storage — allows you to browse data in your storage accounts.

In order to work with Azure you will need to sign up. This will create a Tenant, which represents an organization, the highest level that contains everything else you will do on Azure. A Tenant can contain multiple Subscriptions (think of them for now as different organizations within a company), and within those subscriptions you will have sets of Azure resources called resource groups that can be used for specific use cases or applications. In this case, we will create a resource group to hold all the resources we will need to train this Image Classifier.
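If you prefer to script this step rather than click through the Portal, a minimal sketch using the azure-mgmt-resource package (a separate pip install; the resource group name and location below are hypothetical) could look like this:

import os
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

credential = DefaultAzureCredential()
resource_client = ResourceManagementClient(credential, os.environ["SUBSCRIPTION_ID"])

# "rg-fruit-classifier" is a hypothetical name; pick whatever fits your project
rg = resource_client.resource_groups.create_or_update(
    "rg-fruit-classifier", {"location": "westeurope"}
)
print(rg.name, rg.location)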

Important: remember to sign in to your Azure account from VSCode as stated in the extension page.

Workspaces

To work with Azure Machine Learning you’ll need a workspace. TLDR, it is the main place to collaborate and build your project using the resources within, such as compute, data assets, jobs, and services such as the no/low code Designer and AutoML. If you want to dive deeper here’s a link to the workspace documentation.

When we join Azure, a Machine Learning workspace is not created by default. You can create one by typing Machine Learning in the portal’s search bar (much easier), through the CLI (perfect for automation), or with the SDK (if you have a setup script, for example).

This image from the Create an Azure Machine Learning workspace lesson module explains how Azure organizes from tenants down to the individual resources within a workspace:

Tenant, subscription, resource group and workspace visual representation.

Having this in mind will be helpful when managing permissions between resources or even resource groups!

Creating an AzureML workspace

The most straightforward way to create a workspace is through the Portal, in which you can search for the Machine Learning resource and provision one right away by providing the relevant information. For this I recommend following the steps in the mslearn practice module, in the Provision an Azure Machine Learning workspace section.

Remember that if you want to do this in a more automated way you can use AzureML CLI and AzureML Python SDK v2.

from azure.ai.ml.entities import Workspace

workspace_name = "mlw-example"

ws_basic = Workspace(
    name=workspace_name,
    location="eastus",
    display_name="Basic workspace-example",
    description="This example shows how to create a basic workspace",
)
# ml_client must already be instantiated (see the next section for the get_workspace helper)
ml_client.workspaces.begin_create(ws_basic)

Connecting to a workspace

You can connect to your workspace from a script or Jupyter notebook using the following function:

import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

def get_workspace(verbose=False):

    load_dotenv()

    subscription_id = os.environ['SUBSCRIPTION_ID']
    resource_group = os.environ['RESOURCE_GROUP']
    workspace_name = os.environ['WORKSPACE_NAME']
    credential = DefaultAzureCredential()

    if verbose:
        print(f"Resource Group: {resource_group} | Subscription: {subscription_id} | {workspace_name}")

    ml_client = MLClient(
        credential=credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )

    return ml_client

ml_client = get_workspace(verbose=True)

The relevant information is stored in a .env file that is loaded when calling load_dotenv(). This is useful to avoid sharing private information when pushing to a GitHub repo. Note that .env should be added to the .gitignore.
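For reference, a .env file matching the function above could look like this (the placeholder values are yours to fill in):

# .env file: keep it out of version control (add it to .gitignore)
SUBSCRIPTION_ID=<your-subscription-id>
RESOURCE_GROUP=<your-resource-group>
WORKSPACE_NAME=<your-workspace-name>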

From ml_client you can access pretty much anything: data, compute, environments, endpoints, models, etc. The full list can be found here in the attributes section.

Creating an Azure Data Lake Gen2 storage for your project data

What is the real advantage of choosing Azure Data Lake Gen2 instead of the out-of-the-box storage, you may ask?

First things first: in order to understand the advantages of using the second generation of this type of storage, let’s first review the definition of a Data Lake:

A data lake is a single, centralized repository where you can store all your data, both structured and unstructured. A data lake enables your organization to quickly and more easily store, access, and analyze a wide variety of data in a single location. With a data lake, you don’t need to conform your data to fit an existing structure. Instead, you can store your data in its raw or native format, usually as files or as binary large objects (blobs).

While Azure Data Lake Gen1 was a resource on its own, Azure Data Lake Gen2 is a set of capabilities you can use alongside the Blob Storage of your Azure Storage account. It has many advantages (especially when handling larger amounts of data); however, I chose it for two reasons: the hierarchical directory structure, and the fact that it is widely used in enterprises, so the experience will transfer directly to real company projects.

The main capabilities include:

  • Hadoop-compatible access
  • Hierarchical directory structure
  • Optimized cost and performance
  • Fine-grained security model
  • Massive scalability

If you want to go deeper into Azure Data Lake Gen2 I encourage you to go through the Introduction to Azure Data Lake Storage Gen2 mslearn module.

The easiest way to create an Azure Data Lake Gen2 is by going to the Portal and searching for Storage Account. Once you click on “Create”, a menu will show for you to start filling in the details. In the “Advanced” tab you can enable the hierarchical namespace by toggling the checkbox. My Advanced section looked like this at the time of creation, but yours can be different depending on your needs.

Azure Data Lake Gen2 Advanced section in provisioning window.

Once you have provisioned the storage account, you’ll be able to see it among the resources belonging to your resource-group. When you access the storage account, you’ll be able to create containers (similar to directories) and add your files by upload. Given that my use case involved only a few hundred images divided into train/test directories, I chose to upload them manually through the Portal, however, if you have more complicated ingestion flows you should consider using Azure Data Factory.
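If you would rather script the upload than click through the Portal (and Data Factory feels like overkill), a rough sketch using the azure-storage-file-datalake package could look like the following; the account and container names are placeholders, and the identity you use needs data access to the storage account:

import os
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# placeholders: replace with your storage account and container names
service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client(file_system="<container-name>")

local_root = "fruit_classification_datasets"
for dirpath, _, filenames in os.walk(local_root):
    for filename in filenames:
        local_path = os.path.join(dirpath, filename)
        remote_path = os.path.relpath(local_path).replace(os.sep, "/")
        with open(local_path, "rb") as f:
            # creates (or overwrites) the file at the same relative path in the container
            fs.get_file_client(remote_path).upload_data(f, overwrite=True)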

Creating a datastore and data asset

What is this datastore which seems to be so important? The documentation states the following:

“Represents a storage abstraction over an Azure Machine Learning storage account. Datastores are attached to workspaces and are used to store connection information to Azure storage services so you can refer to them by name and don’t need to remember the connection information and secret used to connect to the storage services.”

So after creating the storage account, we need to create a datastore so that Azure ML knows where to find the connection information it needs to access that storage from the workspace.

Supported Azure storage services:

  • Azure Blob Container
  • Azure File Share
  • Azure Data Lake
  • Azure Data Lake Gen2
  • Azure SQL Database
  • Azure Database for PostgreSQL
  • Databricks File System
  • Azure Database for MySQL

When you head over to the Data section of the workspace and into Datastores, you will notice four built-in datastores (two Azure Storage blob containers and two Azure Storage file shares), which are used as system storage by Azure Machine Learning. The Azure Data Lake Gen2 datastore you are about to register will appear alongside them:

  1. workspaceworkingdirectory
  2. workspacefilestore
  3. workspaceartifactstore
  4. workspaceblobstore

Click on Create and fill in the required information (an SDK v2 alternative is sketched right after this list):

  • Datastore name: a name for the new datastore (e.g., the name of the Data Lake Gen2 you just created).
  • Datastore type: Azure Data Lake Gen2.
  • Check “From Azure subscription” and select the storage account and blob container you just created.
  • Authentication type: select Account Key. Head over to the storage account resource you just created, then to Access keys, copy key1, and paste it into the datastore form.
  • Check the option to “Use the workspace managed identity for data preview and profiling” (we’ll cover this in the Handling permissions between resources section).
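As promised, here is roughly what the same registration could look like with the SDK v2, assuming account-key authentication; the names and key below are placeholders:

from azure.ai.ml.entities import AzureDataLakeGen2Datastore, AccountKeyConfiguration

# placeholders: use your own datastore name, storage account, container and key
store = AzureDataLakeGen2Datastore(
    name="fruit_datalake",
    description="ADLS Gen2 datastore holding the fruit images",
    account_name="<storage-account-name>",
    filesystem="<container-name>",
    credentials=AccountKeyConfiguration(account_key="<key1>"),
)

ml_client.create_or_update(store)  # ml_client from the get_workspace() helper above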

Data Assets

Data Assets are powerful when executing machine learning tasks in jobs. In a job, you can run a Python script that takes inputs and generates outputs; a data asset can be passed as either an input or an output of an Azure Machine Learning job.

In Azure Machine Learning, data assets are references to where the data is stored, how to get access, and any other relevant metadata. You can create data assets to get access to data in datastores, Azure storage services, public URLs, or data stored on your local device.

Benefits

  • You can share and reuse data with other members of the team such that they don’t need to remember file locations.
  • You can seamlessly access data during model training (on any supported compute type) without worrying about connection strings or data paths.
  • You can version the metadata of the data asset.

Main types of data assets:

  • URI file: Points to a specific file.
  • URI folder: Points to a folder.
  • MLTable: Points to a folder or file, and includes a schema to read as tabular data. Useful for data with a frequently changing schema.

Creating a Data Asset for your image data

Head over to the “Data assets” section and click on “Create”, fill in the name, and choose UriFolder (since we point to a directory instead of a single file). In the next step, since you have already uploaded your image directories, choose “From Azure storage” and then select the storage account you created earlier. After this you’re down to the last step: choosing the directory path to use. In my case I pointed to the fruit_classification_datasets directory, since my file structure looks like this:

.
└── fruit_classification_datasets/
    ├── train/
    │   ├── fruit1/
    │   │   ├── fruit1_train_image1.jpg
    │   │   ├── fruit1_train_image2.jpg
    │   │   └── fruit1_train_image3.jpg
    │   └── fruit2/
    │       ├── fruit2_train_image1.jpg
    │       ├── fruit2_train_image2.jpg
    │       └── fruit2_train_image3.jpg
    └── test/
        ├── fruit1/
        │   ├── fruit1_test_image1.jpg
        │   ├── fruit1_test_image2.jpg
        │   └── fruit1_test_image3.jpg
        └── fruit2/
            ├── fruit2_test_image1.jpg
            ├── fruit2_test_image2.jpg
            └── fruit2_test_image3.jpg

After clicking on “Create” the Asset will be provisioned after a small wait and you’ll be able to see it from the Data asset section:

Data Asset for image classification from the Data Assets section in AzureML studio.
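If you’d rather register the data asset from code instead of the Studio, a sketch with the SDK v2 Data entity might look like this; the asset name is hypothetical, and the datastore placeholder should match the one you registered earlier:

from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

fruit_images = Data(
    name="fruit_classification_images",  # hypothetical asset name
    description="Train/test image folders for the fruit classifier",
    type=AssetTypes.URI_FOLDER,  # a folder, not a single file
    path="azureml://datastores/<datastore_name>/paths/fruit_classification_datasets",
)

ml_client.data.create_or_update(fruit_images)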

URIs

To find and access data in Azure Machine Learning, you’ll use Uniform Resource Identifiers (URIs).

A URI references the location of your data. For Azure Machine Learning to connect to your data, you need to prefix the URI with the appropriate protocol. There are three common protocols when working with data in the context of Azure Machine Learning:

  • http(s): Use for data stored publicly or privately in an Azure Blob Storage or a publicly available http(s) location.
  • abfs(s): Use for data stored in an Azure Data Lake Storage Gen2.
  • azureml: Use for data stored in a datastore.

When you want to access the data from the Azure Machine Learning workspace, you can use the path to the folder or file directly. When you want to connect to the folder or file directly, you can use the http(s) protocol. If the container is set to private, you'll need to provide some kind of authentication to get access to the data.
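To make the three protocols concrete, here are hypothetical examples of what each URI could look like; the account, container and datastore names are placeholders:

https://<account>.blob.core.windows.net/<container>/fruit_classification_datasets/train/fruit1/image1.jpg
abfss://<container>@<account>.dfs.core.windows.net/fruit_classification_datasets
azureml://datastores/<datastore_name>/paths/fruit_classification_datasets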

Choosing and creating compute

Azure Machine Learning offers different computes to serve our project needs and the phase we are in.

Main training compute targets

  • Local Compute: useful for testing and for low amounts of data.
  • Azure Compute Cluster: scalable on demand, useful for executing jobs with different amounts of data (you can choose the one that best fits)
  • Azure Serverless Compute: similar to Compute Clusters but created, scaled, and managed by Azure Machine Learning upon running a job.
  • Azure Compute Instance: useful for developing on notebooks, testing on low amounts of data.
  • Attached Compute: if you already have a separate compute, such as an Azure Databricks cluster, you may attach it to run jobs.

Different training targets and their compatibility with Azure Machine Learning services.

Main inference compute targets

Once our model is ready for testing or production we will employ a different compute from the one used for training. Azure Machine Learning creates a Docker container that hosts the model, prepares it to receive requests, and returns the inference results.

The compute target you use to host your model will affect the cost and availability of your deployed endpoint. Use this table to choose an appropriate compute target.

  • Local web service
  • Azure Machine Learning Endpoints
  • Azure Kubernetes Service
  • Azure Container Instances

Inference compute targets and their respective uses.

For my specific use case I could have used a Compute Instance if I had preferred to develop using notebooks. However, since I was developing locally and then running a job once in a while to test my code, I chose a basic Compute Cluster, which shuts down after 15 minutes of idleness. This lets me pay only for what I consume and avoid the terrible nightmare (which has happened to me) of leaving a Compute Instance running for hours by accident!

To create a compute target, in this case a compute cluster, as usual the easiest way is through the Studio in which you’ll have to go to the Compute section, click create, and choose the most appropriate one based on the characteristics described above. You can assign a managed identity later in order to set up authentication.

Compute cluster settings after creation.

If you decide to do this with the SDK v2, you can use the following code:

# remember to first instantiate your ml_client
from azure.ai.ml.entities import AmlCompute

cluster_basic = AmlCompute(
    name="compute-cluster-test",
    type="amlcompute",
    size="STANDARD_DS3_v2",  # for larger datasets consider a SKU with GPUs
    location="westus",
    min_instances=0,
    max_instances=2,
    idle_time_before_scale_down=120,
)
ml_client.begin_create_or_update(cluster_basic).result()

When you execute jobs you’ll be able to input the compute cluster name as one of the arguments. So we’ll cover that later!

Handling permissions between resources

This is a very relevant topic that unfortunately does not get addressed enough in the learning modules but is essential when working in an organization. Understanding how resources handle identities, permissions and roles will save you a lot of headaches. So let’s start.

Types of authentication to Azure Machine Learning

  • Interactive: Interactive authentication is used during experimentation and iterative development. Interactive authentication enables you to control access to resources (such as a web service) on a per-user basis.
  • Service principal: You create a service principal account in Azure Active Directory, and use it to authenticate or get a token. A service principal is used when you need an automated process to authenticate to the service without requiring user interaction. For example, a continuous integration and deployment script that trains and tests a model every time the training code changes.
  • Azure CLI session: The Azure CLI extension for Machine Learning (the ml extension or CLI v2) is a command line tool for working with Azure Machine Learning. You can sign in to Azure via the Azure CLI on your local workstation, without storing credentials in Python code or prompting the user to authenticate. Similarly, you can reuse the same scripts as part of continuous integration and deployment pipelines, while authenticating the Azure CLI with a service principal identity.
  • Managed identity: When using the Azure Machine Learning SDK v2 on a compute instance, compute cluster or an Azure Virtual Machine, you can use a managed identity for Azure. This workflow allows the VM or compute to connect to the workspace using the managed identity, without storing credentials in Python code or prompting the user to authenticate. Azure Machine Learning compute clusters can also be configured to use a managed identity to access the workspace when training models; the identity can be assigned when the compute cluster is created or attached to it afterwards from the Studio, which is what we’ll do below (a small credential sketch follows this list).
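As a small illustration of that last option, code running on a compute with a user-assigned managed identity can authenticate roughly like this; the client ID is a placeholder, and with a system-assigned identity you can omit it:

from azure.identity import ManagedIdentityCredential

# placeholder: client ID of the user-assigned managed identity attached to the compute
credential = ManagedIdentityCredential(client_id="<managed-identity-client-id>")

# sanity check: request a token for the Azure management scope
credential.get_token("https://management.azure.com/.default")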

Azure Active Directory

If you want users to authenticate using individual accounts, they must have accounts in your Azure AD. If you want to use service principals, they must exist in your Azure AD. Managed identities are also a feature of Azure AD.

Service Principal

To use a service principal (SP), you must first create the SP (1). Then grant it access to your workspace (2).

Important: When using a service principal, grant it the minimum access required for the task it is used for. For example, you would not grant a service principal owner or contributor access if all it is used for is reading the access token for a web deployment.

The reason for granting the least access is that a service principal uses a password to authenticate, and the password may be stored as part of an automation script. If the password is leaked, having only the minimum access required for a specific task minimizes malicious use of the SP.

The important information to keep after creating a Service Principal is:

  1. AZURE_CLIENT_ID
  2. AZURE_TENANT_ID
  3. AZURE_CLIENT_SECRET

How do you store Service Principal information?

import os
from dotenv import load_dotenv

# .get avoids a KeyError when ENVIRONMENT is not set (e.g. in production)
if os.environ.get('ENVIRONMENT') == 'development':
    print("Loading environment variables from .env file")
    load_dotenv(".env")

from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
# Check if given credential can get token successfully.
credential.get_token("https://management.azure.com/.default")

Using DefaultAzureCredential to create the credential object, then using MLClient to connect to the workspace:

from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

credential = DefaultAzureCredential()
# Check if given credential can get token successfully.
credential.get_token("https://management.azure.com/.default")

try:
    ml_client = MLClient.from_config(credential=credential)
except Exception as ex:
    # NOTE: Update following workspace information to contain
    # your subscription ID, resource group name, and workspace name
    client_config = {
        "subscription_id": "<SUBSCRIPTION_ID>",
        "resource_group": "<RESOURCE_GROUP>",
        "workspace_name": "<AZUREML_WORKSPACE_NAME>",
    }

    # write and reload from config file
    import json, os

    config_path = "../.azureml/config.json"
    os.makedirs(os.path.dirname(config_path), exist_ok=True)
    with open(config_path, "w") as fo:
        fo.write(json.dumps(client_config))
    ml_client = MLClient.from_config(credential=credential, path=config_path)

print(ml_client)

Creating a Managed Identity

Head over to the Portal and search for Managed Identities. Click on “Create” and select your subscription and resource group where your workspace and compute are. Select the closest region to where you are or where your service will be called. Finally, pick a unique name and hit “Create”. It should look like this:

Managed identity creation from Azure Portal.

Once the managed identity creation finishes, you can head over to your Azure Machine Learning Studio, to the Compute section and select the compute cluster you created in the previous section. In my case it was called “cpu-cluster”. Now click on the edit icon on the Managed Identity section of your compute cluster and select the newly created managed identity.

This is just one way to authenticate your cluster, but I found that it fixed the issue where DefaultAzureCredential was not able to retrieve proper credentials for my job.
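If you prefer to do the same assignment from code rather than the Studio, a sketch (assuming the identity classes exposed by azure.ai.ml.entities; the resource ID is a placeholder copied from the managed identity's overview page) could look like this:

from azure.ai.ml.entities import AmlCompute, IdentityConfiguration, ManagedIdentityConfiguration

# placeholder: full resource ID of the user-assigned managed identity
identity = IdentityConfiguration(
    type="user_assigned",
    user_assigned_identities=[
        ManagedIdentityConfiguration(
            resource_id="/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>"
        )
    ],
)

cluster = AmlCompute(
    name="cpu-cluster",
    size="STANDARD_DS3_v2",
    min_instances=0,
    max_instances=2,
    identity=identity,
)
ml_client.begin_create_or_update(cluster).result()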

Creating a Custom Environment

When working on any software project, creating an environment with all the dependencies and versions of the packages used is a big favor you do yourself and others in the future with regard to reliability and reproducibility. It is the classic situation where you finish a project, come back to it a year later and realize nothing works. Why? Because you did not use an environment.

The most popular ways to create them are with conda or venv. Azure ML, however, offers two options: curated and custom environments. If your project fits one of the curated environments, don’t complicate matters by creating a custom one; most often, though, you’ll need a custom environment.

The easiest way, in my opinion, is to create:

  1. Docker build context
  2. Dockerfile
  3. requirements.txt

Let’s unpack this a little bit.

What is a Docker build context?

To simplify matters we will say the Docker build context is a directory that contains certain files relevant during the build. Azure ML asks us for a Dockerfile containing the instructions for the build and the requirements.txt with the necessary packages inside. So it will look something like this:

.
└── build-context/
    ├── Dockerfile
    └── requirements_azureml.txt

An example can be found in the repository included at the beginning of this article.
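Just to give an idea of what goes in there, a minimal Dockerfile for this kind of setup could look like the sketch below, assuming one of Microsoft's AzureML base images; the real files live in the repository linked above:

# minimal sketch: start from an AzureML-provided base image and install the project requirements
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest

COPY requirements_azureml.txt .
RUN pip install -r requirements_azureml.txt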

The Azure ML SDK v2 offers the Environment class which accepts the following arguments:

  • build: we’ll point to a BuildContext and provide the path;
  • name: what we want the environment to be called;
  • description: useful in case someone else on the team wants to use this environment.

Here is an example of the code to create the environment:

from azure.ai.ml import MLClient
from azure.ai.ml.entities import Environment, BuildContext
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv  # useful for not hard-coding our info
import os

load_dotenv()  # loads the variables in a .env file

def get_workspace(verbose=False):
    load_dotenv()

    subscription_id = os.environ['SUBSCRIPTION_ID']
    resource_group = os.environ['RESOURCE_GROUP']
    workspace_name = os.environ['WORKSPACE_NAME']
    credential = DefaultAzureCredential()

    if verbose:
        print(f"Resource Group: {resource_group} | Subscription: {subscription_id} | {workspace_name}")

    ml_client = MLClient(
        credential=credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )

    return ml_client

ml_client = get_workspace()

print("Begin creation of Azure ML environment...")
env_docker_context = Environment(
    build=BuildContext(path="docker-context"),
    name="fruit_env",
    description="Environment created from a Docker context.",
)

ml_client.environments.create_or_update(env_docker_context)

print("Done creating environment!")

Keep an eye out for common pitfalls during this process; wrapping calls like these in try/except blocks (as in the MLClient.from_config example earlier) helps handle errors gracefully and is a good starting point for more comprehensive error-handling strategies in Azure.

Now you can go to your Azure ML Studio, and in the Environments section, under Custom environments, you’ll see the newly created environment.

Custom environment as seen in the Environments section in the Azure Machine Learning workspace.

Launching a training job through Command

Now we have all the ingredients necessary to launch a job to train the pretrained ResNet18 model on our training and test sets. Just to recap: we created a storage account to upload our training data and a data asset to easily access it during a training job. We also created a compute cluster and assigned it a managed identity. And to run our training code, we created an Environment that pins the Python and library versions using a Docker context.

It is common practice to create a Python script (in my case I named it run_job.py) that instantiates the MLClient and gathers everything needed to run our job. After this, we have all that’s needed to run a command. As you will see, it does not take many lines of code:

import argparse
import random
import os
from azure.ai.ml import command, Input
from azure.ai.ml.constants import AssetTypes, InputOutputModes
from utils.azure_utils import get_workspace

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--experiment_name", default="fruit_classification_training")
    parser.add_argument("--n_epochs", type=int, default=3)
    args = parser.parse_args()

    ml_client = get_workspace()

    print(ml_client._credential)

    dataset_path = os.environ[
        "STORAGE_DATASET_PATH"  # path structure: azureml://datastores/<datastore_name>/paths/<folder>
    ]
    data_type = (
        AssetTypes.URI_FOLDER  # specifies that the data asset is a directory instead of a single file
    )
    mode = InputOutputModes.RO_MOUNT  # only read access is necessary

    inputs = {
        "input_data": Input(type=data_type, path=dataset_path, mode=mode),
        "n_epochs": args.n_epochs,
    }

    command_job = command(
        code="./",
        command="python train.py --data_dir ${{inputs.input_data}} --n_epochs ${{inputs.n_epochs}}",
        inputs=inputs,
        environment="fruit_env@latest",
        compute="cpu-cluster",
        name=f"{args.experiment_name}_{args.n_epochs}epochs_{random.randint(4000, 50000)}",
    )

    ml_client.jobs.create_or_update(command_job)

Let’s break it down piece by piece:

import argparse
import random
import os
from azure.ai.ml import command, Input
from azure.ai.ml.constants import AssetTypes, InputOutputModes
from utils.azure_utils import get_workspace

Here the important part is that in the Python SDK v2 we access most modules through azure.ai.ml. For example, in from azure.ai.ml.constants import AssetTypes, InputOutputModes we import specific constants (AssetTypes and InputOutputModes) from the constants submodule of the azure.ai.ml package.

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--experiment_name", default="fruit_classification_training")
    parser.add_argument("--n_epochs", type=int, default=3)
    args = parser.parse_args()

    ml_client = get_workspace()

    print(ml_client._credential)

    dataset_path = os.environ[
        "STORAGE_DATASET_PATH"  # path structure: azureml://datastores/<datastore_name>/paths/<folder>
    ]
    data_type = (
        AssetTypes.URI_FOLDER  # specifies that the data asset is a directory instead of a single file
    )
    mode = InputOutputModes.RO_MOUNT  # only read access is necessary

    inputs = {
        "input_data": Input(type=data_type, path=dataset_path, mode=mode),
        "n_epochs": args.n_epochs,
    }

There could be more arguments but this is a toy example so I decided to keep it simple. We then call the get_workspace() function to instantiate our MLClient. Remember, we use the MLClient to access all kinds of resources and assets but also to send jobs. It should always be instantiated first.

We then define the dataset path, which is saved as an environment variable. The comment includes how you should structure the path to your data. In the case of image classification, we have the data stored in several “folders”, so we set AssetType to URI_FOLDER. We will access the data as read-only, therefore we set InputOutputModes to RO_MOUNT.

The inputs dictionary is required by the command function, which gathers all the information needed for the job. Inside it, we define the arguments to pass to the command so they are easily accessible. In my case I needed the data directory and the number of epochs, but you can add as many as you like to the inputs dictionary.

command_job = command(
    code="./",
    command="python train.py --data_dir ${{inputs.input_data}} --n_epochs ${{inputs.n_epochs}}",
    inputs=inputs,
    environment="fruit_env@latest",
    compute="cpu-cluster",
    name=f"{args.experiment_name}_{args.n_epochs}epochs_{random.randint(4000, 50000)}",
)

ml_client.jobs.create_or_update(command_job)

  • code: path to the folder containing the script you want to run. In my case train.py is in the same folder as run_job.py.
  • command: how you would run the script from the command line; arguments are passed using ${{inputs.<your_input>}}.
  • inputs: the dictionary we created previously, with the input names as keys and the inputs as values.
  • environment: the environment we created in the workspace using the Docker context. If you used a curated environment you can just copy its name.
  • compute: the name of the compute cluster you created for this.
  • name: the name of the job as you’ll see it in the Studio; otherwise it will get a random name assigned to it. Assigning descriptive names to jobs isn’t just for aesthetics: it helps with monitoring and debugging, especially when managing multiple jobs. I settled for a simple experiment_name + n_epochs + a random integer.

Finally, we can just call the create_or_update method with the command_job, and you should be able to see the job in the Studio. The first time, it will take a while for the compute cluster to spin up. While the job is running you will be able to follow the results in the Outputs + logs section.
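A small optional addition: if you keep the job object returned by create_or_update, you can jump straight to the run in the Studio or stream the logs from your terminal. A sketch:

returned_job = ml_client.jobs.create_or_update(command_job)

print(returned_job.studio_url)  # direct link to the run in the Studio
ml_client.jobs.stream(returned_job.name)  # tail the job logs until it finishes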

Conclusion

As we wrap up this guide, it’s crucial to remember that the beauty of Azure lies in its adaptability and expansiveness, even if it has some downsides, such as its complexity and the scarcity of documentation for non-trivial use cases. The steps provided above are a foundational pathway, but the platform offers much more to explore and utilize. By embracing best practices and understanding core Azure functionalities, you’re not only setting up your image classification models for success but also ensuring a smoother and more informed Azure experience. Keep experimenting, keep learning, and remember that every challenge on this journey is an opportunity to grow as a data scientist.
