Machine learning basics

Aug 13, 2022

Before getting into the basics, take a look at the image below for a broad understanding of commonly used programming terms.

Coming to the actual topic, machine learning simply means training software (called a model) to make useful predictions from data. For instance, it is used for weather prediction.

Types of ML systems-

These fall into three distinct categories-

1. Supervised learning

Involves classification and regression. Classification predicts whether something belongs to a particular category (for instance, whether an email is spam), whilst regression predicts a numeric value (for instance, a future house price or a ride time estimate). Classification is further of the following two types-

a) Binary

Assigns each input to one of two possible classes (for example, spam or not spam).

b) Multiclass

Assigns each input to exactly one of more than two possible classes.
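As a rough illustration of classification, here is a minimal sketch that uses a 1-nearest-neighbour rule. That rule is chosen only because it is short; nothing above prescribes a particular algorithm. The function name `predict_nearest` and all of the data are made up for this example.

```python
# Minimal sketch: a 1-nearest-neighbour classifier on invented data.
# The same function handles binary and multiclass labels.

def predict_nearest(train_features, train_labels, query):
    """Return the label of the training example closest to `query`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    distances = [squared_distance(f, query) for f in train_features]
    nearest_index = distances.index(min(distances))
    return train_labels[nearest_index]


# Binary: features = (number of links, number of exclamation marks)
binary_features = [(0, 0), (1, 0), (8, 5), (9, 7)]
binary_labels = ["not spam", "not spam", "spam", "spam"]
binary_prediction = predict_nearest(binary_features, binary_labels, (7, 6))

# Multiclass: features = (temperature in C, humidity in %)
multi_features = [(30, 20), (5, 90), (-5, 80)]
multi_labels = ["sunny", "rainy", "snowy"]
multi_prediction = predict_nearest(multi_features, multi_labels, (28, 25))

print(binary_prediction, multi_prediction)
```

The only difference between the two cases is how many distinct labels appear in the training data.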

2. Unsupervised learning

Used to identify hidden patterns in the data. The data has no labels, so the model groups examples using the patterns it finds in the features. The most commonly used technique is clustering.

a) Clustering

Groups unlabeled data points into clusters based on how similar their feature values are.
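To make clustering concrete, the sketch below runs a simple one-dimensional k-means on made-up numbers. k-means is just one possible clustering technique, picked here for illustration. The function name `kmeans_1d` and the starting centroids are assumptions.

```python
# Minimal sketch: 1-D k-means clustering on invented values.

def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning each value to its nearest centroid
    and moving each centroid to the mean of its assigned values."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are supplied anywhere; the two groups emerge purely from how close the values are to each other.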

3. Reinforcement learning

The model learns which actions to take by receiving rewards or penalties for the actions it performs.
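The idea of learning from rewards and penalties can be sketched with a deliberately simplified, deterministic example. Real reinforcement learning usually involves states, randomness, and an exploration strategy, none of which are modelled here. The environment `REWARDS` and the function `run_agent` are hypothetical.

```python
# Minimal sketch: estimating action values from rewards and penalties.
# The environment (a fixed reward per action) is a made-up assumption.

REWARDS = {"left": -1.0, "right": 1.0}  # penalty for "left", reward for "right"

def run_agent(steps=20):
    estimates = {action: 0.0 for action in REWARDS}  # estimated value per action
    counts = {action: 0 for action in REWARDS}       # times each action was taken
    actions = list(REWARDS)
    for step in range(steps):
        if step < len(actions):
            # Try each action once before acting greedily.
            action = actions[step]
        else:
            # Pick the action with the highest estimated value.
            action = max(estimates, key=estimates.get)
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_agent()
print(estimates, counts)
```

After trying each action once, the agent keeps choosing the action that was rewarded and avoids the one that was penalised.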

Key terms-

(1) Data

Data consists of features and labels. For example, in weather prediction, the features are input values such as latitude, longitude, temperature, humidity, cloud coverage, wind direction, and atmospheric pressure. The label (the value to be predicted) is the rainfall amount.

There can be unlabeled examples too, which contain features but no label. Once the model is trained, it predicts the label for such examples.
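One way to picture features and labels is as fields of a record. The sketch below uses invented weather values; the field names (`temperature_c`, `rainfall_mm`, and so on) and the helper `split_features_and_label` are assumptions made for illustration.

```python
# Minimal sketch: separating features from the label in weather examples.
# All values and field names are invented for illustration.

LABEL = "rainfall_mm"

labeled_examples = [
    {"temperature_c": 18.0, "humidity_pct": 85.0, "pressure_hpa": 1002.0, "rainfall_mm": 12.5},
    {"temperature_c": 29.0, "humidity_pct": 40.0, "pressure_hpa": 1018.0, "rainfall_mm": 0.0},
]

# An unlabeled example has features but no rainfall value yet.
unlabeled_example = {"temperature_c": 21.0, "humidity_pct": 70.0, "pressure_hpa": 1009.0}

def split_features_and_label(example, label_name=LABEL):
    """Return (features, label); label is None if the example is unlabeled."""
    features = {k: v for k, v in example.items() if k != label_name}
    label = example.get(label_name)
    return features, label


features, label = split_features_and_label(labeled_examples[0])
_, missing_label = split_features_and_label(unlabeled_example)
print(features, label, missing_label)
```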

(2) Dataset characteristics

A large dataset with high diversity is ideal for machine learning. But datasets with more features do not always produce better models.

(3) Model

A complex collection of numbers that defines the mathematical relationship between the input features and the output values.

(4) Training

The objective of training is to reduce the loss (the difference between the predicted values and the actual values). This helps the model make better predictions on unseen data. Depending on the correlation amongst the features, some unimportant features can be removed, as they would not change the prediction of the model.
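As a sketch of what "reducing the loss" can look like, the example below fits a one-feature linear model with gradient descent on the mean squared error. Both the model form and the optimisation method are assumptions chosen for simplicity, and the data is invented.

```python
# Minimal sketch: training a one-feature linear model y = w * x + b
# by gradient descent on the mean squared error (MSE) loss.

def mse_loss(w, b, xs, ys):
    """Average squared difference between predicted and actual values."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, learning_rate=0.05, epochs=2000):
    w, b = 0.0, 0.0  # start from an uninformed model
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the MSE with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step against the gradient to lower the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # invented data following y = 2x + 1

initial_loss = mse_loss(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
final_loss = mse_loss(w, b, xs, ys)
print(initial_loss, final_loss, w, b)
```

The loss starts high with the untrained parameters and shrinks towards zero as `w` and `b` approach the values that generated the data.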

(5) Evaluating

The model's predictions are compared to the actual values to measure how well it performs.
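Comparing predictions with actual values can be sketched with two common metrics: mean absolute error for regression and accuracy for classification. These metrics are an assumption, since no specific one is named above, and the numbers are invented.

```python
# Minimal sketch: evaluating predictions against actual values.

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted numbers."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual label exactly."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Regression: rainfall in mm (invented numbers).
actual_rain = [0.0, 5.0, 12.0, 3.0]
predicted_rain = [1.0, 4.0, 10.0, 3.0]
mae = mean_absolute_error(actual_rain, predicted_rain)

# Classification: spam labels (invented).
actual_labels = ["spam", "not spam", "spam", "not spam"]
predicted_labels = ["spam", "not spam", "not spam", "not spam"]
acc = accuracy(actual_labels, predicted_labels)

print(mae, acc)
```

A lower error or a higher accuracy on data the model has not seen during training suggests better performance.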

(6) Inference

Inference means using the trained model to make predictions on new data. The predictions themselves are also called inferences.

Apart from this, it helps to revisit the glossary from time to time.