Introduction to AI/ML Jargon

Jitin Maherchandani
Published in Analytics Vidhya
5 min read · Oct 8, 2020

Hello everyone! In my current organisation I have been working on making our products smarter. The work mostly involves projects that require playing around with customer data to make smarter recommendations and decisions.

I felt that penning this down in a blog would be a good exercise for me, as well as for any aspiring tech enthusiast who is new to AI and ML.

I will be sharing a series of blogs on this vast topic, and this is part 1 of that series.

Data is Key in making intelligent decisions

What is AI?

AI (Artificial Intelligence) is the idea that machines can possess intelligence. Since there is no precise definition of intelligence, and humans consider themselves the most intelligent species on the planet, a machine that exhibits human-like behaviour is said to be intelligent.

What is Machine Learning?

Machine learning is the technique of inducing knowledge into a machine. Inducing knowledge into someone is usually called learning, hence the name. An ML model is shorthand for a trained (or yet-to-be-trained) model that is expected to perform some intelligent task. For example, a chatbot may comprise a neural network that interprets speech and converts it into text, plus another statistical model that filters the keywords out of the converted query.
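To make the chatbot example concrete, here is a toy sketch of that two-stage pipeline. Everything in it is a stand-in: speech_to_text is a hypothetical placeholder for a neural speech-recognition model, and the keyword filter is just a stop-word check rather than a real statistical model.

```python
# Toy sketch of a two-stage chatbot pipeline: speech -> text -> keywords.
STOP_WORDS = {"the", "is", "a", "an", "to", "of", "my", "please"}

def speech_to_text(audio_clip) -> str:
    # Placeholder: a real system would run a trained speech-recognition model here.
    return "please check the status of my order"

def extract_keywords(text: str) -> list[str]:
    # Placeholder "statistical model": keep only the informative words.
    return [w for w in text.lower().split() if w not in STOP_WORDS]

query = speech_to_text(audio_clip=None)
print(extract_keywords(query))  # ['check', 'status', 'order']
```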

What is a machine learning model?

A machine learning model is a file that has been trained to recognize certain types of patterns. You train a model over a set of data, providing it an algorithm that it can use to reason over and learn from those data.

Once you have trained the model, you can use it to reason over data that it hasn’t seen before, and make predictions about those data. For example, let’s say you want to build an application that can recognize a user’s emotions based on their facial expressions. You can train a model by providing it with images of faces that are each tagged with a certain emotion, and then you can use that model in an application that can recognize any user’s emotion.
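A minimal sketch of that train-then-predict pattern is below, using scikit-learn (an assumption; the post does not name a library). The "face images" here are hypothetical feature vectors rather than real pixel data, purely to show the shape of the workflow.

```python
# Train on labelled examples, then predict the label of an unseen example.
import numpy as np
from sklearn.linear_model import LogisticRegression

emotions = ["happy", "sad", "angry"]

# Hypothetical training data: 300 face images, each reduced to 64 features,
# each tagged with one emotion label.
X_train = np.random.rand(300, 64)
y_train = np.random.choice(emotions, size=300)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The trained model can now label a face it has never seen before.
new_face = np.random.rand(1, 64)
print(model.predict(new_face))  # e.g. ['happy']
```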

Model training logic

To train a model, one should first work on getting the dataset.

Once we have a dataset, we split it into three parts:

  • Training
  • Validation
  • Test
Different types of data sets

The question is, do we split it evenly? How do practitioners approach this? There is no hard rule, but the common divisions are:

Splitting data sets

The dataset we train the model on should be considerably larger than the other two. We want to devote as much data as possible to training while still keeping enough samples to validate and test.
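Here is a minimal sketch of such a split with scikit-learn. The 70/15/15 ratio is an assumption for illustration; the post only says the training portion should be the largest.

```python
# Split a dataset into training (70%), validation (15%) and test (15%) sets.
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.random.rand(1000, 10), np.random.randint(0, 2, 1000)  # hypothetical dataset

# First carve off 30% of the data, then split that remainder half-and-half
# into validation and test sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 700 150 150
```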

Each dataset is used as follows:

Training set: This is used to build our prediction algorithm. The algorithm tunes itself to the quirks of the training data. In this phase we usually create multiple candidate algorithms so that we can compare their performance during the validation phase.

Validation set: This dataset is used to compare the performance of the prediction algorithms that were built on the training set. We choose the algorithm with the best validation performance.

Test set: Now we have chosen our preferred prediction algorithm, but we don't yet know how it will perform on completely unseen real-world data. So we apply the chosen algorithm to the test set to get an idea of its performance on unseen data.
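The sketch below walks through those three phases end to end with scikit-learn. The two candidate models and the dataset are assumptions chosen only to show the pattern: fit every candidate on the training set, pick the winner on the validation set, and report a single final number on the test set.

```python
# Train several candidates, select on validation data, report on test data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical dataset, split 70/15/15 as in the earlier sketch.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=42)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Training phase: fit every candidate on the training set only.
for model in candidates.values():
    model.fit(X_train, y_train)

# Validation phase: compare candidates and keep the best performer.
best_name = max(candidates, key=lambda n: accuracy_score(y_val, candidates[n].predict(X_val)))
best_model = candidates[best_name]

# Test phase: one final check on data that never influenced model building.
print(best_name, accuracy_score(y_test, best_model.predict(X_test)))
```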

How does training and validation work together?

"Every now and then" we validate the model logic by running the model on the validation dataset.

What does "every now and then" mean? Usually we validate once per epoch: each time we adjust the weights against the training loss, we follow up with a validation pass.
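A minimal sketch of this rhythm is below, using Keras (an assumption; the post does not name a framework). Passing validation_data to fit makes Keras evaluate the model on the validation set at the end of every epoch, which matches the "validate once per epoch" idea above.

```python
# Train for 10 epochs, validating at the end of each epoch.
import numpy as np
import tensorflow as tf

# Hypothetical data: 1000 training samples, 200 validation samples, 20 features.
X_train, y_train = np.random.rand(1000, 20), np.random.randint(0, 2, 1000)
X_val, y_val = np.random.rand(200, 20), np.random.randint(0, 2, 200)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Keras reports val_loss / val_accuracy after each of the 10 epochs.
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=10, batch_size=32, verbose=2)
```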

The following diagram represents when we should train and validate, and when we should stop.

After we have trained and validated the model, it's time to measure its predictive power. This is done by running the model on a dataset it has never seen before, which is equivalent to applying the model in real life.

The accuracy we get on the test data is the accuracy we can expect once the model is deployed in real life, so evaluating on the test dataset is the last step we take.

Example:

One way to think of these three sets is that two of them (training and validation) come from the past, whereas the test set comes from the “future”. The model should be built and tuned using data from the “past” (training/validation data), but never test data which comes from the “future”.

To give a practical example, let's say we are building a model to predict how well baseball players will do in the future. We will use data from 1899–2018 to create the training and validation sets. Once the model is built and tuned on that data, we will use data from 2019–2020 (actually in the past!) as the test set, which from the perspective of the model looks like "future" data and in no way influenced the model creation.
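A small sketch of this "past vs. future" split is below, using pandas. The column names, example rows, and year boundaries are assumptions for illustration only.

```python
# Hold out the most recent seasons as the unseen "future" test set.
import pandas as pd

players = pd.DataFrame({
    "year": [1950, 1987, 2005, 2018, 2019, 2020],
    "stat": [0.61, 0.74, 0.68, 0.80, 0.77, 0.83],
})

# Build and tune the model only on "past" seasons...
train_val = players[players["year"] <= 2018]

# ...and keep the most recent seasons completely out of model building.
test = players[players["year"] >= 2019]

print(len(train_val), "train/validation rows,", len(test), "test rows")
```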
