What is Machine Learning?

Muktha Sai Ajay
Published in DataSeries
6 min read · Sep 29, 2020

All you need to know about Machine Learning

Photo by Andy Kelly on Unsplash

Introduction

Artificial Intelligence (AI) makes it possible for machines to learn from experience, adjust to new inputs, and perform human-like tasks. People across different domains are trying to apply AI to make their jobs a lot easier.

AI and Machine Learning have developed rapidly in recent years, and we encounter both of them in daily life. Their applications range from medical diagnosis to customer recommendations, and the fields are producing breakthroughs that few people imagined.

For example, doctors use AI applications to provide personalized medicine and X-ray readings. In retail, AI provides virtual shopping capabilities that offer personalized recommendations and discuss the purchase options with the consumer. In banking, AI techniques can identify which transactions are likely to be fraudulent.

Machine Learning

Machine Learning is the development of computer programs that allow computers to learn automatically without human intervention. We can call it a tool for transforming information into knowledge. Over the past 40 years, a lot of data has been generated and set aside without being analyzed. This is where Machine Learning comes in: its techniques are used to analyze the data, extract meaningful insights, visualize data, and find valuable patterns. We can even predict future events from the data provided and use those predictions to make decisions.

Applications

We already interact with Machine Learning every day. Sound surprising? It’s true: when you search for an item and later reopen your device, you find advertisements for it. Netflix uses recommendation algorithms to suggest movies in your favorite genres. Some applications of Machine Learning include:

  • Advanced machines can help deliver an accurate diagnosis of a patient.
  • It helps classify whether a received mail is spam or not.
  • Voice recognition and image search are possible thanks to Machine Learning.
  • Search engines offer recommendations based on your previous searches.

Terminology in Machine Learning

1. Dataset

A dataset is a collection of data in tabular form, composed of rows and columns. Every column describes a particular feature, and every row holds the values of those features for one record. Datasets are commonly stored in CSV (Comma-Separated Values) format, though TSV (Tab-Separated Values) and many other formats are also used.
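As a minimal sketch of what a tabular dataset looks like in code, the snippet below reads a small CSV with Python's standard csv module. The column names and values (color, mileage, price) are made up for illustration and do not come from any real dataset.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file on disk.
raw_csv = """color,mileage,price
red,42000,8500
blue,15000,14200
white,87000,4300
"""

# DictReader uses the header line as the column (feature) names
# and yields one dict per data row.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

feature_names = list(rows[0].keys())
print(feature_names)      # ['color', 'mileage', 'price']
print(len(rows))          # 3
print(rows[1]["price"])   # '14200' (values are read as strings)
```

For a TSV file, the same reader accepts delimiter="\t". Note that every value arrives as a string, so numeric columns still need converting before training.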

2. Train Set

The dataset is divided into a train set and a test set. The train set is used to train our model so that it learns the relations within the data and can provide better predictions.

3. Test Set

The test set is used to measure how well our model performs by comparing its predictions with the expected results on data it did not see during training.

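4. Validation Set

A subset held out from the training set is used for validation: it lets us check the model while we are still tuning it, so that the test set stays untouched until the final evaluation.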
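The train, validation, and test sets described above can be produced by shuffling the rows and cutting the result into three parts. The sketch below is one simple way to do it; the function name split_dataset and the 60/20/20 proportions are assumptions chosen for illustration.

```python
import random

def split_dataset(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and cut it into train, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

rows = list(range(10))  # stand-in for ten dataset rows
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 6 2 2
```

Every row lands in exactly one of the three lists, which is what keeps the evaluation honest.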

5. Accuracy

Accuracy is a measure of how well the model performs: the fraction of its predictions that are correct.
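A minimal sketch of the calculation, assuming accuracy is taken in its usual classification sense (correct predictions divided by total predictions). The labels below are invented for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the expected labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

expected = ["spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "spam", "spam", "ham"]
print(accuracy(expected, predicted))  # 0.8 (4 of 5 correct)
```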

6. Categorical data

Categorical data takes discrete, qualitative values. For example, the color of a car can be red, blue, white, or black.
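Because many models expect numbers, categorical values are usually converted before training. One common technique is one-hot encoding, sketched below. The article does not prescribe a particular encoding, so treat this as one option among several; the helper name one_hot_encode is hypothetical.

```python
def one_hot_encode(values):
    """Map each category to a list with a single 1 at that category's position."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded

colors = ["red", "blue", "white", "black", "red"]
categories, encoded = one_hot_encode(colors)
print(categories)   # ['black', 'blue', 'red', 'white']
print(encoded[0])   # [0, 0, 1, 0]  ("red")
```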

7. Feature Engineering

The process of determining which features might be useful for training a model.

8. Imputation

It is a technique used for handling missing values in the data.
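As an illustration, the sketch below fills missing entries with the mean of the observed values in the same column. Mean imputation is only one of several strategies (median or most-frequent value are common alternatives), and the mileage numbers are made up.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

mileage = [42000, None, 87000, 15000, None]
print(impute_mean(mileage))  # [42000, 48000.0, 87000, 15000, 48000.0]
```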

Steps in Machine Learning

Photo by Franki Chamaki on Unsplash

Get the Data

You need to collect data that fits your problem statement. The quality and quantity of the data gathered determine your model’s performance.

Data Preparation

The collected data may not be in the required form. For example, many machine learning models won’t accept categorical data directly, and missing values or unnormalized data can also cause problems.

In this step, we fill in the missing data or, if necessary, drop the records that contain missing values. We also convert categorical values into numerical values. All independent variables (the model’s inputs) should end up in numeric form; otherwise it would be difficult for the model to take them as input and train.
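Normalization, mentioned above, can be sketched with min-max scaling, which rescales a numeric column to the range 0 to 1. This is one common choice and an assumption here, since the article does not name a specific method.

```python
def min_max_normalize(values):
    """Rescale numbers linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

ages = [10, 20, 30, 50]
print(min_max_normalize(ages))  # [0.0, 0.25, 0.5, 1.0]
```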

Choosing a Model

There are many machine learning models, and model selection plays a vital role in determining the predictive power of the solution. For continuous value prediction, we generally use regression models, and for categorical class prediction, we use classification algorithms like Logistic Regression, Decision Trees, and many more.

Training

Once the data is organized, we randomly divide it into a train set and a test set. We then train our model on the training data so that it learns the relations within the data and makes better predictions.
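To make "training" concrete, here is a minimal sketch that fits a straight line to training data with ordinary least squares and then predicts for an unseen input. The model choice (a one-feature linear regression) and the data are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data that follows y = 2x + 1 exactly.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(train_x, train_y)

# Predict for an input the model did not see during training.
test_x = 10.0
prediction = slope * test_x + intercept
print(prediction)  # 21.0
```

"Learning the relations within the data" here means recovering the slope and intercept from examples alone.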

Evaluating

Once our model is trained, we evaluate its performance using the test data. We make use of evaluation metrics to measure that performance. For example, in classification models, we make use of Precision, Recall, and F1 score.
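The three classification metrics named above can be computed from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions for a binary problem; the labels are invented, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

In this example there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 3/4.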

Parameter Tuning

If the evaluation is successful, we proceed to the next step. If the evaluation scores are low, the model may be underfitting or overfitting. In that case, try adjusting the learning rate, changing the number of epochs (more if the model underfits, fewer if it overfits), and tuning the other hyper-parameters.
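A rough sketch of hyper-parameter tuning: train the same tiny model with several learning rates for a fixed number of epochs, then keep the setting with the lowest error on held-out validation data. The model (y = w * x fitted by gradient descent), the candidate learning rates, and the data are all assumptions for illustration.

```python
def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # follows y = 2x
val_x, val_y = [4.0, 5.0], [8.0, 10.0]

# Try each candidate learning rate and record the validation error.
results = {}
for lr in (0.001, 0.01, 0.1):
    w = train(train_x, train_y, learning_rate=lr, epochs=20)
    results[lr] = mse(w, val_x, val_y)

best_lr = min(results, key=results.get)
print(best_lr)
```

With only 20 epochs, the smaller learning rates have not yet converged, so the largest candidate wins here. On other problems a rate that is too large can make training diverge, which is why the value is tuned rather than fixed.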

Prediction

Prediction is the final stage, where we consider our model ready to face real-world, practical applications. This stage is what end users see when they use the machine learning model within their respective industries.

Types of Machine Learning

Supervised Machine Learning

Supervised Learning is where the data is labeled, and the program learns to predict the output from the input data.

Regression:

In regression problems, we are trying to predict a continuous value. Examples are:

  • What is the housing price in Tokyo?
  • What is the value of cryptocurrencies?

Classification:

In classification problems, we are trying to predict one of a discrete set of values (classes). Examples are:

  • Is this email spam?
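As a toy illustration of classification, the sketch below labels an email by copying the label of its most similar training example (a 1-nearest-neighbor classifier). The two features and all the numbers are hypothetical; real spam filters use far richer features and models.

```python
def predict_nearest(train_points, train_labels, point):
    """Return the label of the training point closest to `point`."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    best = min(range(len(train_points)),
               key=lambda i: squared_distance(train_points[i], point))
    return train_labels[best]

# Hypothetical features per email: (number of links, number of exclamation marks)
emails = [(0, 0), (1, 0), (7, 5), (9, 8)]
labels = ["not spam", "not spam", "spam", "spam"]

print(predict_nearest(emails, labels, (8, 6)))  # spam
print(predict_nearest(emails, labels, (0, 1)))  # not spam
```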

Unsupervised Learning

Unsupervised Learning is a type of machine learning where the model learns the structure of the data based on unlabeled examples.

Clustering is a common unsupervised machine learning approach that finds patterns and structures in unlabeled data by grouping them into clusters.

Examples

  • Consumer sites clustering users for recommendations.
  • Search engines grouping similar objects into one cluster.
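Clustering can be sketched with k-means, one widely used algorithm (the article does not name a specific one, so this is an assumption). The version below works on one-dimensional data, and the visit durations and starting centers are made up.

```python
def k_means_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical data: minutes spent on a site per visit, with two visible groups.
minutes = [1.0, 2.0, 3.0, 20.0, 21.0, 22.0]
centers, clusters = k_means_1d(minutes, centers=[0.0, 10.0])
print(centers)    # [2.0, 21.0]
print(clusters)   # [[1.0, 2.0, 3.0], [20.0, 21.0, 22.0]]
```

No labels are supplied anywhere: the two groups emerge from the distances between the values alone.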

Semi-supervised Learning

Supervised learning algorithms are trained on datasets whose labels are added by domain experts. Unsupervised algorithms are trained on unlabeled data and must discover the patterns and important features in the data on their own.

Semi-supervised learning algorithms are trained on a combination of labeled and unlabeled data. The training data typically consists of a small amount of labeled data and a large amount of unlabeled data.
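One simple semi-supervised approach is self-training: fit a model on the few labeled examples, use it to assign pseudo-labels to the unlabeled ones, then refit on everything. The sketch below does a single round with a nearest-centroid classifier on one-dimensional data. The technique choice, the function names, and the data are assumptions for illustration; practical self-training usually keeps only confident pseudo-labels and repeats the loop.

```python
def centroid(values):
    """Mean of a list of numbers."""
    return sum(values) / len(values)

def self_train(labeled, unlabeled):
    """One round of self-training with a nearest-centroid classifier."""
    # 1. Fit on the small labeled set: one centroid per class.
    by_class = {}
    for x, label in labeled:
        by_class.setdefault(label, []).append(x)
    centroids = {label: centroid(xs) for label, xs in by_class.items()}

    # 2. Pseudo-label each unlabeled point with the class of the nearest centroid.
    pseudo = [(x, min(centroids, key=lambda c: abs(x - centroids[c])))
              for x in unlabeled]

    # 3. Refit the centroids on labeled plus pseudo-labeled data.
    for x, label in pseudo:
        by_class[label].append(x)
    refit = {label: centroid(xs) for label, xs in by_class.items()}
    return refit, pseudo

labeled = [(1.0, "low"), (9.0, "high")]            # small labeled set
unlabeled = [0.0, 2.0, 3.0, 8.0, 10.0, 12.0]       # larger unlabeled set
centroids, pseudo = self_train(labeled, unlabeled)
print(centroids)  # {'low': 1.5, 'high': 9.75}
```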

Reinforcement Learning

Reinforcement Learning (RL) is a technique that enables an agent to learn in an interactive environment using feedback from its own actions and experiences. It uses rewards and punishments as signals for positive and negative feedback.

Credits: researchgate.net

Reinforcement Learning has five essential elements:

  1. Agent: The program you train to perform a specific task.
  2. Environment: The place where the agent learns and performs actions.
  3. State: The current situation of the agent within the environment.
  4. Action: A move made by the agent, which changes its state in the environment.
  5. Rewards: The evaluation of an action, which can be positive or negative.
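The elements listed above can be seen together in a small sketch of tabular Q-learning, one well-known RL algorithm (chosen here as an assumption; the article does not name one). The environment is a hypothetical five-cell corridor where the agent starts at the left end and is rewarded only for reaching the right end. All names and hyper-parameter values are illustrative.

```python
import random

N_STATES = 5          # states 0..4 in a corridor; state 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train_agent(episodes=1000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn a table q[state][action] of expected future reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(200):                 # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.randrange(2)             # explore: random action
            else:
                best = max(q[state])             # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move q toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q

q = train_agent()
# Greedy policy for the non-goal states 0..3.
policy = ["left" if row[0] > row[1] else "right" for row in q[:-1]]
print(policy)
```

After training, the greedy policy should favor moving right in every non-goal state, since that is the only way to collect the reward.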

Thank you for reading my article. I will be happy to hear your opinions. Follow me on Medium to get updated on my latest articles. You can also connect with me on LinkedIn and Twitter. Check out my blogs on Machine Learning and Deep Learning.
