Introduction To Machine Learning (ML)

Uday Kiran
11 min read · Jan 27, 2024


Artificial intelligence :-

Artificial intelligence is a field of science concerned with building computers and machines that can reason, learn, and act in ways that would normally require human intelligence, or that involve data whose scale exceeds what humans can analyze.

AI is a broad field that encompasses many different disciplines, including computer science, data analytics and statistics, hardware and software engineering, linguistics, neuroscience, and even philosophy and psychology.

On an operational level for business use, AI is a set of technologies that are based primarily on machine learning and deep learning, used for data analytics, predictions and forecasting, object categorization, natural language processing, recommendations, intelligent data retrieval, and more.

Artificial intelligence (AI) makes it possible for machines to learn from experience, adjust to new inputs, and perform human-like tasks. Most AI examples that you hear about today, from chess-playing computers to self-driving cars, rely heavily on deep learning and NLP. Using these technologies, computers can be trained to accomplish specific tasks by processing large amounts of data and recognizing patterns in the data.

Types of artificial intelligence :-

Based On Capabilities :

  1. Narrow AI: Narrow AI, also known as weak AI, is artificial intelligence designed to perform a single, specific task, such as filtering spam or recommending products.
  2. General AI: General AI is the intelligence of machines that allows them to comprehend, learn, and perform intellectual tasks much like humans.
  3. Super AI: Super AI, also known as artificial superintelligence (ASI), is theoretical. ASI’s superior capabilities would apply across many disciplines and industries and include cognition, general intelligence, problem-solving abilities, social skills, and creativity.

Based On Functionalities :

1. Reactive Machines

Reactive machines are AI systems that have no memory and are task-specific, meaning that a given input always delivers the same output. Machine learning models such as recommendation engines tend to be reactive machines, because they take customer data, such as purchase or search history, and use it to deliver recommendations to the same customers.

Example: Netflix recommendations: Netflix’s recommendation engine is powered by machine learning models that process the data collected from a customer’s viewing history to determine specific movies and TV shows that they will enjoy. Humans are creatures of habit — if someone tends to watch a lot of Korean dramas, Netflix will show a preview of new releases on the home page.

2. Limited memory machines

The next type of AI in its evolution is limited memory. These systems imitate the way our brains’ neurons work together, meaning they get smarter as they receive more data to train on. Deep learning algorithms improve natural language processing (NLP), image recognition, and other types of reinforcement learning.

Example: Self-driving cars: A good example of limited memory AI is the way self-driving cars observe other cars on the road for their speed, direction, and proximity. This information is programmed as the car’s representation of the world, such as knowing traffic lights, signs, curves, and bumps in the road. The data helps the car decide when to change lanes so that it does not get hit or cut off another driver.

3. Theory of mind

The first two types of AI, reactive machines and limited memory, are types that currently exist. Theory of mind and self-aware AI are theoretical types that could be built in the future; as such, there aren’t any real-world examples yet. If it is developed, theory of mind AI could have the potential to understand the world and how other entities have thoughts and emotions. In turn, this would affect how it behaves in relation to those around it.

Human cognitive abilities are capable of processing how our own thoughts and emotions affect others, and how others’ affect us; this is the basis of human relationships in our society. In the future, theory of mind AI machines could be able to understand intentions and predict behavior, as if to simulate human relationships.

4. Self-awareness

The grand finale for the evolution of AI would be to design systems that have a sense of self, a conscious understanding of their existence. This type of AI does not exist yet. This goes a step beyond theory of mind AI and understanding emotions to being aware of themselves, their state of being, and being able to sense or predict others’ feelings. For example, “I’m hungry” becomes “I know I am hungry” or “I want to eat lasagna because it’s my favorite food.”

Artificial intelligence and machine learning algorithms are a long way from self-awareness because there is still so much to uncover about the human brain’s intelligence and how memory, learning, and decision-making work.

Machine Learning :-

Machine learning (ML) is a subdomain of artificial intelligence (AI) that focuses on developing systems that learn or improve performance based on the data they ingest. Artificial intelligence is a broad term that refers to systems or machines that mimic human intelligence. Machine learning and AI are frequently discussed together, and the terms are occasionally used interchangeably, although they do not mean the same thing. A crucial distinction is that, while all machine learning is AI, not all AI is machine learning.

Machine Learning is the field of study that gives computers the capability to learn without being explicitly programmed. ML is one of the most exciting technologies one could come across. As is evident from the name, it gives the computer the capability that makes it most similar to humans: the ability to learn. Machine learning is actively being used today, perhaps in many more places than one would expect.

Types of Machine Learning :

There are four types of machine learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

  1. Supervised learning : Supervised learning is a type of machine learning where the algorithm is trained on a labeled dataset, which means that the input data has corresponding output labels. The goal of supervised learning is to learn a mapping from input features to the correct output labels, so that the algorithm can make accurate predictions on new, unseen data.

Supervised learning can be further categorized into three main types:

a) Classification: In classification tasks, the machine learning program must draw a conclusion from observed values and determine to what category new observations belong. For example, when filtering emails as ‘spam’ or ‘not spam’, the program must look at existing observational data and filter the emails accordingly.
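
To make this concrete, here is a minimal spam-filter sketch using scikit-learn; the tiny inline email dataset is invented purely for illustration.

```python
# A minimal spam-classification sketch with scikit-learn.
# The tiny inline email dataset is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = [
    "win a free prize now", "limited offer, claim your reward",
    "meeting rescheduled to friday", "please review the attached report",
]
labels = ["spam", "spam", "not spam", "not spam"]  # known output labels

vectorizer = CountVectorizer()        # turn raw text into word-count features
X = vectorizer.fit_transform(emails)

model = MultinomialNB()               # Naive Bayes suits word-count features
model.fit(X, labels)                  # learn the mapping from features to labels

new_email = vectorizer.transform(["claim your free reward today"])
print(model.predict(new_email))       # e.g. ['spam']
```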

b) Regression: In regression tasks, the machine learning program must estimate — and understand — the relationships among variables. Regression analysis focuses on one dependent variable and a series of other changing variables — making it particularly useful for prediction and forecasting.
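
For illustration, here is a minimal linear regression sketch with scikit-learn, fitting one dependent variable (price) against one input variable; the numbers are invented.

```python
# A minimal regression sketch with scikit-learn.
# The house-size and price numbers are invented purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[50], [80], [100], [120], [150]])  # house size in square meters
y = np.array([150, 230, 290, 340, 430])          # price in thousands

model = LinearRegression()
model.fit(X, y)                 # estimate the relationship between size and price

print(model.predict([[110]]))   # predicted price for a 110 m2 house
```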

c) Forecasting: Forecasting is the process of making predictions about the future based on past and present data, and is commonly used to analyze trends.

2. Unsupervised learning :

Unsupervised learning is a type of machine learning where the algorithm is given data without explicit instructions on what to do with it. In unsupervised learning, the algorithm tries to find patterns, relationships, or structures within the data without the presence of labeled output. The objective is often to explore the inherent structure of the data or to group similar data points together.

Unsupervised learning can be further categorized into two main types:

a) Clustering: Clustering involves grouping sets of similar data points. It’s useful for segmenting data into several groups and performing analysis on each group to find patterns.

  • Example: K-means clustering is a popular algorithm used for this purpose. It divides the data into k clusters, where each cluster represents a group of similar data points.
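
For example, a minimal K-means sketch with scikit-learn on a small invented 2-D dataset:

```python
# A minimal K-means clustering sketch with scikit-learn.
# The 2-D points are invented purely for illustration.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [2, 3],      # one loose group of points
              [8, 8], [9, 10], [10, 9]])   # another loose group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)                              # partition the data into k = 2 clusters

print(kmeans.labels_)                      # cluster index assigned to each point
print(kmeans.cluster_centers_)             # coordinates of the two centroids
```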

b) Dimension reduction: Dimension reduction reduces the number of variables under consideration while preserving the essential information.

  • Example: Principal Component Analysis (PCA) is a common technique used for dimensionality reduction. It identifies the principal components that capture the most variability in the data.
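
A minimal PCA sketch with scikit-learn, projecting the classic 4-feature iris dataset down to two principal components:

```python
# A minimal dimensionality-reduction sketch with scikit-learn's PCA.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                  # 150 samples x 4 features

pca = PCA(n_components=2)             # keep the 2 directions of highest variance
X_reduced = pca.fit_transform(X)      # now 150 samples x 2 features

print(X_reduced.shape)                # (150, 2)
print(pca.explained_variance_ratio_)  # variance captured by each component
```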

3. Semi-Supervised Learning :

Semi-supervised learning is a type of machine learning that falls between supervised learning and unsupervised learning. In semi-supervised learning, the algorithm is trained on a dataset that contains both labeled and unlabeled examples. The availability of a small amount of labeled data and a larger amount of unlabeled data is a common scenario in many real-world applications.

The goal of semi-supervised learning is to leverage both the labeled and unlabeled data to improve the performance of the model.
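
As a minimal sketch of this idea, scikit-learn’s SelfTrainingClassifier lets a supervised model teach itself from unlabeled samples, which are conventionally marked with -1; here we artificially hide most of the iris labels for illustration.

```python
# A minimal semi-supervised learning sketch with scikit-learn.
# Unlabeled samples are marked with the label -1.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

rng = np.random.RandomState(0)
y_partial = y.copy()
y_partial[rng.rand(len(y)) < 0.7] = -1   # artificially hide ~70% of the labels

base = SVC(probability=True)             # base learner must expose predict_proba
model = SelfTrainingClassifier(base)
model.fit(X, y_partial)                  # learns from labeled + unlabeled data

print(model.score(X, y))                 # accuracy against the full ground truth
```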

4. Reinforcement Learning :

Reinforcement Learning (RL) is a type of machine learning paradigm where an agent learns to make decisions by interacting with an environment. The agent receives feedback in the form of rewards or penalties, which guides its learning process. The goal of reinforcement learning is for the agent to learn a strategy, called a policy, that maximizes the cumulative reward over time.
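
A bare-bones Q-learning sketch gives the flavor; the 5-cell corridor environment below is invented purely for illustration, and the agent earns a reward only for reaching the rightmost cell.

```python
# A bare-bones Q-learning sketch on an invented 1-D corridor of 5 cells.
# The agent starts at cell 0 and is rewarded for reaching cell 4.
import random

n_states, n_actions = 5, 2          # actions: 0 = move left, 1 = move right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(500):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda act: Q[s][act])
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Update rule: move Q(s, a) toward reward + discounted best future value
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

print([round(max(q), 2) for q in Q])    # learned values grow toward the goal
```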

The most common and popular machine learning algorithms:

  • Naïve Bayes Classifier Algorithm (Supervised Learning — Classification)
  • K-Means Clustering Algorithm (Unsupervised Learning — Clustering)
  • Support Vector Machine Algorithm (Supervised Learning — Classification)
  • Linear Regression (Supervised Learning — Regression)
  • Logistic Regression (Supervised Learning — Classification)
  • Artificial Neural Networks (Supervised Learning, also used in Reinforcement Learning)
  • Decision Trees (Supervised Learning — Classification/Regression)
  • Random Forests (Supervised Learning — Classification/Regression)
  • K-Nearest Neighbors (Supervised Learning — Classification/Regression)

Deep Learning :-

Deep learning can be considered a subset of machine learning. It is a field based on algorithms that learn and improve on their own from large amounts of data. While machine learning uses simpler concepts, deep learning works with artificial neural networks, which are designed to imitate how humans think and learn. Until recently, neural networks were limited by computing power and thus were limited in complexity. However, advancements in big data analytics have permitted larger, more sophisticated neural networks, allowing computers to observe, learn, and react to complex situations faster than humans. Deep learning has aided image classification, language translation, and speech recognition. It can be used to solve any pattern recognition problem, and without human intervention.
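
As a small taste, here is a minimal multi-layer neural network sketch using scikit-learn’s MLPClassifier; real deep learning work typically uses dedicated frameworks such as TensorFlow or PyTorch.

```python
# A minimal neural-network sketch with scikit-learn's MLPClassifier.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)     # 8x8 images of handwritten digits
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers of 64 and 32 neurons, loosely imitating stacked neurons
model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
model.fit(X_train, y_train)

print(model.score(X_test, y_test))      # classification accuracy on unseen data
```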

Machine Learning Development Life Cycle :

The Machine Learning Development Lifecycle is a framework for building machine learning models that emphasizes the importance of a systematic and iterative approach to development. The MLDLC typically consists of the following stages:

  • Problem Statement (or) Framing the problem :

In this stage, you decide what the problem is and how to solve it. Who is your customer? How much will it cost? How many people will be on the team? What will the end product look like? Which machine learning model will you be using? Where will you deploy it? Which framework will be used? Where will the data come from, and what will its source be?

  • Gathering Data (or) Data Collection :

Gather relevant data from various sources, considering data quality and availability.

APIs: Hit the API using Python code and fetch data in JSON format.
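
A minimal sketch with the requests library; the URL below is a placeholder, not a real endpoint.

```python
# A minimal API-fetching sketch using the requests library.
# The URL is a placeholder, not a real endpoint.
import requests

response = requests.get("https://api.example.com/v1/records")  # hypothetical API
response.raise_for_status()    # fail loudly on HTTP errors

data = response.json()         # parse the JSON payload into Python objects
print(type(data))
```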

Web Scraping: Sometimes data is not publicly available as a dataset, i.e., it lives on some website, so we need to extract it from there. For example, Trivago uses this method to collect hotel price data from many websites.

Data Warehouses: Data is also stored in databases, but that data cannot be used directly because it is live, running data. So data from a database is first copied into a data warehouse and then used.

Clusters: Big data is also sometimes stored across clusters and processed with tools like Spark, so the data is fetched from these clusters.
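
A minimal PySpark sketch for reading data off a cluster; the file path is a placeholder.

```python
# A minimal PySpark sketch for fetching data stored on a cluster.
# The HDFS path below is a placeholder for your own data location.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("data-collection").getOrCreate()

df = spark.read.csv("hdfs:///data/sales.csv", header=True, inferSchema=True)
df.show(5)          # peek at the first few rows
df.printSchema()    # inspect the inferred column types
```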

  • Data Pre-Processing:

When you take data from external sources, it is bound to be unclean or dirty, and you cannot use it directly. You cannot pass this data straight to a machine learning model, because the results will not be good. Data can have structural issues; it can have missing values and contain outliers and noise.

So here, you need data preprocessing. It involves removing duplicates, handling missing values, removing outliers, and scaling the values (standardization). The core idea behind data preprocessing is to bring the data into a format that is easily consumed by your machine learning model.
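
A minimal preprocessing sketch with pandas and scikit-learn; the file name and the "age" and "income" columns are assumptions for illustration.

```python
# A minimal preprocessing sketch with pandas and scikit-learn.
# "customers.csv" and its "age"/"income" columns are assumptions.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("customers.csv")   # placeholder file name

df = df.drop_duplicates()           # remove duplicate rows
df = df.dropna()                    # drop rows with missing values

# Clip extreme outliers in "income" to the 1st-99th percentile range
low, high = df["income"].quantile([0.01, 0.99])
df["income"] = df["income"].clip(low, high)

# Standardize numeric features to zero mean and unit variance
scaler = StandardScaler()
df[["age", "income"]] = scaler.fit_transform(df[["age", "income"]])
```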

  • Exploratory Data Analysis [EDA]:

In this stage, you analyze the data, which means you study the relationships between the input and output variables. The whole idea is that you have to build ML-based software, and before building it you need to know what is in your data; if you don’t know this, you cannot build the model properly.

At this stage, you perform many experiments with the data to extract the hidden relationships in it. This stage yields insights through data visualization, univariate analysis, bivariate analysis, multivariate analysis, outlier detection, and handling imbalanced datasets. The whole idea behind this stage is to get a concrete picture of the data. The more time we spend on EDA, the more we get to know about the data, which helps in decision-making while implementing models.
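
A minimal EDA sketch with pandas and matplotlib, assuming the same hypothetical customers.csv as above.

```python
# A minimal EDA sketch with pandas and matplotlib.
# Assumes the same hypothetical "customers.csv" as in preprocessing.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("customers.csv")

print(df.describe())                  # summary statistics per column
print(df.corr(numeric_only=True))     # pairwise correlations between variables

df["income"].hist(bins=30)            # univariate analysis: one distribution
plt.xlabel("income")
plt.show()

df.plot.scatter(x="age", y="income")  # bivariate analysis: two variables
plt.show()
```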

  • Feature Engineering And Feature Selection:

Features are the input columns. Features are important because the output depends on the inputs (features). The idea behind feature engineering is that you sometimes create new columns from existing ones, or make intelligent changes to existing columns, to make analysis easier.

Assume you want to predict house prices and have input columns such as number of rooms, number of bathrooms, locality, and so on. In this scenario, you could remove the number of rooms and bathrooms and replace them with a single column that represents both. So what is the benefit? You have one column instead of two; this is feature engineering.
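
A minimal pandas sketch of this idea; the column names and numbers are invented for illustration.

```python
# A minimal feature-engineering sketch with pandas.
# Column names and values are invented purely for illustration.
import pandas as pd

df = pd.DataFrame({
    "rooms":     [3, 4, 2],
    "bathrooms": [1, 2, 1],
    "price":     [300, 450, 220],
})

# Combine two related columns into one engineered feature,
# then drop the originals so the model sees a single input.
df["total_rooms"] = df["rooms"] + df["bathrooms"]
df = df.drop(columns=["rooms", "bathrooms"])

print(df)
```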

feature selection-

Sometimes you have many features, say 100 or 200, but you cannot proceed with all of them, for two reasons:

  1. Not all features are helpful for building the model. Not every input affects the output, so you need to remove the features that have no impact on it.
  2. With more columns, it takes more time to train the model, so by removing irrelevant columns you can save time.

Both feature engineering and feature selection are crucial.
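
A minimal feature-selection sketch with scikit-learn’s SelectKBest, which keeps the k features most statistically related to the output:

```python
# A minimal feature-selection sketch with scikit-learn's SelectKBest.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)     # 4 input features

selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(X_selected.shape)               # (150, 2): only 2 features kept
print(selector.get_support())         # boolean mask of the chosen features
```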

  • Training And Evaluation:

Once you’re sure about your data, you are ready to train the model. You try several different machine learning algorithms, training each one on your data. In practice, nobody trains only one algorithm: no single algorithm is best for every kind of data, and you never know in advance which one will turn out to be good for your particular dataset. So you train several candidates, evaluate each of them, and keep the best performer.
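
A minimal sketch of trying several algorithms and comparing them with cross-validation:

```python
# A minimal train-and-evaluate sketch: compare several algorithms with
# cross-validation and keep the best performer.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "k_nearest_neighbors": KNeighborsClassifier(),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```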

  • Deployment:

In this step, we deploy the model in a real-world system. If the prepared model produces accurate results as per our requirements, with acceptable speed, we deploy it in the real system. But before deploying the project, we check whether it improves its performance using the available data. For deployment we can use Heroku, Amazon Web Services, Google Cloud Platform, etc. Now our model is online and serves user requests.
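
A minimal sketch of serving a trained model behind an HTTP endpoint with Flask; the model file name and the expected input format are assumptions for illustration.

```python
# A minimal model-serving sketch with Flask.
# "model.pkl" and the expected "features" field are assumptions.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:    # a model saved earlier with pickle
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # e.g. [[5.1, 3.5, 1.4, 0.2]]
    prediction = model.predict(features)
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run()                         # serves at http://127.0.0.1:5000
```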

  • Testing:

In this step, we check the accuracy of our model by providing a test dataset. Testing the model determines the percentage accuracy of the model as per the requirements of the project or problem.

  • Optimizing:

In this stage, companies use servers to take backups of models and data, perform load balancing (serving requests when many users access the model at once), and handle retraining (frequently re-training models as data evolves over time). This step is generally automated.

Conclusion

The Machine Learning Development Lifecycle provides a structured and systematic approach to building machine learning models that can help improve their accuracy, reliability, and relevance over time. By following a well-defined process, developers can ensure that their models are built using best practices and that they are well-suited to the problem at hand. This can help businesses harness the power of machine learning to solve complex problems, drive innovation, and gain a competitive edge in the digital age.
