Stories by Kamal Singh Rathore on Medium

Quantum Computing: Part 1, Understanding Classical Systems

Kamal Singh Rathore — Wed, 11 Feb 2026 14:27:29 GMT

Quantum Basics

Classical States and Probability Vectors

A classical state describes the exact condition of a system; it is definite and predictable.

Let’s make this precise. Suppose the system is called X, and let Σ denote the set of classical states of X. We assume that Σ is finite and non-empty.

Uncertainty about a classical system is represented by a probability distribution over Σ, whose values are always non-negative real numbers. Quantum states, by contrast, are described by complex probability amplitudes, and only their squared magnitudes correspond to non-negative classical probabilities.

Example of Classical State

If X is a bit, then Σ = {0, 1}

If X is a six-sided die then Σ = {1, 2, 3, 4, 5, 6}

Often in information processing, our knowledge is uncertain. To represent these uncertainties, we can associate probabilities with different classical states, resulting in what we shall call a probabilistic state.

For example, suppose X is a bit and, based on experience, we believe that X is in state 0 with probability 3/4 and in state 1 with probability 1/4.

Probability Representation

Pr(X = 0) = 3/4

Pr(X = 1) = 1/4

A more succinct way to represent the probabilistic state is by a column vector:

[[3/4], [1/4]]

Any probabilistic state can be represented through a column vector satisfying:

1. All entries of the vector are nonnegative real numbers.

2. The sum of the entries is equal to 1.

Vectors of this form are called probability vectors.
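The two conditions can be checked mechanically. The snippet below is a minimal sketch; the function name `is_probability_vector` is made up for illustration, and exact fractions are used so that the sum can be compared with 1 without rounding error.

```python
from fractions import Fraction

def is_probability_vector(entries):
    """Return True if every entry is non-negative and the entries sum to exactly 1."""
    return all(e >= 0 for e in entries) and sum(entries) == 1

# The probabilistic state of the bit X from the example above.
state = [Fraction(3, 4), Fraction(1, 4)]

print(is_probability_vector(state))                              # True
print(is_probability_vector([Fraction(1, 2), Fraction(1, 4)]))   # False: sums to 3/4
print(is_probability_vector([Fraction(3, 2), Fraction(-1, 2)]))  # False: negative entry
```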

Measuring Probabilistic States

By measuring a system, we simply mean that we look at the system and recognize whichever classical state it is in without ambiguity. Intuitively speaking, we can’t see a probabilistic state of a system; when we look at it, we just see one of the possible classical states.

If we recognize that X is in the classical state a ∈ Σ, then the new probability vector representing our knowledge of the state of X becomes the vector having a 1 corresponding to a and 0 for all other entries. This vector indicates that X is in the classical state a with certainty, and we denote this vector by |a⟩, which is read as ‘ket a’.

|0⟩ = [[1], [0]]

|1⟩ = [[0], [1]]

[[3/4], [1/4]] = 3/4 |0⟩ + 1/4 |1⟩

The probabilistic states describe knowledge or belief, not necessarily something actual, and measuring merely changes our knowledge and not the system itself.
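One way to picture a measurement is as drawing a single classical state according to the probability vector and then replacing our knowledge with the matching ket. The sketch below assumes this sampling view; the names `ket` and `measure` are illustrative, and the seed is there only to make the run repeatable.

```python
import random

SIGMA = [0, 1]        # classical states of a bit
state = [0.75, 0.25]  # the probabilistic state 3/4 |0> + 1/4 |1>

def ket(a, sigma=SIGMA):
    """Standard basis vector |a>: 1 at the position of a, 0 everywhere else."""
    return [1.0 if s == a else 0.0 for s in sigma]

def measure(state, rng, sigma=SIGMA):
    """Draw one classical state using the probabilities; return it with the vector |a>."""
    outcome = rng.choices(sigma, weights=state, k=1)[0]
    return outcome, ket(outcome, sigma)

# The state is a weighted sum of the kets, as in the equation above.
combined = [0.75 * x + 0.25 * y for x, y in zip(ket(0), ket(1))]
print(combined == state)

rng = random.Random(0)  # seeded only for repeatability
outcome, updated_knowledge = measure(state, rng)
print(outcome, updated_knowledge)
```

After the call, `updated_knowledge` has a single 1 at the observed state, which reflects that only our knowledge changed.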

Joins in SQL

Kamal Singh Rathore — Sun, 17 Nov 2024 06:57:07 GMT

In this article we are going to cover one of the most important concepts in SQL: joins.

You can think of joins as the building blocks for fetching data from multiple tables based on common column values and, optionally, additional conditions.

We will go through the different types of joins, from basic to advanced.

To explain the joins, I am going to use two tables: one for students enrolled in the Maths course and one for students enrolled in the Arts course. There is also an Employees table for explaining the self join and the cross join.

The table structure is shown below, and each join comes with a Venn diagram for better understanding.

CREATE TABLE MathsEnrollments (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(100),
    EnrolledDate DATE
);


CREATE TABLE ArtsEnrollments (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(100),
    EnrolledDate DATE
);


INSERT INTO MathsEnrollments (StudentID, Name, EnrolledDate) VALUES
(1, 'Alice', '2024-01-10'),
(2, 'Bob', '2024-01-12'),
(3, 'Charlie', '2024-01-15'),
(4, 'David', '2024-01-18'),
(5, 'Eva', '2024-01-20'),
(6, 'Frank', '2024-01-22'),
(7, 'Grace', '2024-01-25'),
(8, 'Helen', '2024-01-28'),
(9, 'Ian', '2024-01-30'),
(10, 'Jack', '2024-02-02');


INSERT INTO ArtsEnrollments (StudentID, Name, EnrolledDate) VALUES
(3, 'Charlie', '2024-01-15'),
(5, 'Eva', '2024-01-20'),
(7, 'Grace', '2024-01-25'),
(8, 'Helen', '2024-01-28'),
(9, 'Ian', '2024-01-30'),
(11, 'Karen', '2024-02-05'),
(12, 'Liam', '2024-02-07'),
(13, 'Mia', '2024-02-10'),
(14, 'Nathan', '2024-02-12'),
(15, 'Olivia', '2024-02-15');



-- This table is used for the self join and cross join examples

CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Name VARCHAR(50),
    ManagerID INT,
    Department VARCHAR(50)
);

INSERT INTO Employees (EmployeeID, Name, ManagerID, Department) VALUES
(1, 'Alice', NULL, 'HR'),
(2, 'Bob', 1, 'IT'),
(3, 'Charlie', 1, 'IT'),
(4, 'David', 2, 'Finance'),
(5, 'Eve', 2, 'Finance');

Left Join

This type of join returns all records from the first (left) table and only the matching records from the second (right) table. Where there is no match, the columns of the second table are NULL.

SELECT * FROM MathsEnrollments M
LEFT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID

Right Join

This type of join returns all records from the second (right) table and only the matching records from the first (left) table. Where there is no match, the columns of the first table are NULL.

SELECT * FROM MathsEnrollments M
RIGHT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID

Inner Join

This type of join returns only the records that match in both tables.

SELECT * FROM MathsEnrollments M
INNER JOIN ArtsEnrollments A ON M.StudentID=A.StudentID

Full join

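This type of join returns all the records from both tables. Matching records are combined into one row, and records without a match get NULL in the columns of the other table.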

SELECT *  FROM MathsEnrollments M
FULL JOIN ArtsEnrollments A ON M.StudentID=A.StudentID

Cross join

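In this type of join, every record from the first table is paired with every record from the second table, so the result has (rows in the first table) × (rows in the second table) records. In the example below, the Employees table is cross joined with itself.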

SELECT E1.Name AS EmployeeName, E2.Name AS AnotherEmployee
FROM Employees E1
CROSS JOIN Employees E2;

Self join

When we join a table with itself, it is known as a self join. In the example below, each employee is matched with their manager.

SELECT E1.Name AS Employee, E2.Name AS Manager
FROM Employees E1
LEFT JOIN Employees E2
ON E1.ManagerID = E2.EmployeeID;
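To try the cross join and the self join without setting up a database server, here is a small sketch that runs both queries from Python against an in-memory SQLite database. Using SQLite is an assumption of this sketch; the article does not say which database it targets, and the column types are adapted to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Employees (
           EmployeeID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER, Department TEXT)"""
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [(1, "Alice", None, "HR"), (2, "Bob", 1, "IT"), (3, "Charlie", 1, "IT"),
     (4, "David", 2, "Finance"), (5, "Eve", 2, "Finance")],
)

# Cross join: every employee paired with every employee, 5 x 5 rows.
cross_rows = conn.execute(
    "SELECT E1.Name, E2.Name FROM Employees E1 CROSS JOIN Employees E2"
).fetchall()

# Self join: each employee next to their manager (None when there is no manager).
self_rows = conn.execute(
    """SELECT E1.Name, E2.Name FROM Employees E1
       LEFT JOIN Employees E2 ON E1.ManagerID = E2.EmployeeID
       ORDER BY E1.EmployeeID"""
).fetchall()

print(len(cross_rows))  # 25
for employee, manager in self_rows:
    print(employee, "->", manager)
```

Alice has no manager, so the LEFT JOIN keeps her row with `None` in the manager column.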

Below are some other variations of joins in SQL

Left join excluding inner join

This is a variation of the left join: it returns only the records from the first table that have no match in the second table.

SELECT * FROM MathsEnrollments M
LEFT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID
WHERE A.StudentID IS NULL
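The difference between the left join, the inner join, and the left join excluding the inner join is easiest to see side by side. The sketch below uses an in-memory SQLite database (an assumption, as the article does not name a database) and only a few of the sample students, without the EnrolledDate column, to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MathsEnrollments (StudentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ArtsEnrollments  (StudentID INTEGER PRIMARY KEY, Name TEXT);
INSERT INTO MathsEnrollments VALUES (1, 'Alice'), (3, 'Charlie'), (5, 'Eva');
INSERT INTO ArtsEnrollments  VALUES (3, 'Charlie'), (5, 'Eva'), (11, 'Karen');
""")

def run(join_clause):
    """Run the same SELECT with a different join clause (helper for this sketch only)."""
    sql = f"""SELECT M.StudentID, A.StudentID
              FROM MathsEnrollments M {join_clause}
              ORDER BY M.StudentID"""
    return conn.execute(sql).fetchall()

on = "ArtsEnrollments A ON M.StudentID = A.StudentID"
left_rows = run(f"LEFT JOIN {on}")
inner_rows = run(f"INNER JOIN {on}")
left_only = run(f"LEFT JOIN {on} WHERE A.StudentID IS NULL")

print(left_rows)   # [(1, None), (3, 3), (5, 5)]
print(inner_rows)  # [(3, 3), (5, 5)]
print(left_only)   # [(1, None)]
```

Student 1 is enrolled only in Maths, so it is the one row that survives the `IS NULL` filter.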

Right join excluding inner join

This is a variation of the right join: it returns only the records from the second table that have no match in the first table.

SELECT * FROM MathsEnrollments M
RIGHT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID
WHERE M.StudentID IS NULL

Full outer join, excluding inner join

This type of join returns only the records that exist in exactly one of the two tables and ignores the records common to both.

SELECT *  FROM MathsEnrollments M
FULL JOIN ArtsEnrollments A ON M.StudentID=A.StudentID
WHERE A.StudentID IS NULL OR M.StudentID IS NULL

Try these examples yourself to understand joins better. I hope you liked the article.

LangChain

Kamal Singh Rathore — Tue, 12 Nov 2024 13:09:56 GMT

Hi everyone. In this article we are going to cover LangChain and its important concepts.

Let’s start with the most basic question: what is LangChain, and why do we need it?

LangChain is one of the more recent innovations in the field of AI. It was first introduced in October 2022 by Harrison Chase. It is an open-source framework for developing applications that use LLMs (Large Language Models). It serves as a generic interface for nearly any LLM, so we can easily build LLM applications and integrate them with external software workflows and data sources.

LangChain has different components that simplify application development by encapsulating one or more constituent steps into a single component.

These components help us work with LLMs. Below is the list of the main components.

Prompt Templates

Prompts are the guiding force that helps an LLM generate output in a particular manner. You can think of a prompt as a set of instructions that tells the model what we want and in which format the output should be. The PromptTemplate class in LangChain helps us create prompts for the LLM without hard-coding them.
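To make the idea concrete, here is a plain-Python sketch of what a prompt template does: it keeps the fixed instructions in one place and fills in the variable parts at call time. The class `SimplePromptTemplate` is a hypothetical stand-in written for this illustration, not LangChain's implementation.

```python
from string import Formatter

class SimplePromptTemplate:
    """Hypothetical stand-in for a prompt template; not LangChain's implementation."""

    def __init__(self, template):
        self.template = template
        # Collect the {placeholder} names that appear in the template text.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **values):
        """Fill every placeholder, failing loudly if one is missing."""
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing prompt variables: {missing}")
        return self.template.format(**values)

prompt = SimplePromptTemplate(
    "You are a chatbot expert in question answering. Answer the question: {human_input}"
)
print(prompt.input_variables)
print(prompt.format(human_input="Where is the headquarters of Tata Motors?"))
```

The same template can be reused for any question, which is the point of not hard-coding the prompt.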

Chains

As the name suggests, chains help to chain together the different functionalities of LangChain. As a simple example, suppose you have created a prompt and you want to run it against an LLM: you can use a chain to run it. Through chains we can also pass the output of one model as the input to the same or a different model, depending on the need. Available chains include LLMChain, SimpleSequentialChain, etc.

Indexes

An LLM needs access to external data for certain domain-specific tasks. These external data sources can be anything: a PDF, a CSV file, a database, etc. LangChain collectively refers to these data sources as ‘indexes’.

We can process this data with a text splitter and store it in a vector database, and later retrieve data from the vector database according to our need.

Memory

An LLM does not have memory of its own: it can’t remember the user's past conversation. To deal with this problem, LangChain provides memory functionality, which helps the model remember the context of the conversation. We can choose whether to retain the entire conversation or keep only a summary of it.
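The buffer style of memory can be pictured as a list of past turns that is pasted back into every new prompt. The sketch below shows that idea in plain Python; `SimpleBufferMemory` and `fake_llm` are hypothetical names for this illustration, and `fake_llm` is a placeholder rather than a real model call.

```python
class SimpleBufferMemory:
    """Hypothetical conversation buffer; illustrates the idea, not LangChain's class."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs

    def save(self, human, ai):
        self.turns.append(("Human", human))
        self.turns.append(("AI", ai))

    def history(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)

    def clear(self):
        self.turns = []

TEMPLATE = "Previous conversation:\n{chat_history}\nNew human question: {question}\nResponse:"

def fake_llm(prompt):
    """Placeholder for a real model call; it only reports the prompt size."""
    return f"(reply to a prompt of {len(prompt)} characters)"

memory = SimpleBufferMemory()
for question in ["Hi, I am Alice.", "What is my name?"]:
    # Every prompt carries the whole conversation so far.
    prompt = TEMPLATE.format(chat_history=memory.history(), question=question)
    answer = fake_llm(prompt)
    memory.save(question, answer)

print(prompt)
```

The second prompt contains the first question and its reply, which is how the model gets to "remember" the earlier turn.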

Agents

Agents are reasoning engines that can automatically choose which actions to take. The concept is similar to a chain, in which the model performs a set of tasks before generating output. The difference is that an agent uses the model as a reasoning engine that decides on its own which tasks to perform and in which order.

Tools

Tools are the interfaces that an agent or chain can use to interact with real-world entities in order to expand or improve its knowledge base. With tools, we can do many things, such as searching the internet or performing mathematical operations.

Below is an example that uses an LLM from Hugging Face, a prompt template, memory, and a vector database.

from getpass import getpass

HUGGINGFACE_API_TOKEN = getpass()
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = HUGGINGFACE_API_TOKEN

#Importing Necessary libraries

from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters.character import RecursiveCharacterTextSplitter
from langchain_core.prompts import PromptTemplate
from langchain.chains import RetrievalQA,LLMChain
from langchain_chroma import Chroma
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings
from langchain_huggingface.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.memory import ConversationBufferMemory
from langchain_huggingface import HuggingFaceEndpoint
from langchain_community.vectorstores import FAISS
import faiss


#Using Web Base loader to get information from the Wiki Website
loader = WebBaseLoader(web_path="https://en.wikipedia.org/wiki/Tata_Motors" )

docs = []

doc_lazy = loader.lazy_load()

for doc in doc_lazy:
    docs.append(doc)

import re

pattern = re.compile('\n+')
patterntab = re.compile('\t+')

docs[0].page_content = re.sub(pattern, ' ',docs[0].page_content)
docs[0].page_content = re.sub(patterntab, ' ',docs[0].page_content)

#Using Character Splitter
splitter = RecursiveCharacterTextSplitter(
    separators=['.'],
    chunk_size=400,
    chunk_overlap=50,
)

document = splitter.split_documents(docs)
len(document[0].page_content)

#Making first model 
model = "sentence-transformers/all-mpnet-base-v2"
embeddings = HuggingFaceInferenceAPIEmbeddings(
    api_key=HUGGINGFACE_API_TOKEN, model_name=model )
prompt = PromptTemplate(
    template="You are a Chatbot expert in Question Answering. answer the given question {human_input}", input_variables=["human_input"])

repo_id = "mistralai/Mistral-7B-Instruct-v0.2"

llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_length=128,
    temperature=0.5,
    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN,
)

#Using Chroma Vector Database
database = Chroma.from_documents(document, embeddings)

retriver  = database.as_retriever(search_type="similarity", search_kwargs={"k": 4})

from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

chain = prompt | llm | output_parser

qa_chain_chroma = RetrievalQA.from_llm(llm=chain,retriever=retriver)

human_input = 'Where is the headquarters of Tata Motors?'

query = 'Where is the headquarters of Tata Motors?'

qa_chain_chroma.run(human_input)
try:
    result = qa_chain_chroma.run(query)
    print(result)
except Exception as e:
    print(f"Error occurred: {e}")

#Using a second LLM, with a model loaded from Hugging Face
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-large-squad2")

model = AutoModelForCausalLM.from_pretrained("deepset/roberta-large-squad2")

repo_id = "deepset/roberta-large-squad2"

llm_robert = HuggingFaceEndpoint(
    repo_id=repo_id,
    max_length=128,
    temperature=0.5,
    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN,
)

import torch

#Creating Tokens
def embed_text(texts):
    inputs = tokenizer(texts,padding = True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        embeddings =model.roberta(**inputs).last_hidden_state.mean(dim=1)
    return embeddings

from langchain.embeddings.base import Embeddings

#Creating Embedding
class HuggingFaceEmbeddings(Embeddings):
    def embed_documents(self, texts):
        embeddings = embed_text(texts)
        return embeddings.numpy().tolist()  # Convert to list of lists
    def embed_query(self, query):
        embedding = embed_text([query])  # Get the embedding for the query
        return embedding.numpy().flatten().tolist()  # Flatten to 1D list

hr_embeddings = HuggingFaceEmbeddings()

newbd = FAISS.from_documents(document, hr_embeddings)

n=2

retriver_faiss  = newbd.as_retriever(search_kwargs={"k": n})

llm_chain = LLMChain(llm=llm_robert, prompt=prompt,output_parser=output_parser)

qa_chain_faiss = RetrievalQA.from_llm(llm=llm_chain,retriever=retriver_faiss)

qa_chain_faiss.run(query)



#Below is the example of LLM Chain with memory

template = """You are a nice chatbot having a conversation with a human.
Previous conversation:
{chat_history}
New human question: {question}
Response:"""

promptnew = PromptTemplate.from_template(template)

memory = ConversationBufferMemory(memory_key="chat_history")

conversation = LLMChain(
    llm=llm,
    prompt=promptnew,
    verbose=True,
    memory=memory
)

while True:
    user_input = input()
    if user_input =="quit":
        print('It was a great conversation')
        break
    elif user_input=="clear memory":
        print('memory cleaning')
        memory.clear()
    else:
        text = conversation({"question": user_input})
        print(text['text'])

ROC-AUC Guide

Kamal Singh Rathore — Mon, 28 Oct 2024 01:59:34 GMT

Hi everyone. In this article we are going to cover the ROC curve: what it is, how to use it, and how to implement it.

Let's start with the question of what ROC is and what its use cases are.

ROC stands for Receiver Operating Characteristic. The ROC curve is an evaluation tool traditionally designed for binary classification problems, used to evaluate the performance of a classification model. It is a graph of the True Positive Rate against the False Positive Rate at different threshold values.

Let's understand what the True Positive Rate and the False Positive Rate are.

True Positive Rate (TPR), also known as Recall or Sensitivity: this metric measures how many of the actual positive cases in the data our model correctly predicts as positive. TPR = TP / (TP + FN).

False Positive Rate (FPR): this metric measures the proportion of actual negative cases that are wrongly classified as positive. FPR = FP / (FP + TN).

There is one other important thing that we need to know: the “threshold”.

We can think of the threshold as a cutoff. If the probability output by the model is at or above the threshold, we classify the case as positive; otherwise we classify it as negative. With machine learning algorithms such as Logistic Regression, Random Forest Classifier, and SVM, we can use this threshold to assign the output to different classes.
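To see how the threshold changes the two rates, here is a small pure-Python sketch. The labels and scores are made up for illustration, and the function name `rates_at_threshold` is not from any library.

```python
def rates_at_threshold(y_true, y_score, threshold):
    """Return (TPR, FPR) when scores at or above the threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)

# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for threshold in (0.5, 0.75):
    tpr, fpr = rates_at_threshold(y_true, y_score, threshold)
    print(threshold, round(tpr, 3), round(fpr, 3))
# 0.5 1.0 0.333
# 0.75 0.667 0.0
```

Raising the threshold removes the false positive, but it also misses one of the actual positives.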

The ROC curve is generated by varying the threshold from 0 to 1, calculating the TPR and FPR for each threshold value, and plotting these points on a graph.

The X axis represents the FPR and the Y axis represents the TPR.

Let's look at one more important term: AUC, which stands for Area Under the ROC Curve. It measures the overall performance of the model by calculating the area under the ROC curve. The greater the area under the curve, the better the model performance.

It ranges from 0 to 1.

AUC = 1 means the model ranks every positive case above every negative case, so some threshold separates the two classes perfectly.

AUC = 0.5 means the model cannot distinguish between the positive and negative classes. It is no better than a random model.

AUC < 0.5 means the model is worse than a random classifier.
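The curve and its area can also be computed by hand. The following sketch sweeps the threshold down through the scores, collects the (FPR, TPR) points, and sums the area with the trapezoid rule. It assumes that all scores are distinct (ties would need extra handling), and the data is made up.

```python
def roc_points(y_true, y_score):
    """ROC points obtained by lowering the threshold past each score in turn."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    # Visit the samples from the highest score to the lowest.
    for label, _ in sorted(zip(y_true, y_score), key=lambda pair: -pair[1]):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the given (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, y_score)
print(round(auc_trapezoid(points), 3))                                # 0.889
print(auc_trapezoid(roc_points([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])))  # 1.0
```

In the first case one negative (0.7) outranks one positive (0.6), so 8 of the 9 positive-negative pairs are ordered correctly and the area is 8/9.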

Let's look at an example of how to implement it.

In this example we are not trying to achieve a high-performing model; our focus is on producing the graph.



import pandas as pd

train = pd.read_csv("/kaggle/input/binary-classification-bank-churn-dataset-cleaned/train_cleaned.csv")


from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


X= train[['Gender','Balance','NumOfProducts','IsActiveMember','Geography_France','Geography_Germany','Geography_Spain','Age_bin']].values.tolist()

#Use a 1-D target so that sklearn does not warn about a column vector
Y = train['Exited'].values.tolist()


X_train , X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.7)

lr = LogisticRegression()

lr.fit(X_train,Y_train)

ypred = lr.predict(X_test)


from sklearn.metrics import classification_report

#Classification Report
print(classification_report(Y_test,ypred))

Implementation

from sklearn.metrics import roc_curve,auc,roc_auc_score

import matplotlib.pyplot as plt

y_prob = lr.predict_proba(X_test)[ :, 1]

#getting the FPR, TPR at different Threshold
fpr, tpr, threshold = roc_curve(Y_test,y_prob)

#Calculating Area under the Curve
roc_auc = roc_auc_score(Y_test,y_prob)


plt.figure()

#Plotting Roc Curve Lines
plt.plot(fpr, tpr, color='blue', lw=2 , label='ROC Curve (area = %0.2f)' % roc_auc)
plt.plot([0,1],[0,1], color='red',lw=2,linestyle='--')

plt.xlim([0.0,1.0])
plt.ylim([0.0,1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve')
plt.legend(loc="lower right")
plt.show()

If the area under the curve is low, we may need to do feature engineering on the dataset, tune the hyperparameters, or try different models.

Similarity Metrics

Kamal Singh Rathore — Wed, 09 Oct 2024 04:02:48 GMT

Hi everyone. We are going to cover the most important similarity metrics in machine learning, with examples.

A similarity metric is a measure of how similar two data points are in a vector space, where different dimensions represent different attributes of the data. Most of the metrics below are based on the distance between the data points: the smaller the distance, the more similar the data points, and vice versa.

It is mostly used in clustering algorithms for calculating the distance between data points, which helps in dividing the data into different clusters. In NLP, we use it to calculate the similarity between vectors in a high-dimensional space. In dimensionality reduction algorithms, we use similarity metrics to preserve the relationships between points when moving from a higher to a lower dimension. It is also used in algorithms like K-Nearest Neighbours for finding the nearest points to a given point and making predictions based on them.

Note: I have included examples based on NumPy and sklearn.

Different Similarity Metrics

1: Cosine Similarity :

Cosine similarity is a measure of similarity between two non-zero vectors in an inner product space, using the cosine of the angle between them as the measure of similarity. It is the dot product of the vectors divided by the product of their lengths.

Cosine similarity formula:

cos(θ) = (A · B) / (||A|| ||B||)

where A and B are the vectors,

A · B is the dot product of the vectors, and

||A|| and ||B|| are the magnitudes of A and B.

Examples

import numpy as np # linear algebra
A = [2, 3, 4]
B = [1, 0, 5]

AB = np.dot(A,B)


Anorm = np.linalg.norm(A)

Bnorm = np.linalg.norm(B)

CosSim = round(AB / (Anorm * Bnorm ),3)

print(CosSim)

Output

0.801

from sklearn.metrics.pairwise import cosine_similarity

A = np.array(A).reshape(1,-1)

B = np.array(B).reshape(1,-1)

Cos_sim = cosine_similarity(A,B)

print(round(Cos_sim[0][0],3))

Output

0.801

2: Euclidean Distance Matrix:

It measures the straight-line distance between two data points in Euclidean space, using the coordinates of the data points.

Euclidean formula: d(A, B) = √( Σᵢ (Aᵢ − Bᵢ)² ), where the sum runs over all dimensions.

Examples

A = np.array([2, 3, 4])
B = np.array([1, 0, 5])

np_Euclidean = np.linalg.norm(A - B)

print(round(np_Euclidean,3))

Output

3.317

from sklearn.metrics.pairwise import euclidean_distances

A = np.array(A).reshape(1,-1)

B = np.array(B).reshape(1,-1)

sk_euclidean = euclidean_distances(A,B)

print(round(sk_euclidean[0][0],3))

Output

3.317

Array = np.array([[1,2,3],
                 [4,5,6],
                 [2,5,7]])

sk_euclidean = euclidean_distances(Array)

print(sk_euclidean)

Output

[[0. 5.19615242 5.09901951]
[5.19615242 0. 2.23606798]
[5.09901951 2.23606798 0. ]]

3: Manhattan (L1) Distance Matrix :

It measures the distance between data points in a vector space as the sum of the absolute differences of their coordinates, as if moving along grid lines. It is also called the L1 or taxicab distance.

Formula: d(A, B) = Σᵢ |Aᵢ − Bᵢ|

Examples

from sklearn.metrics.pairwise import manhattan_distances

Array = np.array([[1,2,3],
                 [4,5,6],
                 [2,5,7]])

sk_manhattan = manhattan_distances(Array)

print(sk_manhattan)

Output

[[0. 9. 8.]
[9. 0. 3.]
[8. 3. 0.]]

A = np.array([2,3,6])
B = np.array([5,7,8])


np_manhattan = np.sum(np.abs(A-B))

print(np_manhattan)

Output

9

A = np.array([2,7,6]).reshape(1,-1)
B = np.array([5,7,8]).reshape(1,-1)

sk_manhattan = manhattan_distances(A,B)

print(sk_manhattan[0][0])

Output

5.0

4 : Minkowski Distance :

This metric is a generalized form of both the Euclidean distance and the Manhattan distance.

Formula: D(A, B) = ( Σᵢ |Aᵢ − Bᵢ|^p )^(1/p), with the sum over i = 1, ..., n

Explanation

A and B are vectors

n is the number of dimensions

p defines the type of distance

if p = 1, it is the Manhattan distance

if p = 2, it is the Euclidean distance

Examples

from sklearn.metrics import pairwise_distances

A = np.array([2,7,6]).reshape(1,-1)
B = np.array([5,7,8]).reshape(1,-1)

sk_minkowski_manh = pairwise_distances(A,B,metric='minkowski',p=1)

sk_minkowski_euc = pairwise_distances(A,B,metric='minkowski',p=2)

print(round(sk_minkowski_manh[0][0],3))

print(round(sk_minkowski_euc[0][0],3))

Output

5.0
3.606

Array = np.array([[1,2,3],
                 [4,5,6],
                 [2,5,7]])

sk_minkowski_manh = pairwise_distances(Array,metric='minkowski',p=1)

sk_minkowski_euc = pairwise_distances(Array,metric='minkowski',p=2)

print(sk_minkowski_manh)

print(sk_minkowski_euc)

Output

[[0. 9. 8.]
[9. 0. 3.]
[8. 3. 0.]]

[[0. 5.19615242 5.09901951]
[5.19615242 0. 2.23606798]
[5.09901951 2.23606798 0. ]]

A = np.array([2,7,6]).reshape(1,-1)
B = np.array([5,7,8]).reshape(1,-1)


def minkowski(A,B,P):
    return np.power(np.sum(np.abs(A-B)**P),1/P)

np_minikowski_manh = minkowski(A,B,1) 


np_minikowski_euc = minkowski(A,B,2) 

print(np_minikowski_manh)

print(round(np_minikowski_euc,3))

Output

5.0
3.606

5 : Hamming Distance :

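It measures the distance between two strings (or sequences) of equal length by counting the number of positions at which they differ. Note that sklearn's 'hamming' metric returns the proportion of differing positions rather than the raw count, which is why the outputs below lie between 0 and 1.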

from sklearn.metrics import pairwise_distances
import numpy as np

# Define the array
Array = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [2, 5, 7]])

# Compute pairwise Hamming distance
hamming_distance_sklearn = pairwise_distances(Array, metric='hamming')

print("Hamming Distance Matrix using sklearn:\n", hamming_distance_sklearn)



A = np.array([1,4,5]).reshape(1,-1)
B = np.array([1,2,5]).reshape(1,-1)

hamming_distance_sklearn = pairwise_distances(A,B, metric='hamming')

print("Hamming Distance Matrix using np:\n", round(hamming_distance_sklearn[0][0],3))

Output

Hamming Distance Matrix using sklearn:
[[0. 1. 1. ]
[1. 0. 0.66666667]
[1. 0.66666667 0. ]]

Hamming Distance Matrix using np:
0.333
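Since the definition is stated in terms of strings, a count-based version in plain Python may help connect it to the proportions above. This is a minimal sketch; the function name `hamming_distance` is illustrative and the example strings are made up.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("karolin", "kathrin"))                # 3
print(hamming_distance([1, 4, 5], [1, 2, 5]))                # 1
print(round(hamming_distance([1, 4, 5], [1, 2, 5]) / 3, 3))  # 0.333
```

Dividing the count by the sequence length gives the same 0.333 as the proportion-based result for the same pair of vectors.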


These are the most important similarity metrics.

Clustering in Machine Learning

Kamal Singh Rathore — Fri, 04 Oct 2024 05:54:05 GMT

Hi everyone. Here we are going to cover one of the most important topics in machine learning: clustering.

Let's start with the question of what clustering is and why we need it.

In layman's terms, clustering is the process of grouping data together into clusters on the basis of one or more common features.

Definition: Clustering is an unsupervised machine learning technique used to form groups of homogeneous data from a dataset of heterogeneous data. This approach is different from regression and classification because here we are not trying to predict an output value based on input values; we are forming groups based on similarity. We use different metrics such as Euclidean distance, cosine similarity, and Manhattan distance to calculate the similarity between data points. If you want to read more about similarity metrics, see the Similarity Metrics article.

We can use clustering in different tasks like customer segmentation, social media analysis, recommendation engines, market analysis, etc.

Now let's look at the different types of clustering and the important methods in each.

First, we will import the necessary libraries and make a dummy dataset.

from sklearn import datasets

#Dataset
sample = datasets.make_circles(n_samples=900,noise=0.9,random_state=80,shuffle=False)
X = sample[0]
Y = sample[1]

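#Necessary imports
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift, OPTICS, DBSCAN, Birch, KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture

#Colours used by the plots below (assumed; the original colour list is not shown)
colors = [f"C{i}" for i in range(10)]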

1 : Partition Based Clustering :

In this type of clustering we pass the number of clusters we want, and the algorithm uses a similarity metric such as Euclidean distance, Manhattan distance, or cosine similarity to divide the data into that many clusters.

Methods in Partition based Clustering

A : K-Means :

This method works by measuring the similarity between data points. We pass k, the number of clusters we want, and the algorithm automatically assigns each data point to a cluster based on its distance from the cluster centroids, then recomputes the centroids and repeats. This algorithm is sensitive to outliers.

To find the optimum number of clusters, we can use the elbow method. It works by calculating the WCSS (Within-Cluster Sum of Squares), that is, the sum of squared distances of the data points from their cluster centroid, for different numbers of clusters, plotting these values on a line graph, and finding the elbow point of the graph.
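Before turning to sklearn, the assignment and update steps and the WCSS can be sketched in plain Python on a tiny made-up dataset. This is a simplified illustration with caller-supplied starting centroids, not how sklearn's KMeans is implemented; the function names are chosen for this sketch.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) with caller-supplied starting centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if not members:  # keep the centroid of an empty cluster in place
                new_centroids.append(centroids[c])
                continue
            new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        if new_centroids == centroids:  # nothing moved, so the result is stable
            break
        centroids = new_centroids
    return labels, centroids

def wcss(points, labels, centroids):
    """Within-cluster sum of squared distances to the assigned centroid."""
    return sum(math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))

# Two obvious groups of three points each (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

labels1, centroids1 = kmeans(points, [points[0]])
labels2, centroids2 = kmeans(points, [points[0], points[3]])

print(labels2)  # [0, 0, 0, 1, 1, 1]
print(round(wcss(points, labels1, centroids1), 3), round(wcss(points, labels2, centroids2), 3))
```

The WCSS drops sharply when going from one cluster to two, which is the kind of bend the elbow method looks for.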

Code here

wss = []
ks = range(1,40)
for i in ks:
    km = KMeans(n_clusters=i,max_iter=5)
    km.fit(X)
    wss.append(km.inertia_)

plt.figure(figsize=(12,4))
#Plot WCSS against the actual number of clusters (starting at 1, not 0)
plt.plot(ks, wss)
plt.xlabel('No of clusters')
plt.ylabel('WCSS')
plt.title('Elbow Plot')
plt.show()

WCSS Plot

For this example, we will choose 10 as number of clusters to avoid complexity.

kme = KMeans(n_clusters=10)
y = kme.fit_predict(X)

plt.figure(figsize=(15,6))
for i in range(0,len(set(y))):
    plt.scatter(X[y==i,0],X[y==i,1],color=colors[i],label=f'Cluster {i}')
    
plt.title('KMeans')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

K Mean

Another important algorithm you can check:

B : K-Medoids

2 : Density Based Clustering Algorithm:

This method works by finding the high-density regions in the data and grouping the data points in each region into a cluster. We don’t need to provide the number of clusters; the algorithm decides it on its own.

Methods in Density Based Clustering

A : DBSCAN :

It stands for Density-Based Spatial Clustering of Applications with Noise. It is based on the principle that clusters are dense regions in a dataset separated by sparse regions; because of that, clusters need not be of similar size. Furthermore, it works well on datasets containing noise. Epsilon (eps) and min_samples are the two important parameters of this algorithm.
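The role of the two parameters can be shown with the core-point test that DBSCAN builds on: a point is a core point when at least min_samples points, itself included, lie within eps of it. The sketch below implements only this test on made-up data, not the full clustering; the function name is illustrative.

```python
import math

def core_point_flags(points, eps, min_samples):
    """Flag each point that has at least min_samples points (itself included) within eps."""
    flags = []
    for p in points:
        neighbours = sum(1 for q in points if math.dist(p, q) <= eps)
        flags.append(neighbours >= min_samples)
    return flags

# Four points in a row plus one far-away point (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0), (5.0, 5.0)]

print(core_point_flags(points, eps=0.15, min_samples=3))
# [False, True, True, False, False]
```

The two middle points are core points. The two end points lie within eps of a core point, so DBSCAN would attach them to the cluster as border points, while the far-away point would be labelled as noise.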

Code here

dbscan = DBSCAN(eps=0.3,metric='manhattan',min_samples=20)
y = dbscan.fit_predict(X)

plt.figure(figsize=(14,6))
for i in sorted(set(y)):  #label -1 marks noise points
    plt.scatter(X[y==i,0],X[y==i,1],label=f'Cluster {i}')
    
plt.title('DBSCAN')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

DBSCAN

B : Mean Shift :

This is a non-parametric algorithm. We don’t need to pass the number of clusters we want; it decides on its own. It is also known as a mode-seeking algorithm. It works by iteratively shifting each candidate cluster center towards the mean of the points around it, and so towards the densest area of the data. To estimate the density, it uses a kernel function.

Code here

Ms = MeanShift(bin_seeding=True,bandwidth=0.8)
y = Ms.fit_predict(X)

plt.figure(figsize=(14,6))
for i in range(0,len(set(y))):
    plt.scatter(X[y==i,0],X[y==i,1],label=f'Cluster {i}')
    
plt.title('Mean Shift')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

Mean Shift

Another important algorithm you can check:

C : OPTICS

opt = OPTICS(min_samples=20,metric="euclidean",xi=0.02,eps=0.5)
y = opt.fit_predict(X)

plt.figure(figsize=(14,6))
for i in sorted(set(y)):  #label -1 marks noise points
    plt.scatter(X[y==i,0],X[y==i,1],label=f'Cluster {i}')
    
plt.title('OPTICS')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

Optics

3 : Connectivity Based Clustering (Hierarchical Clustering ):

This method forms a tree-like structure. Each data point starts as a separate cluster on the X axis, and clusters are then joined together based on their similarity to other clusters. The resulting tree is called a dendrogram.

A : Agglomerative Hierarchical Clustering:

It is a bottom-up approach: each data point starts as an individual cluster, and at each level the most similar clusters are merged, until a single cluster is formed at the top.

Code

gm = AgglomerativeClustering(n_clusters=4)
y = gm.fit_predict(X)

plt.figure(figsize=(12,5))
for i in range(0,len(set(y))):
    plt.scatter(X[y==i,0],X[y==i,1],color=colors[i],label=f'Cluster {i}')
    
plt.title('Agglomerative Clustering')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

Agg example

B : Divisive Hierarchical Clustering :

This method is the opposite of agglomerative clustering. You can think of it as a top-down approach: it starts with a single cluster and recursively splits it based on dissimilarity.

Another important algorithm you can check:

C : BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)

bir = Birch(threshold=0.4, branching_factor=50,n_clusters=5)
y = bir.fit_predict(X)

plt.figure(figsize=(12,5))
for i in range(0,len(set(y))):
    plt.scatter(X[y==i,0],X[y==i,1],label=f'Cluster {i}')
    
plt.title('Birch')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

Birch

4 : Distribution Based Clustering :

This clustering approach takes a totally different measure into consideration: probability. It considers the probability of a data point belonging to a probability distribution. The greater the distance of a data point from the center of a cluster, the lower the chance that the data point belongs to that cluster.

A : Gaussian Mixture Models (GMM) :

This clustering method assumes the data is generated from a mixture of Gaussian distributions. The probability of a data point belonging to a cluster depends on its distance from the center of the cluster: the greater the distance, the lower the chance. It is a clustering technique based on statistical inference.
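The "probability of belonging" can be computed directly once the components are known. The sketch below assumes two one-dimensional Gaussian components with made-up weights, means, and standard deviations, and computes the posterior probability of each component for a few points. It shows only this membership step, not the fitting that GaussianMixture performs.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def responsibilities(x, components):
    """Posterior probability of each (weight, mean, std) component for the point x."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [value / total for value in weighted]

# Two equally weighted clusters centred at 0 and 4 (made-up parameters).
components = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]

for x in (0.0, 2.0, 4.0):
    print(x, [round(r, 4) for r in responsibilities(x, components)])
```

A point at the center of a cluster belongs to it almost surely, while the point halfway between the two centers is split evenly between them.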

gm = GaussianMixture(n_components=4, covariance_type='diag', random_state=42)
y = gm.fit_predict(X)

plt.figure(figsize=(12,5))
for i in range(0,len(set(y))):
    plt.scatter(X[y==i,0],X[y==i,1],label=f'Cluster {i}')
    
plt.title('Gaussian Mixture')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc="upper right")
plt.show()

Gauss
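Because GMM is probabilistic, each point gets a probability for every cluster, not just one label. A small sketch of this, assuming synthetic data from make_blobs (the variables X_demo, gmm, probs and hard are illustration names, not part of the example above):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X_demo, _ = make_blobs(n_samples=300, centers=4, random_state=42)
gmm = GaussianMixture(n_components=4, covariance_type='diag', random_state=42).fit(X_demo)

probs = gmm.predict_proba(X_demo)   # soft assignment, shape (n_samples, n_components)
hard = gmm.predict(X_demo)          # hard assignment: the most likely component

print(probs[:3].round(3))           # each row sums to 1
print(hard[:3])
```

The hard label from predict is simply the component with the highest probability in the corresponding predict_proba row.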

So we have covered the important clustering methods with examples.

Hope you have read and liked it.

Leaving Behind Yourself

Kamal Singh Rathore — Sun, 29 Sep 2024 04:11:23 GMT

Hi everyone. Recently I came across a question: “What are you leaving behind for the sake of growth? Is it worth it, or is staying where you are the better option?”

When you think about situations where you have the opportunity to move to a new city or country, there is always a question in your mind: is it worth it? Am I doing something wrong? Should I stay here with my family and friends and try to do things within the town where I grew up and know everything about? Will I be able to survive in the new environment?

The other side is that if you don’t move to a different town or country, you won’t be able to gain much-needed experience in life. Being on the road, always moving, makes you feel you are learning something, making new connections and growing. It helps you deal with your weaknesses; it makes you stronger, tougher and smarter. You learn to live by yourself, and you get the confidence you need.

But leaving everything behind is more of an emotional and mental journey, because you are not just leaving a city or a place: you are leaving behind the memories you have created over the years, and you are leaving behind your loved ones. You will be alone in a city of a million people. Even when you feel like talking to someone, you won’t have anyone close to you. There might be times when you sit alone in your room for days, wondering, “What am I doing?”

This choice seems easy for someone who is young. Hot blood full of passion and energy is moving through your veins, and you decide to move out. You want to experience the freedom of living alone; you want a life where no one is controlling you or affecting your decisions. But with this freedom comes responsibility. It is not easy to live alone: you have to pay the price, and you have to learn and adapt faster, because in every new city you go to, you will find people who are ready to rob you or take advantage of you.

But when you grow older and start to understand life and how things work, you get to know the importance of family and friends, because these are not just titles we give to anyone. These are the people who understand us and will be there for us when we need them in a critical situation.

We can’t say one choice is better than the other. If you don't move to a new place, leave your past behind and take on new challenges, you won’t be able to learn new things. Taking on new challenges in life is important. It will deepen your understanding of life and make you stronger and smarter. You also learn how to build relationships.

As you grow older, you will understand: taking risks is fun, but having a stable and peaceful life is more important, because whatever we do in life, the end goal for everyone is a fulfilled life, and this is only achieved when we have calmness in our lives.

This cycle of leaving yourself behind is like the cycle of destruction and regeneration. We only get to learn what we want when we have nothing in life, and leaving yourself behind is part of that self-search. So you have to decide when you are ready to take risks in your life and when you are ready to choose a calm path and enjoy everything around you.

Love whatever you have because once it is gone you won’t get it again.

NLP Vectorization and Embedding story

Kamal Singh Rathore — Wed, 25 Sep 2024 05:32:51 GMT

Hi everyone. We are going to cover one of the most important topics in NLP: Vectorization, which you may also know as Word Embedding. These methods serve the same purpose, but the techniques they use are different. We are going to cover the different techniques and show code snippets on how to implement them.

First, let's start by understanding what Vectorization and Embedding are.

Vectorization and Embedding are two sides of the same coin. In Vectorization, we represent words with numbers. We can think of Vectorization as the set of statistical methods for converting words into numbers; it involves simple and easy-to-understand techniques like Bag of Words, TF-IDF, etc.

Embedding is a more advanced approach: we convert each word into a dense vector that represents its meaning, so that words that are similar to each other are closer in vector space. To get embeddings, we can either use pretrained embeddings (for example, Word2Vec vectors trained with CBOW or Skip-gram) or train our own custom embeddings.
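"Closer in vector space" is usually measured with cosine similarity. Here is a minimal sketch, assuming three made-up 3-dimensional vectors (real embeddings have tens to hundreds of dimensions, and these numbers are hypothetical, not taken from any trained model):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, invented purely for illustration
vectors = {
    "doctor": np.array([0.9, 0.8, 0.1]),
    "nurse":  np.array([0.8, 0.9, 0.2]),
    "laptop": np.array([0.1, 0.2, 0.9]),
}

sim_close = cosine_similarity(vectors["doctor"], vectors["nurse"])
sim_far = cosine_similarity(vectors["doctor"], vectors["laptop"])
print(round(sim_close, 3), round(sim_far, 3))
```

With a good embedding, related words such as "doctor" and "nurse" should score noticeably higher than unrelated ones.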

Now we are going to check the code for different techniques that we can use.

The dataset for the examples below is:

import pandas as pd

# Define some categories
categories = ['Technology', 'Science', 'Business', 'Health', 'Education']

# Create a few short example sentences
sentences = [
    "Artificial intelligence is transforming industries by automating complex processes.",
    "The study of quantum mechanics continues to challenge our understanding of the physical world.",
    "Startups are reshaping the financial industry with innovative fintech solutions.",
    "Nutrition and exercise are key components of maintaining good health and well-being.",
    "Online learning platforms are revolutionizing education, making knowledge accessible to all.",
    "5G technology is expected to significantly enhance mobile network speeds and connectivity.",
    "Advances in genetic engineering could lead to breakthrough treatments for inherited diseases.",
    "E-commerce is growing rapidly as more consumers shop online for convenience." 
]

# Assign categories to the sentences in round-robin order
data = {'Sentence': sentences, 'Category': [categories[i % len(categories)] for i in range(len(sentences))]}

# Create DataFrame
df = pd.DataFrame(data)
df.head()  # Show first few rows of the DataFrame

Original DataFrame

Below are the necessary libraries you have to import

import numpy as np  # used later for reshaping arrays
from sklearn.preprocessing import LabelEncoder,OneHotEncoder
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer 
from sklearn.feature_extraction import DictVectorizer
from transformers import BertTokenizer, BertModel
import gensim
from gensim.models import Word2Vec
import torch
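# word_tokenize needs NLTK's punkt tokenizer data, e.g. nltk.download('punkt') ('punkt_tab' on recent NLTK versions)
from nltk.tokenize import word_tokenize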
import tensorflow
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from keras.utils import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN , Embedding

First, we will do the Label Encoding on the Categories column.

lb = LabelEncoder()

df['Category']= lb.fit_transform(df['Category'])
df['original_name'] = lb.inverse_transform(df['Category'])
df['Token'] = df['Sentence'].apply(word_tokenize)
df.head(5)

First Modification

1 : CountVectorizer : In Count Vectorization, we first build a vocabulary of all the words that appear in the corpus, and then convert each sentence into a vector that records how many times each vocabulary word appears in it.

cv = CountVectorizer()
dcv = cv.fit_transform(df['Sentence'])
#print(cv.vocabulary_)
new = pd.DataFrame(dcv.toarray(),columns= cv.get_feature_names_out())
final = pd.concat([new,df],axis=1)
final

Count Vectorize

2 : Ngram : You can think of this method as an extended version of Count Vectorization. Here we have the flexibility to use either individual words as tokens or sequences of 1 to n consecutive words.

bgw = CountVectorizer(ngram_range= (1,3))
bgwv = bgw.fit_transform(df['Sentence'])
#print(bgw.vocabulary_)
new = pd.DataFrame(bgwv.toarray(),columns= bgw.get_feature_names_out())
final = pd.concat([new,df],axis=1)
final

Ngrams

3 : TF IDF : In this approach, we measure how important a word is to a document relative to the whole corpus of documents. A word gets a higher weight when it is frequent in a document but rare across the corpus.

tf = TfidfVectorizer()
tfv = tf.fit_transform(df['Sentence'])
#print(tf.vocabulary_)
new = pd.DataFrame(tfv.toarray(),columns= tf.get_feature_names_out())
final = pd.concat([new,df],axis=1)
final

TF IDF
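The weighting can be reproduced by hand. The sketch below assumes scikit-learn's default settings (raw counts as term frequency, smoothed IDF, L2 normalisation of each row) and three made-up sentences; doc_freq, idf, manual and sk are illustration names.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat", "the dog sat", "the dog barked"]

counts = CountVectorizer().fit_transform(docs).toarray()   # term frequency per sentence
n_docs = counts.shape[0]
doc_freq = (counts > 0).sum(axis=0)                        # in how many sentences each word occurs

# Smoothed inverse document frequency: rare words get a larger value
idf = np.log((1 + n_docs) / (1 + doc_freq)) + 1

manual = counts * idf
manual = manual / np.linalg.norm(manual, axis=1, keepdims=True)   # L2-normalise each row

sk = TfidfVectorizer().fit_transform(docs).toarray()
print(np.allclose(manual, sk))
```

A word like "the", which appears in every sentence, gets the smallest IDF, while "barked", which appears once, gets the largest.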

4 : One Hot Encoding : This method is used to transform categorical data into numerical data. It creates a binary column (or vector) for each unique category, with a 1 indicating the presence of that category and 0 for all other categories. Note that the code below treats each whole sentence as a single category, so every distinct sentence gets its own column.

ohe = OneHotEncoder(handle_unknown='ignore')
ohetok = ohe.fit_transform(np.array(df['Sentence']).reshape(-1,1))
new = pd.DataFrame(ohetok.toarray())
final = pd.concat([new,df],axis=1)
final

OHE
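One-hot encoding is more commonly applied to a genuinely categorical column. A small sketch, assuming a made-up list of category values (cats, ohe_demo, encoded and unknown are illustration names):

```python
from sklearn.preprocessing import OneHotEncoder

cats = [["Science"], ["Health"], ["Science"]]

ohe_demo = OneHotEncoder(handle_unknown="ignore")
encoded = ohe_demo.fit_transform(cats).toarray()

print(ohe_demo.categories_[0])   # learned categories, in column order
print(encoded)

# A category never seen during fit becomes an all-zero row because of handle_unknown="ignore"
unknown = ohe_demo.transform([["Business"]]).toarray()
print(unknown)
```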

Now that we have covered the important Vectorization methods, let's look at the embedding methods.

5 : C Bow (Continuous Bag Of Words) : In this method, we feed a neural network a window of surrounding (context) words, and it tries to predict the target word that is most likely to appear within that context.

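sentence = df['Token'].tolist()

# min_count=1 keeps every word; the default (5) would drop almost all words in this tiny corpus
cbow = Word2Vec(vector_size=50, sg=0, min_count=1)
cbow.build_vocab(sentence)  # use the tokenized sentences, not the raw strings
cbow.train(sentence, epochs=5, total_examples=cbow.corpus_count)

def getembed(text_embd):
    # Keep only the tokens that are in the model's vocabulary
    word_embd = [word for word in text_embd if word in cbow.wv.key_to_index]

    if len(word_embd) > 0:
        # Average the word vectors to get one vector per sentence
        return cbow.wv[word_embd].mean(axis=0)
    else:
        return None

df['cbow_emb'] = df['Token'].apply(getembed)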

CBOW

6 : Skip gram : In this approach, we try to predict the surrounding (context) words from a given input word. We can think of this approach as the opposite of C Bow.

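sentence = df['Token'].tolist()

# Create the model without a corpus, then build the vocabulary and train explicitly
# (passing the corpus to the constructor would already build the vocabulary and train once)
skp = Word2Vec(vector_size=60, sg=1, min_count=1)
skp.build_vocab(sentence)
skp.train(sentence, epochs=5, total_examples=skp.corpus_count)

def skg_emb(tok):
    # Keep only the tokens that are in the model's vocabulary
    words = [word for word in tok if word in skp.wv.key_to_index]

    if len(words) > 0:
        return skp.wv[words].mean(axis=0)
    else:
        return None

df['skg_emb'] = df['Token'].apply(skg_emb)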

skg
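The difference between C Bow and Skip gram comes down to what the network sees as input and what it must predict. The sketch below only builds those (input, output) pairs in plain Python; it is not how gensim trains internally (which also involves tricks such as negative sampling), and training_pairs is a made-up helper name.

```python
def training_pairs(tokens, window=2):
    """Build the (input, output) examples each Word2Vec variant learns from."""
    cbow_pairs, skipgram_pairs = [], []
    for i, target in enumerate(tokens):
        # Words within `window` positions to the left and right of the target
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        # CBOW: all context words together -> predict the target word
        cbow_pairs.append((context, target))
        # Skip-gram: the target word -> predict each context word separately
        skipgram_pairs.extend((target, c) for c in context)
    return cbow_pairs, skipgram_pairs

cbow_pairs, skipgram_pairs = training_pairs(
    ["online", "learning", "platforms", "are"], window=1
)
print(cbow_pairs)
print(skipgram_pairs)
```

In gensim, the sg parameter switches between the two: sg=0 trains C Bow and sg=1 trains Skip gram.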

With these methods, we can either train our own model or use a pretrained model.

import gensim.downloader as api

model = api.load('word2vec-google-news-300') 

def pre_emb(tok):
    # Keep only the tokens that exist in the pretrained vocabulary
    words = [word for word in tok if word in model.key_to_index]
    
    if len(words)>0:
        return model[words].mean(axis=0)
    else:
        return None

df['Pre_emb'] = df['Token'].apply(pre_emb)

7 : Bert (Bidirectional Encoder Representations from Transformers) : This is a Transformer-based model that uses the encoder part of the Transformer. It was introduced by Google.

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

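def bert_embd(text):
    token = tokenizer(text,return_tensors="pt",truncation=True,padding=True)
    # Inference only, so skip gradient tracking
    with torch.no_grad():
        output = model(**token)
    
    # Mean-pool the token vectors into a single sentence vector
    embed = torch.mean(output.last_hidden_state, dim=1)
    return embed.detach().numpy().flatten()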

df['bert_embed'] = df['Sentence'].apply(bert_embd)

BERT
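The bert_embd function above encodes one sentence at a time, so a plain mean over tokens is fine. If you encode several sentences in one batch, the shorter ones get padded, and the padding vectors should be left out of the average. Here is a sketch of that masked mean pooling, written with NumPy on dummy arrays so it runs without downloading a model (masked_mean_pool and the example numbers are made up; the same arithmetic applies to torch tensors and the tokenizer's attention_mask):

```python
import numpy as np

def masked_mean_pool(hidden, attention_mask):
    """Average token vectors, ignoring padded positions.

    hidden:         (batch, seq_len, dim) token vectors
    attention_mask: (batch, seq_len) with 1 for real tokens and 0 for padding
    """
    mask = attention_mask[:, :, None].astype(hidden.dtype)
    summed = (hidden * mask).sum(axis=1)               # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)     # avoid division by zero
    return summed / counts

# One sentence with two real tokens and one padding position
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])

pooled = masked_mean_pool(hidden, mask)
print(pooled)
```

The padding vector does not affect the result, whereas a plain mean over all three positions would be pulled toward it.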

We can also use deep learning libraries like TensorFlow and Keras to learn embeddings.

token = Tokenizer(oov_token='')

token.fit_on_texts(df['Sentence'])

seq = token.texts_to_sequences(df['Sentence'])

# Pad (or truncate) every sequence to exactly 10 token ids
output = pad_sequences(seq,maxlen=10,padding='post')

# Index 0 is reserved for padding, so the vocabulary size is len(word_index) + 1
vocab_size = len(token.word_index) + 1

# SimpleRNN expects 3D input: (samples, timesteps, features)
x_reshaped = output.reshape((output.shape[0], output.shape[1], 1)).astype('float32')

# `sparse_output` replaces the older `sparse` argument (scikit-learn 1.2 and later)
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(df[['Category']]) 


#Model 1: feed the raw token ids straight into an RNN

model = Sequential()
model.add(SimpleRNN(32,activation='relu',input_shape=(10,1)))
model.add(Dense(5,activation='relu'))
model.add(Dense(y_encoded.shape[1], activation='softmax')) 
model.compile(loss='categorical_crossentropy',optimizer='adam')
model.fit(x_reshaped,y_encoded, epochs=10, batch_size=1)
model.summary()



#Model 2: learn an embedding for each token id first


model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=32, input_length=10)) 
model.add(SimpleRNN(16,activation='relu'))
model.add(Dense(5,activation='relu'))
model.add(Dense(y_encoded.shape[1], activation='softmax')) 
model.compile(loss='categorical_crossentropy',optimizer='adam')
# The Embedding layer takes the 2D array of token ids directly
model.fit(output,y_encoded, epochs=10, batch_size=1)
model.summary()
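The tokenizing and padding steps are easy to overlook, so here is a plain-Python sketch of what they do. It only mimics the behaviour and is not the Keras implementation: encode assigns ids in order of first appearance (the Keras Tokenizer orders them by word frequency), and pad_post assumes padding='post' together with the default truncating='pre'. Both helper names are made up.

```python
word_index = {}

def encode(sentence):
    """Map each word to an integer id, starting at 1 (0 is kept for padding)."""
    ids = []
    for word in sentence.lower().split():
        ids.append(word_index.setdefault(word, len(word_index) + 1))
    return ids

def pad_post(seq, maxlen, value=0):
    """Pad at the end; if the sequence is too long, keep the LAST maxlen ids."""
    seq = seq[-maxlen:]
    return seq + [value] * (maxlen - len(seq))

short = pad_post(encode("cloud computing is useful"), maxlen=6)
long = pad_post(encode("one two three four five six seven eight"), maxlen=6)
print(short)
print(long)
```

Note that with these settings a sentence longer than maxlen loses its first words, not its last ones. If that matters for your data, pad_sequences also accepts truncating='post'.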

We have covered all the important techniques. Thanks for reading

10 Ways To Get out of Doom Scrolling

Kamal Singh Rathore — Sun, 22 Sep 2024 03:34:11 GMT

Hi everyone, you’ve probably heard of or experienced doom-scrolling. It’s one of those “gifts” of our interconnected world that we didn’t ask for, yet somehow got stuck with. Today, I’m going to share some personal experiences and methods that have helped me break free from doom-scrolling.

What is Doom Scrolling and Its Effects on Your Mind and Health?

Doom-scrolling is the endless scrolling through social media apps like Facebook, Instagram, Twitter, Snapchat etc. It affects both your brain and overall physical health because, when you’re addicted to these apps, you scroll endlessly without any real purpose.

Doom eye

If you spend too much time doom-scrolling, it can have a negative impact on your emotional and physical health. You start feeling disconnected from reality and may use the virtual world as an escape from your real-life problems. This not only dulls your sensitivity toward real-life social interactions but can also leave you feeling like you’re not achieving anything in your life.

There might be days when you think, “What am I doing with my life? What have I accomplished? Am I worth anything?” You may find yourself lying in bed for hours, endlessly scrolling through your social media feed, while life passes you by. Over time, this has a serious negative impact on your mental health, making you feel worthless, and it also harms your physical health by promoting a sedentary lifestyle.

If this habit continues unchecked, it can trigger a chain reaction of negativity, further isolating you from the real world, cutting you off from meaningful social interactions, and even causing you to withdraw from society altogether.

10 Methods to Stop Doom Scrolling:

Here are some effective hacks that helped me break free from the cycle of doom-scrolling:

No-Touch Morning:
The rule here is simple: When you wake up in the morning, don’t touch your phone for the first two hours. Instead, engage in activities like going for a jog or walk, reading a book, doing yoga, or exercising. This helps you start the day on a positive note, making you feel energized and motivated. The key is to avoid lounging afterward — don’t sink into your couch or bed and start scrolling again. To prevent this, move on to the next hack.
Create a Checklist:
Once your morning routine is done, create a daily checklist of activities. The tasks don’t have to be completed in a strict order, but you should aim to finish them all by the end of the day. Your list can include anything — reading a chapter of a book, meeting a friend, cooking a new dish, working on a hobby, exercising, or engaging in sports. This keeps your day structured and gives you a sense of accomplishment, reducing your urge to mindlessly scroll.
Monitor Your Phone Usage:
Use screen time monitoring apps to track how much time you’re spending on social media. If you go over a certain limit, these apps will block access to those platforms for the rest of the day. This simple strategy can help you become more aware of your scrolling habits and set boundaries for healthier usage.
Self-Monitoring:
This is about consciously observing the decisions you’re making throughout the day. Pay attention to the triggers that lead you to pick up your phone and start scrolling. For example, I noticed that on days when I start my morning with doom scrolling, I usually end up spending most of the day on my phone. By recognizing your own triggers, you can take steps to avoid them.
Social Media Detox:
If none of the above works, consider going for a full social media detox. Remove all social media apps from your phone, at least temporarily. This is a drastic measure, and it might make you feel anxious or experience cravings for social media, but if you can hold out for a few days, it can help you break the habit. However, this may not be for everyone, especially if your work depends on social media.
Adopt a New Hobby:
Learning something new can help distract you from the urge to scroll and provide a healthy outlet for your energy. Whether it’s painting, writing, playing sports, hiking, or learning a musical instrument, hobbies can help you reconnect with the real world and give you a sense of fulfillment.
Set Boundaries for Social Media:
You can establish a rule of using social media only at specific times of the day, say for 30 minutes in the evening. This way, you don’t completely cut it out but limit your exposure. Turn off notifications to avoid being constantly pulled back in.
Engage in Real-Life Social Interactions:
Make an effort to spend time with people in person — catch up with friends, go for a walk with family, or participate in group activities. Real-life social connections help break the habit of constantly seeking validation and interaction online.
Use the Pomodoro Technique:
Another helpful method is the Pomodoro technique, where you focus on a task for 25 minutes, then take a 5-minute break. This structure helps you stay productive and minimizes the time spent on distractions like social media.
Practice Mindfulness and Meditation:
Sometimes, the urge to doom scroll comes from a desire to escape feelings of stress or anxiety. Practicing mindfulness or meditation can help you become more present and aware of your surroundings, reducing the compulsion to check your phone out of boredom or anxiety.

These are the strategies that have been most effective for me in combating doom-scrolling. If you’re looking to break this habit, start small, and be patient with yourself. Let’s put our phones down, look out at the sky, and reconnect with the world around us!

Creating First Amazon EC2 instance

Kamal Singh Rathore — Thu, 19 Sep 2024 02:46:50 GMT

Hi everyone. In this article we are going to cover Amazon Web Services and Amazon EC2, and I will provide a step-by-step guide for creating an Amazon EC2 instance.

Let's start with the question: what are the Cloud and Cloud Computing?

The Cloud is a global network of interconnected servers that provides seamless services to users. It is not a single physical entity; instead, it is a vast ecosystem of remote servers around the globe that are linked together and operate as a single system.

Now that we have a clear picture of what the Cloud is, let's look into Cloud Computing.

Cloud Computing is the on-demand delivery of services to users through cloud platforms. These platforms offer a variety of services, including cloud storage, computing power, servers, networking, analytics, etc. The most popular cloud platforms at present are AWS, Microsoft Azure and GCP.

Photo by spiceworks.com

Now the question is: what is AWS and what services does it offer?

Amazon Web Services (AWS) is an on-demand cloud services platform introduced by Amazon. It works on a pay-as-you-go policy, where you only pay for the resources you use. We can customize the services according to our needs.

Top services offered by AWS.

1. Amazon EC2 (Elastic Compute Cloud)
2. Amazon RDS (Relational Database Service)
3. Amazon S3 (Simple Storage Service)
4. Amazon EBS (Elastic Block Store)
5. AWS Lambda
6. Amazon CloudFront
7. Amazon SNS (Simple Notification Service)
8. Amazon VPC (Virtual Private Cloud)
9. AWS Auto Scaling
10. AWS Elastic Beanstalk

Now that we have gone through the basics, let's understand EC2 and create an EC2 instance.

EC2 Intro

Amazon Elastic Compute Cloud (EC2) eliminates the need to invest in hardware up front. It provides scalable computing capacity, with a choice of the latest processors, operating systems, storage and networking, along with purchase models that let you scale your system requirements in the future to better manage your workload.

Let's start with the demo. We will go with the free tier configuration.

I hope you have an AWS account; if not, please create one. Then search for EC2. This will open the EC2 Dashboard. Click on Launch instance at the bottom to launch a new instance, or click on Instances (running) to check the active instances.

EC2 home page

It will open the EC2 instance Launch page. Name your instance as shown below.

Name your instance

Now choose your Amazon Machine Image (AMI). You can either go with a free tier eligible AMI or browse and select another AMI according to your requirements.

We will select the instance type now. You can choose a free tier eligible type or a paid one according to your needs.

Instance Type

Next, select an existing key pair or create and download a new one. These keys are used for connecting to the instance.

Key pair option

Below is the prompt for creating a new key. Set the name for your key, choose the key format, and then click on Create Key Pair. It will download a file that you will use to connect to the instance from your device.

After that, update the Network settings. Choose the security group and which types of network requests you want to allow. You can also choose between the default IP and a custom IP.

Network setting

You are all set to launch the instance. Click on the Launch instance button at the bottom right. Your instance will now be running, and you can check it on the Instances page.

instance page

We are all set to connect to the instance. Click on the instance to open its detail page. At the top right you will find the Connect option. Click on it.

Connect option

From EC2 Instance Connect, you can connect directly to the instance with the Connect button at the bottom.

Connect page

If you want to connect from your device, go to the SSH client option, where you will find a command like the one below.

ssh -i "keytest.pem" ec2-user@ec2-xx-xxx-xx-xxx.us-west-2.compute.amazonaws.com

Copy that command and open your command prompt. Go to the folder where you saved the key file you downloaded and run the command. On Linux or macOS, SSH may refuse a key file that other users can read; running chmod 400 keytest.pem first fixes that.

We have created the instance and launched it. Now let's check how to terminate (delete) it.

Go to your Instances page and select the instance you want to terminate, then choose the terminate option from the Instance state menu. Check the image below for reference.
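The same launch and terminate steps can also be scripted instead of clicked through in the console. Below is a rough sketch using the boto3 SDK's run_instances and terminate_instances calls. It is not part of the walkthrough above, and it makes several assumptions: boto3 is installed, AWS credentials are already configured, and the AMI id, key name and region shown in the usage comment are placeholders you must replace. The helper names launch_instance and terminate_instance are made up.

```python
def launch_instance(ec2_client, ami_id, key_name, instance_type="t2.micro"):
    """Launch a single instance and return its instance id."""
    response = ec2_client.run_instances(
        ImageId=ami_id,              # the AMI chosen in the console step
        InstanceType=instance_type,  # assumption: a free tier eligible type
        KeyName=key_name,            # name of the key pair, without ".pem"
        MinCount=1,
        MaxCount=1,
    )
    return response["Instances"][0]["InstanceId"]

def terminate_instance(ec2_client, instance_id):
    """Terminate (delete) the instance and return its new state name."""
    response = ec2_client.terminate_instances(InstanceIds=[instance_id])
    return response["TerminatingInstances"][0]["CurrentState"]["Name"]

# Usage (placeholders only; running this creates real, billable resources):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-west-2")
#   instance_id = launch_instance(ec2, ami_id="ami-xxxxxxxx", key_name="keytest")
#   print(terminate_instance(ec2, instance_id))
```

Passing the client in as a parameter keeps the two helpers easy to try out with a stand-in object before pointing them at a real account.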

Now you are all set to go and create your first EC2 instance.

All the best!!