<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Kamal Singh Rathore on Medium]]></title>
        <description><![CDATA[Stories by Kamal Singh Rathore on Medium]]></description>
        <link>https://medium.com/@samarrathore482?source=rss-b94ca56679f2------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/0*GTFIbMZLx-r2bQHe</url>
            <title>Stories by Kamal Singh Rathore on Medium</title>
            <link>https://medium.com/@samarrathore482?source=rss-b94ca56679f2------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Fri, 22 May 2026 18:48:57 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@samarrathore482/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Quantum Computing : Part 1 Understanding classical system]]></title>
            <link>https://medium.com/@samarrathore482/quantum-computing-part-1-understanding-classical-system-159ff2035b55?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/159ff2035b55</guid>
            <category><![CDATA[classical-state]]></category>
            <category><![CDATA[probability-vector]]></category>
            <category><![CDATA[quantum-computing]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Wed, 11 Feb 2026 14:27:29 GMT</pubDate>
            <atom:updated>2026-02-11T14:27:29.728Z</atom:updated>
            <content:encoded><![CDATA[<h3>Quantum Basics</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4jQHBcTWTo5s-i-4Wg6PXA.png" /></figure><h3>Classical States and Probability Vectors</h3><p>A classical state describes the exact condition of a system; it is definite and predictable.</p><p>Let’s take an example. Suppose the system is X, and let Σ denote the set of classical states of X. We assume Σ is finite and non-empty.</p><p>Our knowledge of a classical system is represented by a probability distribution over Σ, whose values are always real and non-negative, whereas quantum states are described by complex probability amplitudes, and only their squared magnitudes correspond to non-negative classical probabilities.</p><h3>Example of Classical State</h3><p>If X is a bit, then Σ = {0, 1}</p><p>If X is a six-sided die, then Σ = {1, 2, 3, 4, 5, 6}</p><p>Often in information processing, our knowledge is uncertain. To represent this uncertainty, we can associate probabilities with the different classical states, resulting in what we shall call a probabilistic state.</p><p>For example, suppose X is a bit and, based on experience, we believe the probability of X being in state 0 is 3/4 and in state 1 is 1/4.</p><h3>Probability Representation</h3><p>Pr(X = 0) = 3/4</p><p>Pr(X = 1) = 1/4</p><p>A more succinct way to represent the probabilistic state is by a column vector:</p><p>[[3/4], [1/4]]</p><p>Any probabilistic state can be represented by a column vector satisfying:</p><p>1. All entries of the vector are nonnegative real numbers.</p><p>2. The sum of the entries is equal to 1.</p><p>Vectors of this form are called probability vectors.</p><h3>Measuring Probabilistic States</h3><p>By measuring a system, we simply mean that we look at the system and recognize whichever classical state it is in without ambiguity. 
Intuitively speaking, we can’t see a probabilistic state of a system; when we look at it, we just see one of the possible classical states.</p><p>If we recognize that X is in the classical state a ∈ Σ, then the new probability vector representing our knowledge of the state of X becomes the vector having a 1 corresponding to a and 0 for all other entries. This vector indicates that X is in the classical state a with certainty, and we denote this vector by |a⟩, which is read as ‘ket a’.</p><p>|0⟩ = [[1], [0]]</p><p>|1⟩ = [[0], [1]]</p><p>[[3/4], [1/4]] = 3/4 |0⟩ + 1/4 |1⟩</p><p>The probabilistic states describe knowledge or belief, not necessarily something actual, and measuring merely changes our knowledge and not the system itself.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=159ff2035b55" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Joins in SQL]]></title>
            <link>https://medium.com/@samarrathore482/joins-in-sql-7573dad6768e?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/7573dad6768e</guid>
            <category><![CDATA[sql]]></category>
            <category><![CDATA[join]]></category>
            <category><![CDATA[mssql]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Sun, 17 Nov 2024 06:57:07 GMT</pubDate>
            <atom:updated>2024-11-17T06:57:07.004Z</atom:updated>
            <content:encoded><![CDATA[<p>In this article we are going to cover one of the most important concepts in SQL: <strong>“Joins”</strong>.</p><p>You can think of <strong>joins</strong> as building blocks for fetching data from multiple tables based on common column values and, optionally, additional conditions.</p><p>We will go through the different types of joins, from basic to advanced.</p><p>To explain <strong>joins</strong>, I am going to use two tables: one for students enrolled in the Maths course and one for students enrolled in the Arts course. I have also provided an Employees table to explain the self join and the cross join.</p><p><strong>The table structure is shown below, and each join comes with a Venn diagram for better understanding.</strong></p><pre>CREATE TABLE MathsEnrollments (<br>    StudentID INT PRIMARY KEY,<br>    Name VARCHAR(100),<br>    EnrolledDate DATE<br>);<br><br><br>CREATE TABLE ArtsEnrollments (<br>    StudentID INT PRIMARY KEY,<br>    Name VARCHAR(100),<br>    EnrolledDate DATE<br>);<br><br><br>INSERT INTO MathsEnrollments (StudentID, Name, EnrolledDate) VALUES<br>(1, &#39;Alice&#39;, &#39;2024-01-10&#39;),<br>(2, &#39;Bob&#39;, &#39;2024-01-12&#39;),<br>(3, &#39;Charlie&#39;, &#39;2024-01-15&#39;),<br>(4, &#39;David&#39;, &#39;2024-01-18&#39;),<br>(5, &#39;Eva&#39;, &#39;2024-01-20&#39;),<br>(6, &#39;Frank&#39;, &#39;2024-01-22&#39;),<br>(7, &#39;Grace&#39;, &#39;2024-01-25&#39;),<br>(8, &#39;Helen&#39;, &#39;2024-01-28&#39;),<br>(9, &#39;Ian&#39;, &#39;2024-01-30&#39;),<br>(10, &#39;Jack&#39;, &#39;2024-02-02&#39;);<br><br><br>INSERT INTO ArtsEnrollments (StudentID, Name, EnrolledDate) VALUES<br>(3, &#39;Charlie&#39;, &#39;2024-01-15&#39;),<br>(5, &#39;Eva&#39;, &#39;2024-01-20&#39;),<br>(7, &#39;Grace&#39;, &#39;2024-01-25&#39;),<br>(8, &#39;Helen&#39;, &#39;2024-01-28&#39;),<br>(9, &#39;Ian&#39;, &#39;2024-01-30&#39;),<br>(11, &#39;Karen&#39;, &#39;2024-02-05&#39;),<br>(12, &#39;Liam&#39;, &#39;2024-02-07&#39;),<br>(13, 
&#39;Mia&#39;, &#39;2024-02-10&#39;),<br>(14, &#39;Nathan&#39;, &#39;2024-02-12&#39;),<br>(15, &#39;Olivia&#39;, &#39;2024-02-15&#39;);<br><br><br><br>-- Employee data for the self join and cross join examples<br><br>CREATE TABLE Employees (<br>    EmployeeID INT PRIMARY KEY,<br>    Name VARCHAR(50),<br>    ManagerID INT,<br>    Department VARCHAR(50)<br>);<br><br>INSERT INTO Employees (EmployeeID, Name, ManagerID, Department) VALUES<br>(1, &#39;Alice&#39;, NULL, &#39;HR&#39;),<br>(2, &#39;Bob&#39;, 1, &#39;IT&#39;),<br>(3, &#39;Charlie&#39;, 1, &#39;IT&#39;),<br>(4, &#39;David&#39;, 2, &#39;Finance&#39;),<br>(5, &#39;Eve&#39;, 2, &#39;Finance&#39;);</pre><h4><strong>Left Join</strong></h4><p>This type of join returns all records from the first (left) table and only the matching records from the second (right) table. Where there is no match, the columns of the second table are NULL.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/598/1*MsT3WsMfJ6OBNi9VyY5Ytw.png" /></figure><pre>SELECT * FROM MathsEnrollments M<br>LEFT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID</pre><h4><strong>Right Join</strong></h4><p>This type of join returns all records from the second (right) table and only the matching records from the first (left) table. Where there is no match, the columns of the first table are NULL.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/598/1*COVM2hVHxZ0dH77sVs0ONQ.png" /></figure><pre>SELECT * FROM MathsEnrollments M<br>RIGHT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID</pre><h4><strong>Inner Join</strong></h4><p>This type of join returns only the records that match in both tables.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/598/1*Ed0oNYwbfrWyN-mf24k2hQ.png" /></figure><pre>SELECT * FROM MathsEnrollments M<br>INNER JOIN ArtsEnrollments A ON M.StudentID=A.StudentID</pre><h4><strong>Full join</strong></h4><p>This type of join returns all the records from both tables, matching them where possible and filling the missing side with NULL.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/598/1*tUUaypaXMqwsZjG5PHnUlQ.png" /></figure><pre>SELECT *  FROM MathsEnrollments M<br>FULL JOIN ArtsEnrollments A ON 
M.StudentID=A.StudentID</pre><h4><strong>Cross join</strong></h4><p>In this type of join, every record from the first table is paired with every record from the second table, so the result is the Cartesian product of the two tables. With the five-row Employees table joined to itself, this yields 25 rows.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/820/1*gisvhZCMsxQFGltb9vBo1Q.png" /></figure><pre>SELECT E1.Name AS EmployeeName, E2.Name AS AnotherEmployee<br>FROM Employees E1<br>CROSS JOIN Employees E2;</pre><h4><strong>Self join</strong></h4><p>When we join a table with itself, it is known as a self join. Here each employee is matched to their manager through ManagerID.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/811/1*-tIfg1lNnIW9e050QceqZw.png" /></figure><pre>SELECT E1.Name AS Employee, E2.Name AS Manager<br>FROM Employees E1<br>LEFT JOIN Employees E2<br>ON E1.ManagerID = E2.EmployeeID;</pre><h4><strong>Below are some other variations of joins in SQL</strong></h4><h4><strong>Left join excluding inner join</strong></h4><p>This is a variation of the left join that returns only the records that exist in the first table and have no match in the second table.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/552/1*AzgJHfLhtb4xhcnoxciBUg.png" /></figure><pre>SELECT * FROM MathsEnrollments M<br>LEFT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID<br>WHERE A.StudentID IS NULL</pre><h4><strong>Right join excluding inner join</strong></h4><p>This is a variation of the right join that returns only the records that exist in the second table and have no match in the first table.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/552/1*A3LMSLMN8HPyka2bpMdzUg.png" /></figure><pre>SELECT * FROM MathsEnrollments M<br>RIGHT JOIN ArtsEnrollments A ON M.StudentID=A.StudentID<br>WHERE M.StudentID IS NULL</pre><h4><strong>Full outer join, excluding inner join</strong></h4><p>In this type of join, we return only the records that exist in exactly one of the two tables and ignore the records common to both.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/552/1*fu5-91gpN9aGHn6VbZOfFQ.png" /></figure><pre>SELECT *  FROM MathsEnrollments M<br>FULL JOIN ArtsEnrollments A ON M.StudentID=A.StudentID<br>WHERE 
A.StudentID IS NULL OR M.StudentID IS NULL</pre><p>Try these examples yourself to understand them better. I hope you liked the article.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=7573dad6768e" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Langchain]]></title>
            <link>https://medium.com/@samarrathore482/langchain-085f955a81ef?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/085f955a81ef</guid>
            <category><![CDATA[langchain]]></category>
            <category><![CDATA[llm]]></category>
            <category><![CDATA[hugging-face]]></category>
            <category><![CDATA[ai]]></category>
            <category><![CDATA[vector-database]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Tue, 12 Nov 2024 13:09:56 GMT</pubDate>
            <atom:updated>2024-11-27T13:42:24.962Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, in this article we are going to cover LangChain and its important concepts.</p><p>Let’s start with the most basic question: what is LangChain and why do we need it?</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/960/1*0GutktcV1PLqUGMdTn52rw.png" /></figure><p>LangChain is one of the more recent innovations in the field of AI; it was first introduced in October 2022 by Harrison Chase. It is an open-source framework for developing applications that use LLMs (Large Language Models). It serves as a generic interface for nearly any LLM, so we can easily build LLM applications and integrate them with external software workflows and data sources.</p><p>LangChain has different components that simplify application development by encapsulating one or more constituent steps into a single component.</p><p>Below is the list of components that help us work with LLMs, followed by an example.</p><h3>Prompt Templates</h3><p>Prompts are the guiding force that helps an LLM generate output in a particular manner. You can think of them as a set of instructions that tells the model which format we want the output to be in. The PromptTemplate class in LangChain helps us create prompts for the LLM without hard-coding them.</p><h3>Chains</h3><p>As the name suggests, chains link together the different functionalities of LangChain. As a simple example, suppose you have created a prompt and you want to run it against an LLM; you can use a chain to run it. Through chains we can also pass the output of one model as the input to the same or a different model, depending on the need. Available chains include LLMChain, SimpleSequentialChain, etc.</p><h3>Indexes</h3><p>LLMs need access to external data for certain domain-specific tasks. 
These external data sources can be anything: a PDF, a CSV file, a database, etc. LangChain collectively refers to these data sources as ‘indexes’.</p><p>We can process this data using a text splitter and store it in a vector database, and we can later retrieve data from the vector database as needed.</p><h3>Memory</h3><p>An LLM does not have memory: it can’t remember the user’s past conversation. To deal with this problem, LangChain provides memory functionality, which helps the model remember the context of the conversation. We can also choose whether to retain the entire conversation or just keep a summary of it.</p><h3>Agents</h3><p>Agents are reasoning engines that can automatically choose which actions to take. The concept is similar to a chain, in which the model performs a set of tasks before generating output. The difference is that here the model itself acts as the reasoning engine and decides which tasks to perform and in which order.</p><h3>Tools</h3><p>Tools are the interfaces that an agent or chain can use to interact with real-world entities in order to expand its knowledge base. 
With tools, we can do many things, such as searching the internet or performing mathematical operations.</p><p><strong>Below is an example that uses an LLM from Hugging Face, a prompt template, memory and a vector database.</strong></p><pre>import os<br>import re<br>from getpass import getpass<br><br># Read the Hugging Face token without echoing it to the screen<br>HUGGINGFACE_API_TOKEN = getpass()<br><br>os.environ[&quot;HUGGINGFACEHUB_API_TOKEN&quot;] = HUGGINGFACE_API_TOKEN<br><br># Importing the necessary libraries<br><br>from langchain_community.document_loaders import WebBaseLoader<br>from langchain_text_splitters.character import RecursiveCharacterTextSplitter<br>from langchain_core.prompts import PromptTemplate<br>from langchain.chains import RetrievalQA,LLMChain<br>from langchain_chroma import Chroma<br>from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings<br>from langchain_huggingface.llms import HuggingFacePipeline<br>from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline<br>from langchain.memory import ConversationBufferMemory<br>from langchain_huggingface import HuggingFaceEndpoint<br>from langchain_community.vectorstores import FAISS<br>import faiss<br><br><br># Using the web base loader to get information from the Wikipedia page<br>loader = WebBaseLoader(web_path=&quot;https://en.wikipedia.org/wiki/Tata_Motors&quot; )<br><br>docs = []<br><br>doc_lazy = loader.lazy_load()<br><br>for doc in doc_lazy:<br>    docs.append(doc)<br><br># Collapse runs of newlines and tabs into single spaces<br>pattern = re.compile(&#39;\n+&#39;)<br>patterntab = re.compile(&#39;\t+&#39;)<br><br>docs[0].page_content = re.sub(pattern, &#39; &#39;,docs[0].page_content)<br>docs[0].page_content = re.sub(patterntab, &#39; &#39;,docs[0].page_content)<br><br># Split the text into chunks of about 400 characters, breaking on full stops<br>splitter = RecursiveCharacterTextSplitter(<br>    separators=[&#39;.&#39;],<br>    chunk_size=400,<br>    chunk_overlap=50,<br>)<br><br>document = splitter.split_documents(docs)<br>len(document[0].page_content)<br><br># Embedding model used with the Chroma vector database<br>model = 
&quot;sentence-transformers/all-mpnet-base-v2&quot;<br>embeddings = HuggingFaceInferenceAPIEmbeddings(<br>    api_key=HUGGINGFACE_API_TOKEN, model_name=model )<br>prompt = PromptTemplate(<br>    template=&quot;You are a chatbot expert in question answering. Answer the given question: {human_input}&quot;, input_variables=[&quot;human_input&quot;])<br><br>repo_id = &quot;mistralai/Mistral-7B-Instruct-v0.2&quot;<br><br>llm = HuggingFaceEndpoint(<br>    repo_id=repo_id,<br>    max_length=128,<br>    temperature=0.5,<br>    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN,<br>)<br><br># Using the Chroma vector database<br>database = Chroma.from_documents(document, embeddings)<br><br>retriver  = database.as_retriever(search_type=&quot;similarity&quot;, search_kwargs={&quot;k&quot;: 4})<br><br>from langchain_core.output_parsers import StrOutputParser<br><br>output_parser = StrOutputParser()<br><br># Pipeline: prompt, then LLM, then string output parser<br>chain = prompt | llm | output_parser<br><br># RetrievalQA.from_llm expects the language model itself, so we pass llm rather than the chain<br>qa_chain_chroma = RetrievalQA.from_llm(llm=llm,retriever=retriver)<br><br>query= &#39;Where is the headquarters of Tata Motors?&#39;<br><br>try:<br>    result = qa_chain_chroma.run(query)<br>    print(result)<br>except Exception as e:<br>    print(f&quot;Error occurred: {e}&quot;)<br><br># Using a second LLM, with a model from Hugging Face<br>from transformers import AutoTokenizer, AutoModelForCausalLM<br><br>tokenizer = AutoTokenizer.from_pretrained(&quot;deepset/roberta-large-squad2&quot;)<br><br>model = AutoModelForCausalLM.from_pretrained(&quot;deepset/roberta-large-squad2&quot;)<br><br>repo_id = &quot;deepset/roberta-large-squad2&quot;<br><br>llm_robert = HuggingFaceEndpoint(<br>    repo_id=repo_id,<br>    max_length=128,<br>    temperature=0.5,<br>    huggingfacehub_api_token=HUGGINGFACE_API_TOKEN,<br>)<br><br>import torch<br><br># Tokenize the texts and mean-pool the hidden states into one embedding per text<br>def embed_text(texts):<br>    inputs = tokenizer(texts,padding = True, 
truncation=True, return_tensors=&quot;pt&quot;)<br>    with torch.no_grad():<br>        # Average the last hidden state over the token dimension<br>        embeddings =model.roberta(**inputs).last_hidden_state.mean(dim=1)<br>    return embeddings<br><br>from langchain.embeddings.base import Embeddings<br><br># Wrapping embed_text in the LangChain Embeddings interface<br>class HuggingFaceEmbeddings(Embeddings):<br>    def embed_documents(self, texts):<br>        embeddings = embed_text(texts)<br>        return embeddings.numpy().tolist()  # Convert to list of lists<br>    def embed_query(self, query):<br>        embedding = embed_text([query])  # Get the embedding for the query<br>        return embedding.numpy().flatten().tolist()  # Flatten to 1D list<br><br>hr_embeddings = HuggingFaceEmbeddings()<br><br>newbd = FAISS.from_documents(document, hr_embeddings)<br><br>n=2<br><br>retriver_faiss  = newbd.as_retriever(search_kwargs={&quot;k&quot;: n})<br><br># RetrievalQA.from_llm expects the language model itself, not an LLMChain<br>qa_chain_faiss = RetrievalQA.from_llm(llm=llm_robert,retriever=retriver_faiss)<br><br>qa_chain_faiss.run(query)<br><br><br><br># Below is an example of an LLMChain with memory<br><br>template = &quot;&quot;&quot;You are a nice chatbot having a conversation with a human.<br>Previous conversation:<br>{chat_history}<br>New human question: {question}<br>Response:&quot;&quot;&quot;<br><br>promptnew = PromptTemplate.from_template(template)<br><br>memory = ConversationBufferMemory(memory_key=&quot;chat_history&quot;)<br><br>conversation = LLMChain(<br>    llm=llm,<br>    prompt=promptnew,<br>    verbose=True,<br>    memory=memory<br>)<br><br># Chat loop: type quit to stop, or clear memory to reset the history<br>while True:<br>    user_input = input()<br>    if user_input ==&quot;quit&quot;:<br>        print(&#39;It was a great conversation&#39;)<br>        break<br>    elif user_input==&quot;clear memory&quot;:<br>        print(&#39;memory cleaning&#39;)<br>        memory.clear()<br>    else:<br>        text = conversation.invoke({&quot;question&quot;: user_input})<br>        print(text[&#39;text&#39;])</pre><img 
src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=085f955a81ef" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[ROC -AUC Guide]]></title>
            <link>https://medium.com/@samarrathore482/roc-auc-guide-6614177a39b0?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/6614177a39b0</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[roc]]></category>
            <category><![CDATA[classification-metrics]]></category>
            <category><![CDATA[classification-models]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Mon, 28 Oct 2024 01:59:34 GMT</pubDate>
            <atom:updated>2024-10-28T01:59:34.622Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, in this article we are going to cover the ROC curve: what it is, how to use it, and how to implement it.</p><p><strong>Let&#39;s start with the question of what ROC is and what its use cases are.</strong></p><p><strong>ROC stands for Receiver Operating Characteristic</strong>. The ROC curve is an evaluation tool traditionally designed for binary classification problems to evaluate the performance of a classification model. <strong>It is a plot of the True Positive Rate against the False Positive Rate at different threshold values.</strong></p><p>Let&#39;s understand what the True Positive Rate and the False Positive Rate are.</p><p><strong>True Positive Rate, also known as Recall or Sensitivity</strong>: This metric measures how many of the actual positive cases in the data our model correctly predicts as positive, that is, TP / (TP + FN).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*iEn6ATOOiA0SgKxmLm55BA.png" /></figure><p><strong>False Positive Rate</strong>: This metric measures the proportion of actual negative cases that are incorrectly classified as positive, that is, FP / (FP + TN).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jCPHvSNfu04d1xQOImnJ2w.png" /></figure><p><strong>There is one other important thing that we need to know: the “threshold”.</strong></p><p>We can think of the threshold as a cutoff. If the probability predicted by the model is at or above this value, we classify the output as positive; otherwise we classify it as negative. In machine learning algorithms such as Logistic Regression, Random Forest Classifier and SVM, 
we can use this threshold to classify the output into different classes.</p><p>The ROC curve is generated by varying the threshold from 0 to 1, calculating the TPR and FPR for each threshold value, and plotting the resulting points on the graph.</p><p><strong>The X axis represents the FPR and the Y axis represents the TPR.</strong></p><p>Let&#39;s look at one more important term: AUC, which stands for <strong>Area Under the ROC Curve</strong>. It measures the overall performance of the model by calculating the area under the ROC curve. The greater the area under the curve, the better the model performance.</p><p><strong>It ranges from 0 to 1.</strong></p><p><strong>AUC = 1 means the model ranks every positive case above every negative case, so the classes are perfectly separated.</strong></p><p><strong>AUC = 0.5 means the model ranks a randomly chosen positive case above a randomly chosen negative case only half of the time. It is no better than a random model.</strong></p><p><strong>AUC &lt; 0.5 means the model is worse than a random classifier.</strong></p><p>Let&#39;s look at an example of how to implement it.</p><p>In this example, we are not trying to achieve a model with high performance; our focus is on producing the graph.</p><pre>import pandas as pd<br><br>train = pd.read_csv(&quot;/kaggle/input/binary-classification-bank-churn-dataset-cleaned/train_cleaned.csv&quot;)<br><br><br>from sklearn.linear_model import LogisticRegression<br>from sklearn.model_selection import train_test_split<br><br><br>X= train[[&#39;Gender&#39;,&#39;Balance&#39;,&#39;NumOfProducts&#39;,&#39;IsActiveMember&#39;,&#39;Geography_France&#39;,&#39;Geography_Germany&#39;,&#39;Geography_Spain&#39;,&#39;Age_bin&#39;]].values.tolist()<br><br># 1-D array of labels, as scikit-learn expects<br>Y = train[&#39;Exited&#39;].values<br><br><br>X_train , X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.7)<br><br>lr = LogisticRegression()<br><br>lr.fit(X_train,Y_train)<br><br>ypred = lr.predict(X_test)<br><br><br>from sklearn.metrics import classification_report<br><br>#Classification 
Report<br>print(classification_report(Y_test,ypred))</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/714/1*3tWBG0AT-buh1p5HTh5lhw.png" /></figure><p><strong>Implementation</strong></p><pre>from sklearn.metrics import roc_curve,roc_auc_score<br><br>import matplotlib.pyplot as plt<br><br># Predicted probability of the positive class<br>y_prob = lr.predict_proba(X_test)[ :, 1]<br><br># Getting the FPR and TPR at different thresholds<br>fpr, tpr, threshold = roc_curve(Y_test,y_prob)<br><br># Calculating the area under the curve<br>roc_auc = roc_auc_score(Y_test,y_prob)<br><br><br>plt.figure()<br><br># Plotting the ROC curve and the diagonal of a random classifier<br>plt.plot(fpr, tpr, color=&#39;blue&#39;, lw=2 , label=&#39;ROC Curve (area = %0.2f)&#39; % roc_auc)<br>plt.plot([0,1],[0,1], color=&#39;red&#39;,lw=2,linestyle=&#39;--&#39;)<br><br>plt.xlim([0.0,1.0])<br>plt.ylim([0.0,1.05])<br>plt.xlabel(&#39;False Positive Rate&#39;)<br>plt.ylabel(&#39;True Positive Rate&#39;)<br>plt.title(&#39;Receiver Operating Characteristic (ROC) Curve&#39;)<br>plt.legend(loc=&quot;lower right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/890/1*RpnZ5M16Qpd8rmI-QToZiw.png" /></figure><p><strong>The area under the curve is low, which means we should do feature engineering on the dataset, or try hyperparameter tuning and different models.</strong></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=6614177a39b0" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Similarity Metrics]]></title>
            <link>https://medium.com/@samarrathore482/similarity-metrics-6950eded6116?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/6950eded6116</guid>
            <category><![CDATA[clustering]]></category>
            <category><![CDATA[euclidean-distance]]></category>
            <category><![CDATA[similarity-metrics]]></category>
            <category><![CDATA[numpy]]></category>
            <category><![CDATA[machine-learning]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Wed, 09 Oct 2024 04:02:48 GMT</pubDate>
            <atom:updated>2024-11-19T15:17:41.334Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, we are going to cover the most important types of similarity metrics in machine learning, with examples.</p><p>A similarity metric is a measure used to calculate how similar two data points are in a vector space, where different dimensions represent different attributes of the data. Most of these metrics are based on the distance between the data points: the smaller the distance, the more similar the data points are, and vice versa.</p><p>It is mostly used in clustering algorithms to calculate the distance between data points, which helps in dividing the data into different clusters. In NLP, we use it to calculate the similarity between vectors in a high-dimensional space. In dimensionality reduction algorithms, we use similarity metrics to preserve the relationships between points when moving from a higher to a lower dimension. It is also used in algorithms like K-Nearest Neighbours to find the nearest points to a given point and make predictions based on them.</p><p><strong><em>Note: I have included examples based on NumPy and sklearn.</em></strong></p><h3><strong>Different Similarity Metrics</strong></h3><p><strong>1: Cosine Similarity :</strong></p><p>Cosine similarity is a measure of similarity between two non-zero vectors in an inner product space, using the cosine of the angle between them as the measure of similarity. 
It is the dot product of the vectors divided by the product of their lengths.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/719/1*aPjDfi75vR9sLUOH89T-bw.png" /><figcaption>Cos Sim formula</figcaption></figure><p><strong>Where A and B are the vectors</strong></p><p><strong>Where A.B is the dot product of the vectors</strong></p><p><strong>||A|| and ||B|| are the magnitudes of A and B</strong></p><p><strong>Examples</strong></p><pre>import numpy as np # linear algebra<br>A = [2, 3, 4]<br>B = [1, 0, 5]<br><br># Dot product of the two vectors<br>AB = np.dot(A,B)<br><br># Magnitudes (L2 norms) of the vectors<br>Anorm = np.linalg.norm(A)<br><br>Bnorm = np.linalg.norm(B)<br><br>CosSim = round(AB / (Anorm * Bnorm ),3)<br><br>print(CosSim)</pre><p><strong>Output</strong></p><p>0.801</p><pre>from sklearn.metrics.pairwise import cosine_similarity<br><br># sklearn expects 2-D arrays of shape (n_samples, n_features)<br>A = np.array(A).reshape(1,-1)<br><br>B = np.array(B).reshape(1,-1)<br><br>Cos_sim = cosine_similarity(A,B)<br><br>print(round(Cos_sim[0][0],3))</pre><p><strong>Output</strong></p><p>0.801</p><p><strong>2: Euclidean Distance:</strong></p><p>It measures the straight-line distance between two data points in Euclidean space, using the coordinates of the data points.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/729/1*HZermPfTBdaHucaSokcV_g.png" /><figcaption>Euclidean Formula</figcaption></figure><p><strong>Examples</strong></p><pre>A = np.array([2, 3, 4])<br>B = np.array([1, 0, 5])<br><br>np_Euclidean = np.linalg.norm(A - B)<br><br>print(round(np_Euclidean,3))</pre><p><strong>Output</strong></p><p>3.317</p><pre>from sklearn.metrics.pairwise import euclidean_distances<br><br>A = np.array(A).reshape(1,-1)<br><br>B = np.array(B).reshape(1,-1)<br><br>sk_euclidean = euclidean_distances(A,B)<br><br>print(round(sk_euclidean[0][0],3))</pre><p><strong>Output</strong></p><p>3.317</p><pre>Array = np.array([[1,2,3],<br>                 [4,5,6],<br>                 [2,5,7]])<br><br>sk_euclidean = 
euclidean_distances(Array)<br><br>print(sk_euclidean)</pre><p><strong>Output</strong></p><p>[[0. 5.19615242 5.09901951]<br> [5.19615242 0. 2.23606798]<br> [5.09901951 2.23606798 0. ]]</p><p><strong>3: Manhattan (L1) Distance :</strong></p><p>It measures the distance between data points in a vector space as the sum of the absolute differences of their coordinates. It moves along grid lines to calculate the distance. It is also called the L1 or taxicab distance.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/629/1*ABWTJ1dzypQEKbx7fXTpEA.png" /><figcaption>Formula</figcaption></figure><p><strong>Examples</strong></p><pre>from sklearn.metrics.pairwise import manhattan_distances<br><br>Array = np.array([[1,2,3],<br>                 [4,5,6],<br>                 [2,5,7]])<br><br>sk_manhattan = manhattan_distances(Array)<br><br>print(sk_manhattan)</pre><p><strong>Output</strong></p><p>[[0. 9. 8.]<br> [9. 0. 3.]<br> [8. 3. 0.]]</p><pre>A = np.array([2,3,6])<br>B = np.array([5,7,8])<br><br><br>np_manhattan = np.sum(np.abs(A-B))<br><br>print(np_manhattan)</pre><p><strong>Output</strong></p><p>9</p><pre>A = np.array([2,7,6]).reshape(1,-1)<br>B = np.array([5,7,8]).reshape(1,-1)<br><br>sk_manhattan = manhattan_distances(A,B)<br><br>print(sk_manhattan[0][0])</pre><p><strong>Output</strong></p><p>5.0</p><p><strong>4 : Minkowski Distance :</strong></p><p>This metric is a generalized form of both the Euclidean distance and the Manhattan distance.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/655/1*GFykpwya-UuNf_UBSorK_g.png" /><figcaption>Formula</figcaption></figure><p><strong>Explanation</strong></p><p><strong>A and B are vectors</strong></p><p><strong>n is the number of dimensions</strong></p><p><strong>p defines the type of distance</strong></p><p><strong>if p = 1 it is the Manhattan distance</strong></p><p><strong>if p = 2 it is the Euclidean distance</strong></p><p><strong>Examples</strong></p><pre>from sklearn.metrics import pairwise_distances<br><br>A = 
np.array([2,7,6]).reshape(1,-1)<br>B = np.array([5,7,8]).reshape(1,-1)<br><br>sk_minkowski_manh = pairwise_distances(A,B,metric=&#39;minkowski&#39;,p=1)<br><br>sk_minkowski_euc = pairwise_distances(A,B,metric=&#39;minkowski&#39;,p=2)<br><br>print(round(sk_minkowski_manh[0][0],3))<br><br>print(round(sk_minkowski_euc[0][0],3))</pre><p><strong>Output</strong></p><p>5.0<br>3.606</p><pre>Array = np.array([[1,2,3],<br>                 [4,5,6],<br>                 [2,5,7]])<br><br>sk_minkowski_manh = pairwise_distances(Array,metric=&#39;minkowski&#39;,p=1)<br><br>sk_minkowski_euc = pairwise_distances(Array,metric=&#39;minkowski&#39;,p=2)<br><br>print(sk_minkowski_manh)<br><br>print(sk_minkowski_euc)</pre><p><strong>Output</strong></p><p>[[0. 9. 8.]<br> [9. 0. 3.]<br> [8. 3. 0.]]</p><p>[[0. 5.19615242 5.09901951]<br> [5.19615242 0. 2.23606798]<br> [5.09901951 2.23606798 0. ]]</p><pre>A = np.array([2,7,6]).reshape(1,-1)<br>B = np.array([5,7,8]).reshape(1,-1)<br><br><br>def minkowski(A,B,P):<br>    return np.power(np.sum(np.abs(A-B)**P),1/P)<br><br>np_minikowski_manh = minkowski(A,B,1) <br><br><br>np_minikowski_euc = minkowski(A,B,2) <br><br>print(np_minikowski_manh)<br><br>print(round(np_minikowski_euc,3))</pre><p><strong>Output</strong></p><p>5.0<br>3.606</p><p><strong>5 : Hamming Distance :</strong></p><p>It is used to measure the distance between two strings of equal length by counting the number of positions the strings differ.</p><pre>from sklearn.metrics import pairwise_distances<br>import numpy as np<br><br># Define the array<br>Array = np.array([[1, 2, 3],<br>                  [4, 5, 6],<br>                  [2, 5, 7]])<br><br># Compute pairwise Hamming distance<br>hamming_distance_sklearn = pairwise_distances(Array, metric=&#39;hamming&#39;)<br><br>print(&quot;Hamming Distance Matrix using sklearn:\n&quot;, hamming_distance_sklearn)<br><br><br><br>A = np.array([1,4,5]).reshape(1,-1)<br>B = np.array([1,2,5]).reshape(1,-1)<br><br>hamming_distance_sklearn = 
pairwise_distances(A,B, metric=&#39;hamming&#39;)<br><br>print(&quot;Hamming Distance Matrix using np:\n&quot;, round(hamming_distance_sklearn[0][0],3))</pre><p><strong>Output</strong></p><p>Hamming Distance Matrix using sklearn:<br> [[0. 1. 1. ]<br> [1. 0. 0.66666667]<br> [1. 0.66666667 0. ]]</p><p>Hamming Distance Matrix using np:<br> 0.333</p><p>Note that sklearn reports the Hamming distance as the fraction of positions that differ rather than the raw count, so 0.333 here means that 1 of the 3 positions differs.</p><hr><p>These are the most important metrics.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Clustering in Machine Learning]]></title>
            <link>https://medium.com/@samarrathore482/clustering-in-machine-learning-23a58d11b9aa?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/23a58d11b9aa</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[cluster]]></category>
            <category><![CDATA[clustering-algorithm]]></category>
            <category><![CDATA[deep-learning]]></category>
            <category><![CDATA[clustering]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Fri, 04 Oct 2024 05:54:05 GMT</pubDate>
            <atom:updated>2024-10-09T04:10:14.144Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, here we are going to cover one of the most important topics in Machine Learning: <strong>Clustering</strong>.</p><p><strong>Let&#39;s start with the question of what clustering is and why we need it.</strong></p><p>In layman&#39;s terms, clustering is the process of grouping data points together into clusters on the basis of one or more common features.</p><p><strong>Definition :</strong> Clustering is an unsupervised machine learning technique that is used to form groups of homogeneous data from a dataset of heterogeneous data. This approach is different from Regression and Classification algorithms because here we are not trying to predict an output value based on input values; we are forming groups based on similarity. We use different distance metrics like Euclidean distance, Cosine similarity, Manhattan distance, etc. for calculating the similarity between different data points. If you want to read more about similarity metrics, you can <a href="https://medium.com/@samarrathore482/similarity-metrics-6950eded6116">click here</a>.</p><p>We can use clustering in different tasks like Customer Segmentation, Social Media Analysis, Recommendation Engines, Market Analysis etc.</p><p><strong>Now let&#39;s check the different types of Clustering and the important methods in each.</strong></p><p>First, we will import the necessary libraries and make a dummy dataset.</p><pre>from sklearn import datasets<br>import matplotlib.pyplot as plt<br><br>#Dataset<br>sample = datasets.make_circles(n_samples=900,noise=0.9,random_state=80,shuffle=False)<br>X = sample[0]<br>Y = sample[1]<br><br>#Necessary imports<br>from sklearn.cluster import MeanShift, OPTICS, DBSCAN, Birch, KMeans, AgglomerativeClustering<br>from sklearn.mixture import GaussianMixture<br><br>#Colour palette for the plots below (assumed; the original post does not show how colors was defined)<br>colors = plt.cm.tab10.colors</pre><p><strong>1 : Partition Based Clustering :</strong></p><p>In this type of clustering we pass the number of clusters we want, and it uses different similarity metrics like Euclidean distance, 
Manhattan distance, Cosine Similarity etc. for dividing the data into different clusters.</p><p><strong>Methods in Partition based Clustering</strong></p><p><strong>A : K-Means :</strong></p><p>This method works by measuring the similarity between the data points. We pass k, the number of clusters we want, and the algorithm assigns each data point to a cluster based on its distance from the cluster centroid. <strong>This algorithm is sensitive to outliers.</strong></p><p>For finding the optimum number of clusters, we can use the elbow method. This method works by calculating the WCSS (Within-Cluster Sum of Squares), that is, the sum of squared distances of the data points from their cluster centroid, for different numbers of clusters, plotting it on a line graph and finding the elbow point of the graph.</p><p><strong>Code here</strong></p><pre>wss = []<br>for i in range(1,40):<br>    km = KMeans(n_clusters=i,max_iter=5)<br>    km.fit(X)<br>    wss.append(km.inertia_)<br><br>plt.figure(figsize=(12,4))<br>#plot against the actual cluster counts so the x-axis starts at 1<br>plt.plot(range(1,40),wss)<br>plt.xlabel(&#39;No of clusters&#39;)<br>plt.ylabel(&#39;WCSS&#39;)<br>plt.title(&#39;Elbow Plot&#39;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*X4n4UDrtZO1D6jKbzccNHw.png" /><figcaption>WCSS Plot</figcaption></figure><p>For this example, we will choose 10 as the number of clusters to avoid complexity.</p><pre>kme = KMeans(n_clusters=10)<br>y = kme.fit_predict(X)<br><br>plt.figure(figsize=(15,6))<br>for i in range(0,len(set(y))):<br>    plt.scatter(X[y==i,0],X[y==i,1],color=colors[i],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;KMeans&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*r5O86VYIF7hW9Il0_wuZ_w.png" /><figcaption>K-Means</figcaption></figure><p>Another important algorithm you can check:</p><p><strong>B : 
K-Medoids</strong></p><p><strong>2 : Density Based Clustering Algorithm:</strong></p><p>This method works by finding the high-density regions in the data and grouping the data points in each region into a cluster. We don’t need to provide the number of clusters. The algorithm decides it on its own.</p><p><strong>Methods in Density Based Clustering</strong></p><p><strong>A : DBSCAN :</strong></p><p>It stands for Density-Based Spatial Clustering of Applications with Noise. It is based on the principle that clusters are dense regions in a dataset separated by sparse regions; because of that, clusters need not be of similar size. Furthermore, it works well on datasets that contain noise. <strong>Epsilon (eps)</strong> and <strong>min_samples </strong>are the two important parameters in this algorithm.</p><p><strong>Code here</strong></p><pre>dbscan = DBSCAN(eps=0.3,metric=&#39;manhattan&#39;,min_samples=20)<br>y = dbscan.fit_predict(X)<br><br>plt.figure(figsize=(14,6))<br>#label -1 marks the points DBSCAN treats as noise<br>for i in sorted(set(y)):<br>    plt.scatter(X[y==i,0],X[y==i,1],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;DBSCAN&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*AH5XnsbBR2ONQPj-mFZU8A.png" /><figcaption>DBSCAN</figcaption></figure><p><strong>B : Mean Shift :</strong></p><p>This is a non-parametric algorithm. We don’t need to pass the number of clusters we want; it decides on its own. It is also known as a mode-seeking algorithm. It works by iteratively shifting the mean of each cluster towards the densest area of the cluster. 
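</p><p>The shifting step can be sketched in a few lines of NumPy. This is only an illustration of the idea, not how scikit-learn implements MeanShift: the helper shift_point, the toy data, the Gaussian weighting and the bandwidth of 0.5 are all assumptions made for this sketch.</p><pre>
```python
import numpy as np

def shift_point(point, data, bandwidth):
    # Hypothetical helper: one mean-shift update for a single point.
    # Gaussian weights make nearby samples count more than distant ones.
    sq_dist = np.sum((data - point) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2 * bandwidth ** 2))
    # The new position is the weighted mean of all samples.
    return weights @ data / weights.sum()

# Toy data (assumed): one dense blob centred near (3, 3).
rng = np.random.default_rng(0)
data = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(200, 2))

# Start away from the blob and repeat the update a fixed number of times.
point = np.array([2.0, 2.0])
for _ in range(50):
    point = shift_point(point, data, bandwidth=0.5)

print(point.round(2))
```
</pre><p>Each update moves the point to a weighted mean of the samples around it, so it drifts towards the dense part of the toy data.</p><p>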
For deciding the densest area of the clusters, it uses a kernel function.</p><p><strong>Code here</strong></p><pre>Ms = MeanShift(bin_seeding=True,bandwidth=0.8)<br>y = Ms.fit_predict(X)<br><br>plt.figure(figsize=(14,6))<br>for i in range(0,len(set(y))):<br>    plt.scatter(X[y==i,0],X[y==i,1],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;Mean Shift&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*BOoKSHZo_A8PJzFtH3eLGA.png" /><figcaption>Mean Shift</figcaption></figure><p>Other important Algorithm you can check</p><p><strong>C : OPTICS</strong></p><pre>opt = OPTICS(min_samples=20,metric=&quot;euclidean&quot;,xi=0.02,eps=0.5)<br>y = opt.fit_predict(X)<br><br>plt.figure(figsize=(14,6))<br>for i in range(-1,len(set(y))):<br>    plt.scatter(X[y==i,0],X[y==i,1],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;OPTICS&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gCyZ4_yPybH8c0dd1Zf0QQ.png" /><figcaption>Optics</figcaption></figure><p><strong>3 : Connectivity Based Clustering (Hierarchical Clustering ):</strong></p><p>This method forms a tree like structure. Each data point is assumed as a separate cluster on the X axis, which is then joined together based on the similarity with other clusters. 
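</p><p>The merge order can be inspected with SciPy, which the examples in this post do not otherwise use. The sketch below is only an illustration: the four toy points and the choice of single linkage are hypothetical and serve to show how clusters are joined step by step.</p><pre>
```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data (assumed): two tight pairs of points that lie far apart.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# Each row of the linkage matrix records one merge:
# [cluster_a, cluster_b, merge_distance, size_of_new_cluster].
merges = linkage(points, method="single", metric="euclidean")
print(merges)

# Cutting the tree into two flat clusters recovers the two pairs.
labels = fcluster(merges, t=2, criterion="maxclust")
print(labels)
```
</pre><p>The first two merges join the nearby pairs, and the last merge joins the two pairs into one cluster at the top of the tree.</p><p>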
The resulting tree is called a dendrogram.</p><p><strong>A : Agglomerative Hierarchical Clustering</strong>:</p><p>It is a bottom-up approach: each data point starts as an individual cluster, at each level the most similar clusters are joined, and finally a single cluster is formed at the top.</p><p><strong>Code</strong></p><pre>gm = AgglomerativeClustering(n_clusters=4)<br>y = gm.fit_predict(X)<br><br>plt.figure(figsize=(12,5))<br>for i in range(0,len(set(y))):<br>    plt.scatter(X[y==i,0],X[y==i,1],color=colors[i],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;Agglomerative Clustering&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Tl6-Hfp86_7HtMognT49Pw.png" /><figcaption>Agg example</figcaption></figure><p><strong>B : Divisive Hierarchical Clustering :</strong></p><p>This method is the opposite of Agglomerative clustering. You can think of it as a top-down approach: it starts with a single cluster and recursively splits it based on dissimilarity.</p><p>Another important algorithm you can check:</p><p><strong>C : BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)</strong></p><pre>bir = Birch(threshold=0.4, branching_factor=50,n_clusters=5)<br>y = bir.fit_predict(X)<br><br>plt.figure(figsize=(12,5))<br>for i in range(0,len(set(y))):<br>    plt.scatter(X[y==i,0],X[y==i,1],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;Birch&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lT3_YI5Q6dxSAh5KIs43uw.png" /><figcaption>Birch</figcaption></figure><p><strong>4 : Distribution Based Clustering :</strong></p><p>This clustering approach takes a totally different measure into consideration: probability. 
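</p><p>As a rough illustration of clustering by probability, the sketch below fits a two-component Gaussian mixture to toy data and prints the membership probabilities of three query points. The data, the query points and the variable names are assumptions made for this example.</p><pre>
```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data (assumed): two well-separated blobs in 2D.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(100, 2))
data = np.vstack([blob_a, blob_b])

gmm = GaussianMixture(n_components=2, random_state=0).fit(data)

# predict_proba returns one probability per component for each point.
queries = np.array([[0.0, 0.0], [5.0, 5.0], [2.5, 2.5]])
probs = gmm.predict_proba(queries)
print(probs.round(3))
```
</pre><p>Each row of the result holds one probability per component and sums to 1, so a point is assigned softly rather than to exactly one cluster.</p><p>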
This approach considers the probability of a data point belonging to a probability distribution. The greater the distance of a data point from the center of a cluster, the lower the chance that the data point belongs to that cluster.</p><p><strong>A : Gaussian Mixture Models (GMM) :</strong></p><p>This clustering method assumes the data is generated from a mixture of Gaussian distributions. The probability of a data point belonging to a cluster depends on its distance from the center of the cluster: the greater the distance, the lower the chance. It is a statistical inference clustering technique.</p><pre>gm = GaussianMixture(n_components=4, covariance_type=&#39;diag&#39;, random_state=42)<br>y = gm.fit_predict(X)<br><br>plt.figure(figsize=(12,5))<br>for i in range(0,len(set(y))):<br>    plt.scatter(X[y==i,0],X[y==i,1],label=f&#39;Cluster {i}&#39;)<br>    <br>plt.title(&#39;Gaussian Mixture&#39;)<br>plt.xlabel(&#39;X-axis&#39;)<br>plt.ylabel(&#39;Y-axis&#39;)<br>plt.legend(loc=&quot;upper right&quot;)<br>plt.show()</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*u5oVMGuQz2Gaj8-mtql4Yw.png" /><figcaption>Gauss</figcaption></figure><p>So we have covered the important clustering methods with examples.</p><p>Hope you have read and liked it.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Leaving Behind Yourself]]></title>
            <link>https://medium.com/@samarrathore482/leaving-behind-yourself-43974508c6bd?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/43974508c6bd</guid>
            <category><![CDATA[life]]></category>
            <category><![CDATA[growth]]></category>
            <category><![CDATA[mindset]]></category>
            <category><![CDATA[struggle]]></category>
            <category><![CDATA[change]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Sun, 29 Sep 2024 04:11:23 GMT</pubDate>
            <atom:updated>2024-10-08T07:30:15.860Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, recently I came across a question: <em>“</em><strong><em>What are you leaving behind for the purpose of growth? Is it worth it, or is staying where you are the better option?”</em></strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*0hqg7yVHkE9aKN3hEM1d-g.jpeg" /></figure><p>When you think about those situations where you have the opportunity to move to a new city or country, there is always a question in your mind: is it worth it? Am I doing something wrong? Should I stay here with my family and friends and try to do things within the town where I grew up and which I know everything about? Will I be able to survive in the new environment?</p><p>The other side is that if you don’t move to a different town or country, you won’t be able to gain much-needed experience in your life. Being on the road, always moving, makes you feel you are learning something, making new connections and growing. It helps you deal with your weaknesses; it makes you stronger, tougher and smarter. You learn to live by yourself, and you get the confidence you need.</p><p>But leaving everything behind is more of an emotional and mental journey, because you are not just leaving a city or a place. You are leaving behind the memories you have created over the years, and you are leaving behind your loved ones. You will be alone in a city of a million people. Even when you feel like talking to someone, you won’t have anyone close to you. There might be times when you sit alone in your room for days, asking yourself what you are doing.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*fhBEKWtzbGIAAq4L5XGulA.jpeg" /></figure><p>This choice seems easy for someone who is young. 
Hot blood full of passion and energy is moving through your veins, and you decide to move out. You want to experience the freedom of living alone. You want to have a life where no one is controlling you or affecting your decisions. But with this freedom come responsibilities. It is not easy to live alone; you have to pay the price, and you have to learn and adapt faster, because in every new city you go to, you will find people who are ready to rob you or take advantage of you.</p><p>But when you grow older and start to understand life and how things work, you get to know the importance of family and friends, because these are not just titles we give to anyone. These are the people who understand us and will be there for us when we need them in a critical situation.</p><p>We can’t say one choice is better than the other, because if you don&#39;t move to a new place, leave your past behind and take on new challenges, you won’t be able to learn new things. Taking on new challenges in life is important. It will grow your understanding of life and make you stronger and smarter. You also learn how to build relationships.</p><p>As you grow older, you will understand that taking risks is fun, but having a stable and peaceful life is more important, because whatever we are doing in life, the end goal of everyone is to have a fulfilled life, and this is only achieved when we have calmness in our life.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*UtYBS8ikED-BhyKEHtPUmA.jpeg" /></figure><p>This cycle of leaving yourself behind is like the cycle of destruction and regeneration. We only get to learn what we want when we have nothing in life, and leaving yourself behind is part of that search for self. 
So you have to decide when you are ready to take risks in your life and when you are ready to choose a calm path and enjoy everything around you.</p><p><strong><em>Love whatever you have because once it is gone you won’t get it again.</em></strong></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[NLP Vectorization and Embedding story]]></title>
            <link>https://medium.com/@samarrathore482/nlp-vectorization-and-embedding-story-4645c2b09424?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/4645c2b09424</guid>
            <category><![CDATA[vectorization]]></category>
            <category><![CDATA[nlp]]></category>
            <category><![CDATA[nlp-preprocessing]]></category>
            <category><![CDATA[embedding]]></category>
            <category><![CDATA[text-preprocessing]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Wed, 25 Sep 2024 05:32:51 GMT</pubDate>
            <atom:updated>2024-10-02T03:24:51.340Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, we are going to cover one of the most important topics in NLP: Vectorization, or as you can also say, Word Embedding. These methods serve the same purpose, but the techniques they use are different. We are going to cover the different techniques and show code snippets on how to implement them.</p><p>First, let&#39;s start with understanding what Vectorization and Embedding are.</p><p>Vectorization and Embedding are two sides of the same coin. In Vectorization, we try to represent words with numbers. We can consider Vectorization as the statistical methods for converting words into numbers; it involves simple and easy-to-understand techniques like Bag of Words, TF-IDF etc.</p><p>Embedding is a more advanced approach: we try to convert each word into a dense vector that represents its meaning, so that words that are similar to each other are closer in vector space. To get embeddings, we can either use pretrained embeddings (for example, Word2Vec vectors trained with CBOW or Skip-gram) 
or we can train our own custom embeddings.</p><p>Now we are going to check the code for the different techniques that we can use.</p><p>The dataset for the examples below will be</p><pre>import pandas as pd<br><br># Define some categories<br>categories = [&#39;Technology&#39;, &#39;Science&#39;, &#39;Business&#39;, &#39;Health&#39;, &#39;Education&#39;]<br><br># Create 8 short sentences<br>sentences = [<br>    &quot;Artificial intelligence is transforming industries by automating complex processes.&quot;,<br>    &quot;The study of quantum mechanics continues to challenge our understanding of the physical world.&quot;,<br>    &quot;Startups are reshaping the financial industry with innovative fintech solutions.&quot;,<br>    &quot;Nutrition and exercise are key components of maintaining good health and well-being.&quot;,<br>    &quot;Online learning platforms are revolutionizing education, making knowledge accessible to all.&quot;,<br>    &quot;5G technology is expected to significantly enhance mobile network speeds and connectivity.&quot;,<br>    &quot;Advances in genetic engineering could lead to breakthrough treatments for inherited diseases.&quot;,<br>    &quot;E-commerce is growing rapidly as more consumers shop online for convenience.&quot; <br>]<br><br># Assign categories to the sentences in a repeating cycle<br>data = {&#39;Sentence&#39;: sentences, &#39;Category&#39;: [categories[i % len(categories)] for i in range(8)]}<br><br># Create DataFrame<br>df = pd.DataFrame(data)<br>df.head()  # Show first few rows of the DataFrame</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/794/1*o8ZtG1LBmVRt2SISY7PERg.png" /><figcaption>Original DataFrame</figcaption></figure><p>Below are the necessary libraries you have to import</p><pre>from sklearn.preprocessing import LabelEncoder,OneHotEncoder<br>from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer <br>from sklearn.feature_extraction import DictVectorizer<br>from transformers import 
BertTokenizer, BertModel<br>import numpy as np<br>import gensim<br>from gensim.models import Word2Vec<br>import torch<br>from nltk.tokenize import word_tokenize<br>import tensorflow<br>from tensorflow import keras<br>from tensorflow.keras.preprocessing.text import Tokenizer<br>from keras.utils import pad_sequences<br>from keras.models import Sequential<br>from keras.layers import Dense, SimpleRNN , Embedding</pre><p>First, we will do the Label Encoding on the Category column and tokenize each sentence.</p><pre>lb = LabelEncoder()<br><br>df[&#39;Category&#39;]= lb.fit_transform(df[&#39;Category&#39;])<br>df[&#39;original_name&#39;] = lb.inverse_transform(df[&#39;Category&#39;])<br>df[&#39;Token&#39;] = df[&#39;Sentence&#39;].apply(word_tokenize)<br>df.head(5)</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Bcx_EL5MwjuiltOrRlUXZQ.png" /><figcaption>First Modification</figcaption></figure><p><strong>1 : CountVectorizer :</strong> In Count Vectorization, first we build a vocabulary of all the words that appear in the documents. After that, each sentence is converted into a vector that counts how many times each vocabulary word appears in it.</p><pre>cv = CountVectorizer()<br>dcv = cv.fit_transform(df[&#39;Sentence&#39;])<br>#print(cv.vocabulary_)<br>new = pd.DataFrame(dcv.toarray(),columns= cv.get_feature_names_out())<br>final = pd.concat([new,df],axis=1)<br>final</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qqh1iZcwg4XJuAfyKFin2g.png" /><figcaption>Count Vectorize</figcaption></figure><p><strong>2 : Ngram :</strong> You can think of this method as an extended version of Count Vectorization. In this method we have the flexibility to take either individual words as tokens or sequences of 1 to n consecutive words.</p><pre>bgw = CountVectorizer(ngram_range= (1,3))<br>bgwv = bgw.fit_transform(df[&#39;Sentence&#39;])<br>#print(bgw.vocabulary_)<br>new = pd.DataFrame(bgwv.toarray(),columns= bgw.get_feature_names_out())<br>final = pd.concat([new,df],axis=1)<br>final</pre><figure><img alt="" 
src="https://cdn-images-1.medium.com/max/1024/1*Xrv2EnuAsEHykERWVJXriQ.png" /><figcaption>Ngrams</figcaption></figure><p><strong>3 : TF IDF :</strong> In this approach, we weigh each word by how often it appears in a document (term frequency) against how common it is across the whole corpus (inverse document frequency). That weight becomes its numerical representation.</p><pre>tf = TfidfVectorizer()<br>tfv = tf.fit_transform(df[&#39;Sentence&#39;])<br>#print(tf.vocabulary_)<br>new = pd.DataFrame(tfv.toarray(),columns= tf.get_feature_names_out())<br>final = pd.concat([new,df],axis=1)<br>final</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*fShW0ta8ENPyXV1ENNyvkA.png" /><figcaption>TF IDF</figcaption></figure><p><strong>4 : One Hot Encoding :</strong> This method is used to transform categorical data into numerical data. It creates binary columns (or vectors) for each unique category, with a 1 indicating the presence of a category and 0 for all other categories.</p><pre>ohe = OneHotEncoder(handle_unknown=&#39;ignore&#39;)<br>ohetok = ohe.fit_transform(np.array(df[&#39;Sentence&#39;]).reshape(-1,1))<br>new = pd.DataFrame(ohetok.toarray())<br>final = pd.concat([new,df],axis=1)<br>final</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*r87IlzPXTb38A5QG1oWTlw.png" /><figcaption>OHE</figcaption></figure><p>Now that we have covered the important methods in Vectorization, let&#39;s look at the methods for embedding.</p><p><strong>5 : CBOW (Continuous Bag Of Words) :</strong> The underlying process of this method is that we feed a neural network the context words that surround a position, and it tries to predict the word at that position.</p><pre>sentence = df[&#39;Token&#39;].tolist()<br><br>#min_count=1 keeps every word; the default of 5 would drop almost all words in this tiny dataset<br>cbow = Word2Vec(vector_size=50,sg=0,min_count=1)<br>#use the tokenized sentences, not the raw strings<br>cbow.build_vocab(sentence)<br>cbow.train(sentence,epochs=5,total_examples=cbow.corpus_count)<br><br>def getembed(text_embd):<br>    <br>    word_embd = [ word for word in text_embd 
if word in cbow.wv.index_to_key]<br>    <br>    if len(word_embd) &gt; 0:<br>        return cbow.wv[word_embd].mean(axis=0)<br>    else:<br>        return None<br><br>df[&#39;cbow_emb&#39;] = df[&#39;Token&#39;].apply(getembed)</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3QCiNHuekDx8LIbnzW46aQ.png" /><figcaption>CBOW</figcaption></figure><p><strong>6 : Skip gram :</strong> In this approach, we try to predict the surrounding context words from a given input word. We can think of this approach as the opposite of CBOW.</p><pre>sentence = df[&#39;Token&#39;].tolist()<br><br>#create the model without a corpus so that the vocabulary is built and the model is trained only once, below<br>skp = Word2Vec(vector_size=60,sg=1,min_count=1)<br>skp.build_vocab(sentence)<br>skp.train(sentence,epochs=5,total_examples=skp.corpus_count)<br><br>def skg_emb(tok):<br>    skg_emb = [word for word in tok if word in skp.wv.index_to_key]<br>    <br>    if len(skg_emb)&gt;0:<br>        return skp.wv[skg_emb].mean(axis=0)<br>    else:<br>        return None<br><br>df[&#39;skg_emb&#39;] = df[&#39;Token&#39;].apply(skg_emb)</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1020/1*1Ra86AEjXaCRMH9gVy0HUA.png" /><figcaption>skg</figcaption></figure><p>With these methods, we can either train our own model or use a pretrained model.</p><pre>import gensim.downloader as api<br><br>model = api.load(&#39;word2vec-google-news-300&#39;) <br><br>def pre_emb(tok):<br>    pre_emb = [word for word in tok if word in model.index_to_key]<br>    <br>    if len(pre_emb)&gt;0:<br>        return model[pre_emb].mean(axis=0)<br>    else:<br>        return None<br><br>df[&#39;Pre_emb&#39;] = df[&#39;Token&#39;].apply(pre_emb)</pre><p><strong>7 : BERT (Bidirectional Encoder Representations from Transformers) :</strong> This is a Transformer-based model that uses the encoder part of the Transformer. It was introduced by Google.</p><pre>tokenizer = BertTokenizer.from_pretrained(&#39;bert-base-uncased&#39;)<br>model = 
BertModel.from_pretrained(&#39;bert-base-uncased&#39;)<br><br>def bert_embd(text):<br>    token = tokenizer(text,return_tensors=&quot;pt&quot;,truncation=True,padding=True)<br>    output = model(**token)<br>    <br>    embed = torch.mean(output.last_hidden_state, dim=1)<br>    return embed.detach().numpy().flatten()<br><br>df[&#39;bert_embed&#39;] = df[&#39;Sentence&#39;].apply(bert_embd)</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*0EMTn4g7M6DE1fnLoM9OkQ.png" /><figcaption>BERT</figcaption></figure><p><strong>We can also use Deep Learning library like TensorFlow and Keras for it.</strong></p><pre>token = Tokenizer(oov_token=&#39;&lt;nothing&gt;&#39;)<br><br>token.fit_on_texts(df[&#39;Sentence&#39;])<br><br>seq = token.texts_to_sequences(df[&#39;Sentence&#39;])<br><br>output = pad_sequences(seq,maxlen=10,padding=&#39;post&#39;)<br><br>len(token.word_counts)<br><br>x_reshaped = output.reshape((output.shape[0], output.shape[1])) <br><br>encoder = OneHotEncoder(sparse=False)<br>y_encoded = encoder.fit_transform(df[[&#39;Category&#39;]]) <br><br><br>#Model 1 <br><br>model = Sequential()<br>model.add(SimpleRNN(32,activation=&#39;relu&#39;,input_shape=(10,1)))<br>model.add(Dense(5,activation=&#39;relu&#39;))<br>model.add(Dense(y_encoded.shape[1], activation=&#39;softmax&#39;)) <br>model.compile(loss=&#39;categorical_crossentropy&#39;,optimizer=&#39;adam&#39;)<br>model.fit(x_reshaped,y_encoded, epochs=10, batch_size=1)<br>model.summary()<br><br><br><br>#Model 2<br><br><br>model = Sequential()<br>model.add(Embedding(input_dim=78, output_dim=32, input_length=10)) <br>model.add(SimpleRNN(16,activation=&#39;relu&#39;))<br>model.add(Dense(5,activation=&#39;relu&#39;))<br>model.add(Dense(y_encoded.shape[1], activation=&#39;softmax&#39;)) <br>model.compile(loss=&#39;categorical_crossentropy&#39;,optimizer=&#39;adam&#39;)<br>model.fit(x_reshaped,y_encoded, epochs=10, batch_size=1)<br>model.summary()</pre><p>We have covered all the important techniques. 
Thanks for reading</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[10 Ways To Get out of Doom Scrolling]]></title>
            <link>https://medium.com/@samarrathore482/get-out-of-doom-scrolling-2aed2ca52a1b?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/2aed2ca52a1b</guid>
            <category><![CDATA[addiction]]></category>
            <category><![CDATA[doomscrolling]]></category>
            <category><![CDATA[social-media]]></category>
            <category><![CDATA[scrolling]]></category>
            <category><![CDATA[change]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Sun, 22 Sep 2024 03:34:11 GMT</pubDate>
            <atom:updated>2024-09-25T05:45:21.304Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone, you’ve probably heard of or experienced <strong>doom-scrolling</strong>. It’s one of those “gifts” of our interconnected world that we didn’t ask for, yet somehow got stuck with. Today, I’m going to share some personal experiences and methods that have helped me break free from doom-scrolling.</p><h3>What is Doom Scrolling and Its Effects on Your Mind and Health?</h3><p>Doom-scrolling is the <strong>endless scrolling</strong> through social media apps like Facebook, Instagram, Twitter, Snapchat etc. It affects both your brain and overall physical health because, when you’re addicted to these apps, you scroll endlessly <strong>without any real purpose</strong>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/495/1*HjORikUTxaizm6_iKwJyRg.png" /><figcaption>Doom eye</figcaption></figure><p>If you spend too much time doom-scrolling, it can have a <strong>negative impact</strong> on your emotional and physical health. You start feeling <strong>disconnected from reality</strong> and may use the virtual world as an escape from your real-life problems. This not only dulls your sensitivity toward real-life social interactions but can also leave you feeling like you’re not achieving anything in your life.</p><p>There might be days when you think, <em>“What am I doing with my life? What have I accomplished? Am I worth anything?”</em> You may find yourself <strong>lying in bed</strong> for hours, endlessly scrolling through your social media feed, while life passes you by. 
Over time, this has a <strong>serious negative impact</strong> on your mental health, making you feel <strong>worthless</strong>, and it also harms your physical health by promoting a sedentary lifestyle.</p><p>If this habit continues unchecked, it can trigger a <strong>chain reaction</strong> of negativity, further isolating you from the real world, <strong>cutting you off from meaningful social interactions</strong>, and even causing you to withdraw from society altogether.</p><h3>10 Methods to Stop Doom Scrolling:</h3><p>Here are some <strong>effective hacks</strong> that helped me break free from the cycle of doom-scrolling:</p><ol><li><strong>No-Touch Morning:</strong><br>The rule here is simple: When you wake up in the morning, <strong>don’t touch your phone</strong> for the first two hours. Instead, engage in activities like going for a jog or walk, reading a book, doing yoga, or exercising. This helps you start the day on a <strong>positive note</strong>, making you feel energized and motivated. The key is to avoid lounging afterward — don’t sink into your couch or bed and start scrolling again. To prevent this, move on to the next hack.</li><li><strong>Create a Checklist:</strong><br>Once your morning routine is done, create a <strong>daily checklist</strong> of activities. The tasks don’t have to be completed in a strict order, but you should aim to finish them all by the end of the day. Your list can include anything — reading a chapter of a book, meeting a friend, cooking a new dish, working on a hobby, exercising, or engaging in sports. This keeps your day <strong>structured</strong> and gives you a sense of accomplishment, reducing your urge to mindlessly scroll.</li><li><strong>Monitor Your Phone Usage:</strong><br>Use screen time monitoring apps to track how much time you’re spending on social media. If you go over a certain limit, these apps will <strong>block access</strong> to those platforms for the rest of the day. 
This simple strategy can help you become more aware of your scrolling habits and set <strong>boundaries</strong> for healthier usage.</li><li><strong>Self-Monitoring:</strong><br>This is about consciously observing the <strong>decisions you’re making</strong> throughout the day. Pay attention to the <strong>triggers</strong> that lead you to pick up your phone and start scrolling. For example, I noticed that on days when I start my morning with doom scrolling, I usually end up spending <strong>most of the day</strong> on my phone. By recognizing your own triggers, you can take steps to avoid them.</li><li><strong>Social Media Detox:</strong><br>If none of the above works, consider going for a full <strong>social media detox</strong>. Remove all social media apps from your phone, at least temporarily. This is a <strong>drastic measure</strong>, and it might make you feel anxious or experience cravings for social media, but if you can hold out for a few days, it can help you break the habit. However, this may not be for everyone, especially if your work depends on social media.</li><li><strong>Adopt a New Hobby:</strong><br>Learning something new can help <strong>distract you</strong> from the urge to scroll and provide a healthy outlet for your energy. Whether it’s painting, writing, playing sports, hiking, or learning a musical instrument, hobbies can help you <strong>reconnect with the real world</strong> and give you a sense of fulfillment.</li><li><strong>Set Boundaries for Social Media:</strong><br>You can establish a rule of using social media only at specific times of the day, say for <strong>30 minutes</strong> in the evening. This way, you don’t completely cut it out but limit your exposure. 
Turn off <strong>notifications</strong> to avoid being constantly pulled back in.</li><li><strong>Engage in Real-Life Social Interactions:</strong><br>Make an effort to <strong>spend time with people</strong> in person — catch up with friends, go for a walk with family, or participate in group activities. Real-life social connections help break the habit of constantly seeking validation and interaction online.</li><li><strong>Use the Pomodoro Technique:</strong><br>Another helpful method is the <strong>Pomodoro technique</strong>, where you focus on a task for 25 minutes, then take a 5-minute break. This structure helps you stay productive and minimizes the time spent on distractions like social media.</li><li><strong>Practice Mindfulness and Meditation:</strong><br>Sometimes, the urge to doom scroll comes from a desire to escape feelings of stress or anxiety. Practicing <strong>mindfulness</strong> or meditation can help you become more present and aware of your surroundings, reducing the compulsion to check your phone out of boredom or anxiety.</li></ol><p>These are the strategies that have been most effective for me in combating doom-scrolling. If you’re looking to break this habit, start small, and be patient with yourself. Let’s put our phones down, look out at the sky, and reconnect with the world around us!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=2aed2ca52a1b" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Creating First Amazon EC2 instance]]></title>
            <link>https://medium.com/@samarrathore482/creating-first-amazon-ec2-instance-19b886066071?source=rss-b94ca56679f2------2</link>
            <guid isPermaLink="false">https://medium.com/p/19b886066071</guid>
            <category><![CDATA[aws-ec2]]></category>
            <category><![CDATA[cloud]]></category>
            <category><![CDATA[cloud-services]]></category>
            <category><![CDATA[cloud-computing]]></category>
            <category><![CDATA[aws]]></category>
            <dc:creator><![CDATA[Kamal Singh Rathore]]></dc:creator>
            <pubDate>Thu, 19 Sep 2024 02:46:50 GMT</pubDate>
            <atom:updated>2024-11-18T15:01:14.235Z</atom:updated>
            <content:encoded><![CDATA[<p>Hi everyone. In this article we will cover Amazon Web Services (AWS) and Amazon EC2, and I will give a step-by-step guide to creating an Amazon EC2 instance.</p><p>Let&#39;s start with the question: what are the cloud and cloud computing?</p><p>The cloud is a global network of interconnected servers that provides seamless services to users. It is not a single physical entity; it is a vast collection of remote servers around the globe that are linked together and operate as one ecosystem.</p><p>Now that we have a clear picture of what the cloud is, let&#39;s look at cloud computing.</p><p>Cloud computing is the on-demand delivery of computing services to users through cloud platforms. These platforms offer a variety of services, including storage, computing power, servers, networking, and analytics. The most widely used cloud platforms at present are AWS, Microsoft Azure, and Google Cloud Platform (GCP).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*RYOt9jLcwkWANpF7C3WU6g.png" /><figcaption>Photo by spiceworks.com</figcaption></figure><h4><strong>Now the question is: what is AWS, and what services does it offer?</strong></h4><p>Amazon Web Services (AWS) is an on-demand cloud services platform from Amazon. It follows a <strong>pay-as-you-go</strong> pricing model, where you pay only for the resources you use. We can customize the services according to our needs.</p><p>Some of the main services offered by AWS:</p><ul><li>1. Amazon EC2 (Elastic Compute Cloud)</li><li>2. Amazon RDS (Relational Database Service)</li><li>3. Amazon S3 (Simple Storage Service)</li><li>4. Amazon EBS (Elastic Block Store)</li><li>5. AWS Lambda</li><li>6. Amazon CloudFront</li><li>7. Amazon SNS (Simple Notification Service)</li><li>8. Amazon VPC (Virtual Private Cloud)</li><li>9. AWS Auto Scaling</li><li>10. AWS Elastic Beanstalk</li></ul><p>We have now gone through the basics. 
Let&#39;s understand EC2 and create an EC2 instance.</p><h4><strong>EC2 Intro</strong></h4><p>Amazon Elastic Compute Cloud (EC2) eliminates the need to invest in hardware up front. It provides scalable computing capacity, with a choice of processors, operating systems, storage, and networking, along with purchasing options that let you scale resources as your requirements change and manage your workload better.</p><p><strong>Let&#39;s start with the demo. We will use the free tier configuration.</strong></p><p>You will need an AWS account; if you do not have one, please create it first. Then search for EC2. This opens the EC2 Dashboard. Click Launch instance at the bottom to launch a new instance, or click Instances (running) to check the active instances.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jzCDvT0kis-LJx8XaACewg.png" /><figcaption>EC2 home page</figcaption></figure><p>This opens the EC2 instance launch page. Name your instance as shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8duRbjrfyPSmxFrCow0ipw.png" /><figcaption>Name your instance</figcaption></figure><p>Now choose your Amazon Machine Image (AMI). You can either go with a free tier AMI or browse and select another AMI according to your requirements.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/965/1*nDBWPR3ByI215BgGVZcsNg.png" /></figure><p>Next, select the instance type. You can choose a free tier type or a paid one according to your needs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/966/1*3yZcGfDykhP228X5dpnGRg.png" /><figcaption>Instance Type</figcaption></figure><p>Select an existing key pair or create a new one. The key pair is used to connect to the instance.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/966/1*v1Qh2m89gcoRm7QCbbkIYw.png" /><figcaption>Key pair option</figcaption></figure><p>Below is the prompt for creating a new key pair. 
Set a name for your key pair, choose the key file format, and then click Create key pair. This downloads a private key file that you will use to connect to the instance from your device.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/708/1*ada7tkSUFGGBc8TkrWrWzw.png" /></figure><p>After that, update the network settings. Choose the security group and the types of network traffic you want to allow. You can also choose the default IP or a custom IP.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/960/1*LE1qBWaXjwcbxsEdw0-5lQ.png" /><figcaption>Network settings</figcaption></figure><p>You are all set to launch the instance. Click the Launch instance button at the bottom right. Your instance will start running, and you can check it on the Instances page.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jGq2lftl4W-4Xgic_V5cZg.png" /><figcaption>Instances page</figcaption></figure><p>We are now ready to connect to the instance. Click the instance to open its details page. At the top right you will find the Connect option. Click it.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*M8bSmlGqL7DtN5TMSZ5Beg.png" /><figcaption>Connect option</figcaption></figure><p>From EC2 Instance Connect, you can connect directly to the instance with the Connect button at the bottom.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/863/1*NCrH-26J4KIMqR95fqrBCQ.png" /><figcaption>Connect page</figcaption></figure><p>If you want to connect from your own device, go to the SSH client option, where you will find a command like the one below.</p><p><strong>ssh -i "keytest.pem" ec2-user@ec2-xx-xxx-xx-xxx.us-west-2.compute.amazonaws.com</strong></p><p>Copy that command and open your command prompt. Go to the folder where you saved the downloaded key file and run the command.</p><p>We have created the instance and launched it. 
Now let&#39;s see how to delete (terminate) it.</p><p>Go to the Instances page and select the instance you want to terminate. See the image below for reference.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*kQINeKXJ1DG7GplAIjgePA.png" /></figure><p>Now you are all set to create your first EC2 instance.</p><p>All the best!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=19b886066071" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>