Photo by Markus Winkler on Unsplash

How do the Unsupervised Learning Algorithms work?

Unsupervised algorithms are regarded as self-learning algorithms that can explore a dataset and find previously unknown patterns in it. They are among the most widely used machine learning algorithms because they do not need a labeled dataset to operate. They are widely used to detect anomalies and defects in a dataset, since anomalies do not match the pattern followed by the rest of the data.
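As a rough illustration of how anomalies stand out from the rest of a distribution without any labels, here is a minimal sketch that flags values far from the mean. The z-score rule, the threshold of 2.0, and the sample readings are assumptions chosen for this example, not a method taken from the article.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds `threshold` standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates from the pattern.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings: six similar values and one that breaks the pattern.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
print(find_anomalies(readings))
```

No label tells the function which reading is faulty; the outlier is found only because it does not fit the distribution of the other values.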

Types of Unsupervised Learning

There are two main types of unsupervised learning algorithms:

  1. Association Algorithm
  2. Clustering Algorithm
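To make the clustering idea concrete, the sketch below groups one-dimensional points around two centroids using the standard k-means update loop. The function name, the starting centroids, and the sample points are assumptions made for illustration only.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each point joins the cluster of its nearest centroid.
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # A centroid with an empty cluster stays where it is.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

The algorithm is never told which points belong together; the two groups emerge from the distances alone.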


Photo by Marius Masalar on Unsplash

How do Decision Trees work?

Decision trees are among the most widely used machine learning algorithms. They are used for both classification and regression. They can handle both linear and non-linear data, but they are mostly used for non-linear data. As the name suggests, a decision tree works on a set of decisions derived from the data and its behavior. It does not use a linear classifier or regressor, so its performance does not depend on the data being linear. Boosting and bagging algorithms were developed as ensemble models that build on the basic principle of decision trees, with modifications that overcome some important drawbacks of decision trees and give better results. …
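A single decision of the kind a tree is built from can be sketched as a threshold test on one feature. The example below searches for the threshold that misclassifies the fewest points; this is a simplified "decision stump" with an error-count criterion, assumed here for illustration, whereas real decision trees typically use measures such as Gini impurity or entropy. The data is hypothetical.

```python
def best_split(xs, ys):
    """Find the threshold on a single feature that misclassifies the fewest points."""
    best = None
    for threshold in sorted(set(xs)):
        # Rule under test: predict class 1 when x > threshold, otherwise class 0.
        errors = sum((x > threshold) != bool(y) for x, y in zip(xs, ys))
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best

# Hypothetical data: hours studied and whether the student passed.
hours = [1, 2, 3, 4, 5, 6]
passed = [0, 0, 0, 1, 1, 1]
threshold, errors = best_split(hours, passed)
print(threshold, errors)
```

A full tree repeats this search recursively on each side of the chosen split, which is why it needs no linear model of the data.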


Photo by Campaign Creators on Unsplash

Part 2: Underlying Architecture of Distributed File Systems of MongoDB and HBase and Database Operations in Python

In this article, we will talk about the architecture on which today’s big data solutions are built, the distributed file systems, and see how they are actually implemented. In part 1 of this series, we talked about the basic database concepts and the installation procedures. Please feel free to check that out as a prerequisite.

Both MongoDB and HBase are distributed NoSQL databases, used extensively to handle large data problems. Let’s jump into the architectural and implementation details.

MongoDB

MongoDB is a document-based NoSQL database. It requires no fixed schema definition. MongoDB stores data as Binary JSON, or BSON. It supports horizontal scaling. …
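To show what "no fixed schema" means in practice, here is a small stand-in written with plain Python dictionaries and the standard `json` module. It is not the MongoDB driver API: the `users` list, the `find` helper, and the field names are all hypothetical, and real MongoDB stores documents as BSON rather than JSON text.

```python
import json

# A "collection" modeled as a plain list. Each document is a dictionary,
# and the two documents deliberately carry different fields.
users = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {"_id": 2, "name": "Ravi", "tags": ["admin", "beta"]},
]

def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

# Documents serialize to JSON-like text regardless of which fields they have.
print(json.dumps(find(users, name="Ravi")))
```

A relational table would force both records into the same columns; here each document simply carries the fields it needs.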


Photo by Campaign Creators on Unsplash

Getting Started

Part 1: Concepts and Installation: MySQL, HBase, MongoDB

In today’s world, we have to handle huge amounts of data and store it in a suitable way. Much of today’s data is generated in huge volumes every day by social media sites like Facebook and Twitter. Previously, we dealt mostly with structured data, i.e., data that fits in tabular structures, so we used MySQL in all cases. In the current scenario, data first needs to be stored in an unstructured form, and second, it arrives in huge volumes. So, it is impossible to store the whole contents of a database on one server, and multiple servers need to be accessed simultaneously. Current databases also need to be fault-tolerant, i.e., …


Photo by Markus Spiske on Unsplash

Computer Vision, Deep Learning

Can we detect Covid-19 using technology?

The novel Coronavirus pandemic has affected several countries across the world. The number of deaths and infections is rising very rapidly. We know that the main reason behind the spread is that the virus is transmitted through contact. So, we must keep those already infected in isolation to stop the spread. But detecting the disease is a hard and time-consuming process. All these factors need to be dealt with to stop the spread of the pandemic.

On the other hand, it has recently come to light that Covid-19 does not affect all patients equally. Some patients have mild symptoms and are at a much lower risk, while others are at high risk and need immediate medical attention. Currently, due to the huge patient load, hospitals are struggling to cope. So, it would be great if we could detect the risk factor associated with each patient. …


Photo by Franki Chamaki on Unsplash

The Maths Behind Regressions

Regression is one of the most important concepts used in machine learning. In this blog, we are going to talk about different types of regressions and the underlying concepts.

The different types of regression are shown in the diagram:

[Figure: Representation of regression in detail]

We are going to talk about all the points in detail.

Linear Regression

Regression tasks deal with predicting the value of a dependent variable from a set of independent variables, i.e., the provided features. Say we want to predict the price of a car. …
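As a small worked example of predicting a dependent variable from one feature, the sketch below fits a straight line with the ordinary least squares formulas. The car ages and prices are made-up numbers, and using age as the only feature is an assumption for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares on one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: car age in years and price in thousands.
ages = [1, 2, 3, 4]
prices = [18.0, 16.0, 14.0, 12.0]
slope, intercept = fit_line(ages, prices)
predicted = slope * 5 + intercept  # estimated price of a five-year-old car
print(slope, intercept, predicted)
```

Here the independent variable is the age and the dependent variable is the price; with more features, the same idea extends to multiple linear regression.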


Photo by Markus Winkler on Unsplash

How can we improve our news reading experience easily?

News has always been a very significant part of our society. In the past, we mostly depended on news channels and newspapers to keep ourselves updated. Now, in a fast-paced world, news media and agencies have started using the internet to reach readers. The move has proven very helpful, as it has allowed media houses to extend their reach among readers.

Today there are numerous media outlets, so it is impractical for a person with a busy schedule to gather news from all of them. Besides, each media outlet covers each story differently. Some readers like to compare coverage and read the same story from multiple houses to get the full picture of an event. All these needs are met by a type of application that is currently gaining popularity: online news distribution applications. These applications aim to gather news from multiple sources and present it to the user as a feed. …


Photo by Franck V. on Unsplash

#REINFORCEDSERIES

What is Reinforcement Learning?

“Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize the notion of cumulative reward.”

Reinforcement learning has become a very important research topic and has found important applications in several domains. In the Reinforced series, we are going to look at the concepts of reinforcement learning and understand the basic principles behind how they work, with applications. This article aims to introduce the topic and explain the basic terminology associated with reinforcement learning.
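The "cumulative reward" in the definition above is often computed as a discounted sum, where later rewards count for less. The sketch below shows that calculation; the discount factor `gamma` and the reward sequence are assumptions introduced for this example and are not taken from the article.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum a sequence of rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future total.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Hypothetical episode: the agent receives a reward of 1 at each of three steps.
episode = [1, 1, 1]
print(discounted_return(episode, gamma=0.5))
print(discounted_return(episode, gamma=1.0))
```

With `gamma=1.0` the result is the plain sum of rewards; smaller values make the agent favor rewards that arrive sooner.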

Divisions of Machine…


Photo by Irvan Smith on Unsplash

Machine Learning

How can we classify an instance as multiple classes?

We are very familiar with single-label classification problems. We mostly come across binary and multiclass classification. But with the growing applications of machine learning, we face different problems such as movie genre classification, medical report classification, and text classification by given topics. These problems can’t be addressed using single-label classifiers, as an instance may belong to several classes or labels at the same time. For instance, a movie can belong to both the Action and Adventure genres at the same time. This is where multilabel classification steps in. …
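One common way to represent such targets is a binary indicator matrix with one column per label, so that a single instance can switch on several labels at once. The sketch below builds that matrix for the movie example; the helper name, the genre list, and the sample movies are assumptions for illustration.

```python
def to_indicator(label_sets, classes):
    """Encode each set of labels as a row of 0/1 flags, one flag per class."""
    return [
        [1 if cls in labels else 0 for cls in classes]
        for labels in label_sets
    ]

genres = ["Action", "Adventure", "Comedy"]
# Hypothetical movies: the first one carries two genres at the same time.
movies = [{"Action", "Adventure"}, {"Comedy"}]
targets = to_indicator(movies, genres)
print(targets)
```

A single-label classifier would have to pick exactly one column per row; a multilabel classifier predicts every column independently or jointly.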


Photo by Brett Jordan on Unsplash

How can we use deep learning to summarize the text?

This is my second article on text summarization. In my first article, I talked about the extractive approaches to summarizing text and the metrics used. In this article, we are going to talk about abstractive summarization and see how deep learning can be used to summarize text. So, let’s dive in.

Abstractive Summarizers

Abstractive summarizers are so called because they do not select sentences from the original passage to create the summary. Instead, they paraphrase the main contents of the given text, using a vocabulary different from that of the original document. This is very similar to what we do as humans when we summarize. We create a semantic representation of the document in our brains. We then pick words from our general vocabulary (the words we commonly use) that fit the semantics, to create a short summary that represents all the points of the actual document. As you may notice, developing this kind of summarizer can be difficult, as it requires natural language generation. …

About

Abhijit Roy

I am an undergraduate student of Computer Science and Technology at NIT Durgapur. Find me at https://abhijitroy1998.wixsite.com/abhijitcv
