Distance metrics are a key part of several machine learning algorithms. These distance metrics are used in both supervised and unsupervised learning, generally to calculate the similarity between data points.

An effective distance metric improves the performance of our machine learning model, whether that’s for classification tasks or clustering.

Let’s say we want to create clusters using K-Means, or solve a classification or regression problem with the k-Nearest Neighbours algorithm. How will we define the similarity between different observations here? How can we say that two points are similar to each other?

This will happen if their features are…
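As a small hedged illustration of the idea, here is a minimal sketch of comparing two observations with the Euclidean distance, one of the most common distance metrics (the feature values are purely made up for the example):

```python
import numpy as np

# Two observations described by the same numeric features
# (the values here are illustrative, not from a real dataset).
point_a = np.array([5.1, 3.5, 1.4])
point_b = np.array([4.9, 3.0, 1.3])

# Euclidean distance: the smaller the distance, the more similar the points.
euclidean_distance = np.sqrt(np.sum((point_a - point_b) ** 2))
print(euclidean_distance)  # ~0.55
```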


The trick to doing well in deep learning hackathons (or frankly any data science hackathon) often comes down to feature engineering. How much creativity can you muster when you’re given data that simply isn’t enough to build a winning deep learning model?

I’m talking from my own experience of participating in multiple deep learning hackathons where we were given a dataset of a few hundred images — simply not enough to win or even finish in the top echelons of the leaderboard. So how can we deal with this problem?

The answer? Well, that lies deep in a data scientist’s…


I’ve spent the majority of the last two years working almost exclusively in the deep learning space. It’s been quite an experience; I’ve worked on multiple projects, including ones involving image and video data.

Before that, I was on the fringes — I skirted around deep learning concepts like object detection and face recognition — but didn’t take a deep dive until late 2017. I’ve come across a variety of challenges during this time. And I want to talk about four very common ones that most deep learning practitioners and enthusiasts face in their journey.

If you’ve worked in a…


I was working on a computer vision project last year where we had to build a robust face detection model. The concept behind that is fairly straightforward — it’s the execution part that always sticks in my mind.

Given the size of the dataset we had, building a model from scratch was a real challenge. It was going to be potentially time-consuming and a strain on the computational resources we had. We had to figure out a solution quickly because we were working with a tight deadline.

This is when the powerful concept of transfer learning came to our rescue…
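To give a sense of what that looks like in practice, here is a minimal sketch of transfer learning with a pre-trained ResNet-18 from torchvision (the two-class setup is an assumption for illustration, not the exact model we built):

```python
import torch.nn as nn
from torchvision import models

# Load a network pre-trained on ImageNet instead of training from scratch.
model = models.resnet18(pretrained=True)

# Freeze the pre-trained layers so only the new head gets trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with one sized for our own task
# (2 classes here is just an illustrative assumption).
model.fc = nn.Linear(model.fc.in_features, 2)
```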


I’m enthralled by the power and capability of neural networks. Almost every breakthrough happening in the machine learning and deep learning space right now has neural network models at its core.

This is especially prevalent in the field of computer vision. Neural networks have opened up possibilities of working with image data — whether that’s simple image classification or something more advanced like object detection. In short, it’s a goldmine for a data scientist like me!

Simple neural networks are always a good starting point when we’re solving an image classification problem using deep learning. …
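As a hedged sketch of that starting point, here is a simple fully connected network in PyTorch for classifying 28x28 grayscale images into 10 classes (the input size and class count are assumptions for illustration):

```python
import torch.nn as nn

# A simple feed-forward network: flatten the image, one hidden layer, class scores out.
class SimpleNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),                 # 28x28 image -> 784-dim vector
            nn.Linear(28 * 28, 128),      # hidden layer
            nn.ReLU(),
            nn.Linear(128, num_classes),  # scores for each class
        )

    def forward(self, x):
        return self.layers(x)
```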


Every once in a while, there comes a library or framework that reshapes and reimagines how we look at the field of deep learning. The remarkable progress a single framework can bring about never ceases to amaze me.

I can safely say PyTorch is on that list of deep learning libraries. It has helped accelerate the research that goes into deep learning models by making them computationally faster and less expensive (a data scientist’s dream!).

I’ve personally found PyTorch really useful for my work. …


I have written extensive articles and guides on how to build computer vision models using image data. Detecting objects in images, classifying those objects, generating labels from movie posters — there is so much we can do using computer vision and deep learning.

This time, I decided to turn my attention to the less-heralded aspect of computer vision — videos! We are consuming video content at an unprecedented pace. I feel this area of computer vision holds a lot of potential for data scientists.

I was curious about applying the same computer vision algorithms to video data. …
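A common first step, shown here as a minimal sketch assuming OpenCV and a placeholder file name, is to break a video into frames so the usual image models can be applied to each one:

```python
import cv2

# Read a video and save its individual frames as images
# ("sample_video.mp4" is a placeholder, not a real dataset).
video = cv2.VideoCapture("sample_video.mp4")
frame_count = 0

while True:
    success, frame = video.read()
    if not success:  # no more frames to read
        break
    cv2.imwrite(f"frame_{frame_count:04d}.jpg", frame)
    frame_count += 1

video.release()
```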


KNN and K-Means are two of the most commonly and widely used machine learning algorithms. KNN is a supervised learning algorithm that can be used to solve both classification and regression problems. K-Means, on the other hand, is an unsupervised learning algorithm that is widely used to cluster data into different groups.

One thing common to both algorithms is that KNN and K-Means are distance-based. KNN chooses the k closest neighbours and then, based on these neighbours, assigns a class (for classification problems) or predicts a value (for regression problems) for a…
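As a small hedged illustration of that distance-based idea, here is KNN classification with scikit-learn on a toy dataset (the numbers and labels are made up):

```python
from sklearn.neighbors import KNeighborsClassifier

# Toy training data: each row is an observation, each column a feature
# (values and labels are purely illustrative).
X_train = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]]
y_train = [0, 0, 1, 1]

# KNN assigns the majority class among the k nearest (closest) neighbours.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

print(knn.predict([[1.2, 1.9]]))  # expected to output class 0
```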


In the computer vision field, one of the most common doubts most of us have is: what is the difference between image classification, object detection, and image segmentation? When I started my journey in computer vision, I was also confused by these terms. So I decided to break down these terminologies to help you understand the difference between each of them. Let’s start by understanding what image classification is:

Consider the image below:

You will have instantly recognized it. It’s a dog. Take a step back and analyze how you came to this conclusion. You…

Pulkit Sharma
