Consultant | Web & Data Science Instructor | My YouTube channel on Data Science: https://www.youtube.com/c/DataSciencewithHarshit

A guide to discovering the normal distribution and calculating key estimates of location & variability with Python

Photo by Jason Coudriet on Unsplash

This is the second blog in the Stats series; the first explained the taxonomy of data. Here, we’ll learn to apply a few essential foundational concepts that help us describe data using a set of statistical methods.

A sample is a snapshot of data from a larger dataset. This larger dataset, which is all of the data that could possibly be collected, is called the population. In statistics, the population is a broad, defined, and often theoretical set of all possible observations that are generated from an experiment or from a domain.
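
To make the sample/population distinction concrete, here is a minimal sketch using only Python's standard library. The population below is hypothetical: 100,000 values simulated from a Gaussian with mean 170 and standard deviation 10. Those numbers, the sample size of 500, and the variable names are assumptions made for illustration and do not come from the original post.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated values, assumed Gaussian (mean 170, sd 10)
population = [random.gauss(170, 10) for _ in range(100_000)]

# A sample is a small snapshot drawn from that population (without replacement)
sample = random.sample(population, 500)

# Estimates of location computed from the sample
sample_mean = statistics.mean(sample)
sample_median = statistics.median(sample)

# Estimate of variability: sample standard deviation (divides by n - 1)
sample_stdev = statistics.stdev(sample)

# The same quantities for the whole population, for comparison
population_mean = statistics.mean(population)
population_stdev = statistics.pstdev(population)  # divides by n

print(f"sample mean   = {sample_mean:.2f}, population mean  = {population_mean:.2f}")
print(f"sample median = {sample_median:.2f}")
print(f"sample stdev  = {sample_stdev:.2f}, population stdev = {population_stdev:.2f}")
```

The sample estimates should land close to the population values without matching them exactly, which is the point: in practice we usually only have the sample and use it to estimate the population.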

The observations in a sample dataset often fit a certain kind of distribution, commonly called the normal distribution and formally called the Gaussian distribution. It is the most studied distribution, so much so that there is a subfield of statistics dedicated to Gaussian data. …


Understanding the taxonomy of data types

Photo by Isaac Smith on Unsplash

After making the need for statistics in data science apparent in my previous blog, it’s time to dive right in and get hands-on with the statistical methods. This is going to be a series of blog posts and videos (on my YouTube channel).

Who are You?

I am starting this series of blogs on statistics and probability to help coders and analysts understand these concepts and methods. You are someone who is familiar with Python programming and is trying to get a better grip on statistics to master Data Science skills.

Growth of Data & Data Analysis

We know that Data Analysis has evolved beyond its originally expected extent. This has happened because of the rapid development of technology, the generation of more and bigger data, and the aggressive usage of quantitative analysis across a variety of disciplines. …


A monthly webcast where I share five cool projects worth spending time on!


The world of AI and Data Science is accelerating at an alarming rate. It becomes very hard for AI enthusiasts and learners to keep abreast of meaningful advances in the field. Applications, Research & Development, individual projects, proprietary software — every sector is applying DS and AI in its own remarkable way.

There are two main reasons I am starting this monthly AI webcast:

  1. This approach has helped me spot the trend in this broad domain of AI which reassures me of the direction I am moving in. So, why not share it with others!
  2. It fills me up with energy and gratitude that I am a part of this community. It also implies that I have a responsibility towards others who are struggling to showcase their talent and hard work to a wider audience. …


Answering important questions by transforming data into insights with Statistics

Photo by Bradley Dunn on Unsplash

In this hyper-connected world, data is being generated and consumed at an unprecedented pace. As much as we enjoy this super-connectivity of data, it invites abuse as well. Data professionals need to be trained in using statistical methods not only to interpret numbers but also to uncover such abuse and protect us from being misled.

Not many data scientists are formally trained in statistics, and there are very few good books and courses that teach these statistical methods from a data science perspective.

Through this post, I intend to shed some light on

  • Why statistics — common misconceptions regarding learning…


Comparing the newly launched features of Python 3.9 with Python 3.8


Python 3.9.0 — the latest stable release of Python is out!

Open-source enthusiasts from all over the world have been working on new, enhanced, and deprecated features for the past year. Though the beta versions were being rolled out for quite some time, the official release of Python 3.9.0 happened on October 5, 2020.

The official documentation contains all the details of the latest features and the changelog. Through this post, I’ll walk you through a few cool features that may come in handy in our day-to-day programming tasks.

We’ll check out the following:

  • Type hinting generics and flexible function and variable annotations
  • Union Operators in…


Basics of financial analysis and quantitative trading with Python.


Finance represents a system of capital, business models, investments, and other financial instruments. A very important sector of finance is trading. You can trade financial securities, equities, or tangible products like gold or oil.

Algorithmic or Quantitative trading is the process of designing and developing trading strategies based on mathematical and statistical analyses. It is an immensely sophisticated area of finance.

This tutorial serves as the beginner’s guide to quantitative trading with Python. You’ll find this post very helpful if you are:

  1. A student or someone aiming to become a quantitative analyst (quant) at a fund or bank.
  2. Someone who is planning to start their own quantitative trading business. …


A getting-started guide to developing computer vision applications with fastai


Deep learning is inducing revolutionary changes across many disciplines. It is also becoming more accessible to domain experts and AI enthusiasts with the advent of libraries like TensorFlow, PyTorch, and now fastai.

With the mission of democratizing deep learning, fast.ai is a research institute dedicated to helping everyone, from beginner-level coders to proficient deep learning practitioners, achieve world-class results with state-of-the-art models and techniques from the latest research in the field.

Goal

This blog post will walk you through the process of developing a dog classifier using fastai. The goal is to learn how easy it is to get started with deep learning models and be able to achieve near-perfect results with a limited amount of data using pre-trained models. …


A basic implementation guide to get started with Neural Networks using TensorFlow.


Disclaimer!

This series doesn’t explain the underlying mathematics of the algorithms but only focuses on the logical implementation and reasoning for using specific algorithms with a certain set of parameters. The resources to learn the basics of neural networks and underlying mathematics are covered in my blog How I passed the TensorFlow Developer Certificate Exam.

Introduction to Computer Vision

Only fairly recently have computers become able to perform seemingly trivial tasks such as detecting objects or organisms in images, or even recognizing spoken words.

The more important question is why are these tasks so trivial to humans?
The short answer is that our consciousness lacks the ability to understand this perception that utilizes specialized visual, auditory, and other sensory modules in the brain. It is so fast that the high-level features of an image, video, or audio are already amplified by the time the sensory information meets our consciousness. …


Mathematical principles that underpin the regularization methods in Machine Learning


To give you a quick recap of the Linear Algebra Series so far: we’ve learned what vectors, matrices & tensors are, how to calculate the dot product to solve systems of linear equations, and what identity and inverse matrices are.

Continuing the series, the next very important topic is Vector Norms.

So,

What are Vector Norms?

Vector Norms are functions that map a vector to a non-negative value, which is the magnitude or length of the vector (only the zero vector has a norm of zero). Different norm functions offer us different ways to calculate vector lengths.
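
As a rough illustration of "different functions, different lengths", the sketch below computes three commonly used norms for the same vector in plain Python. The choice of the L1, L2, and max norms, the helper names, and the example vector are assumptions for illustration; this excerpt does not show which norms the post goes on to cover.

```python
import math

def l1_norm(v):
    """Sum of absolute values (L1 norm)."""
    return sum(abs(x) for x in v)

def l2_norm(v):
    """Square root of the sum of squares (L2, or Euclidean, norm)."""
    return math.sqrt(sum(x * x for x in v))

def max_norm(v):
    """Largest absolute component (max, or L-infinity, norm)."""
    return max(abs(x) for x in v)

v = [3.0, -4.0]  # hypothetical example vector

print(l1_norm(v))   # 7.0
print(l2_norm(v))   # 5.0
print(max_norm(v))  # 4.0
```

The same vector gets three different lengths, and each is non-negative even though one component is negative.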

That’s okay but why are we studying this and what does this vector length represent…? …


A comprehensive list of data repositories for every type of problem


Given the nature of my job, I have to work on new projects every week solving a different problem. My work requires me to parse through a lot of different kinds of datasets to design and develop instructions for Data Science aspirants.

The blog contains a few useful datasets and data repositories categorized in different classes of problems and industries.

Data Repositories on the web:

Google Dataset Portal
  • Google Dataset Search — a search engine for researchers to locate online data.
  • datasetlist — offers a list of the biggest machine learning datasets from across the web.
  • UCI — one of the oldest repositories with data classified by types of problems, attributes type, data type, the field of study, etc. …
