This is the second blog in the Stats series; the first explained the taxonomy of data. Here, we’ll learn to apply a few foundational concepts that help us describe data using a set of statistical methods.
A sample is a snapshot of data from a larger dataset. That larger dataset, which is all of the data that could possibly be collected, is called the population. In statistics, the population is a broad, defined, and often theoretical set of all possible observations generated from an experiment or a domain.
The observations in a sample often fit a distribution that is commonly called the normal distribution and formally called the Gaussian distribution. It is the most studied distribution, so much so that a subfield of statistics is dedicated to Gaussian data. …
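To make the sample versus population idea concrete, here is a minimal sketch using only Python's standard library. The population is simulated, and its mean of 50 and standard deviation of 10 are made-up values chosen for illustration, not figures from the original post.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values drawn from a Gaussian
# distribution with mean 50 and standard deviation 10 (assumed values).
population = [random.gauss(50, 10) for _ in range(100_000)]

# A sample is a small random snapshot of that population.
sample = random.sample(population, 1_000)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)
sample_stdev = statistics.stdev(sample)  # sample standard deviation (n - 1)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"sample stdev:    {sample_stdev:.2f}")
```

The sample statistics should land close to the population values without matching them exactly, which is the gap that descriptive and inferential statistics help us reason about.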
After making the case for statistics in data science in my previous blog, it’s time to dive right in and get hands-on with the statistical methods. This is going to be a series of blog posts and videos (on my YT channel).
I am starting this series of blogs on statistics and probability to help coders and analysts understand these concepts and methods. It is written for someone who is familiar with Python programming and wants a better grip on statistics to master Data Science skills.
We know that Data Analysis has evolved beyond its original expected extent. This has happened because of the rapid development of technology, the generation of more and bigger data, and the aggressive use of quantitative analysis across a variety of disciplines. …
The world of AI and Data Science is accelerating at an alarming rate. It becomes very hard for AI enthusiasts and learners to keep abreast of meaningful advances in the field. Applications, Research & Development, individual projects, proprietary software — every sector is applying DS and AI in its own remarkable way.
There are two main reasons I am starting this monthly AI webcast:
In this hyper-connected world, data is being generated and consumed at an unprecedented pace. As much as we enjoy this super-connectivity of data, it invites abuse as well. Data professionals need to be trained in statistical methods not only to interpret numbers but also to uncover such abuse and protect us from being misled.
Not many data scientists are formally trained in statistics, and very few good books and courses teach these statistical methods from a data science perspective.
Through this post, I intend to shed some light on
Python 3.9.0 — the latest stable release of Python is out!
Open-source enthusiasts from all over the world have spent the past year adding new features, enhancing existing ones, and deprecating others. Beta versions had been rolling out for quite some time, and Python 3.9.0 was officially released on October 5, 2020.
The official documentation contains all the details of the latest features and the changelog. Through this post, I’ll walk you through a few cool features that may come in handy in our day-to-day programming tasks.
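As a small taste, the sketch below shows three additions that shipped in Python 3.9: the dictionary merge operator (PEP 584), the string prefix and suffix removal methods (PEP 616), and built-in collection types in type hints (PEP 585). These are picked for illustration and may not be the same features the full post covers. The variable and function names are hypothetical.

```python
# Requires Python 3.9 or newer.

defaults = {"theme": "light", "font_size": 12}
overrides = {"font_size": 14}

# PEP 584: the | operator merges two dicts; values on the right win.
settings = defaults | overrides

# PEP 616: strip a known prefix or suffix without manual slicing.
filename = "draft_report.txt"
stem = filename.removeprefix("draft_").removesuffix(".txt")

# PEP 585: built-in collections can be used directly in type hints.
def word_lengths(words: list[str]) -> dict[str, int]:
    return {word: len(word) for word in words}

print(settings)
print(stem)
print(word_lengths(["python", "three", "nine"]))
```

Before 3.9, the same merge needed `{**defaults, **overrides}`, and the type hints needed `List` and `Dict` imported from `typing`.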
Finance represents a system of capital, business models, investments, and other financial instruments. A very important sector of finance is trading. You can trade financial securities, equities, or tangible products like gold or oil.
Algorithmic or Quantitative trading is the process of designing and developing trading strategies based on mathematical and statistical analyses. It is an immensely sophisticated area of finance.
This tutorial serves as the beginner’s guide to quantitative trading with Python. You’ll find this post very helpful if you are:
Deep learning is inducing revolutionary changes across many disciplines. It is also becoming more accessible to domain experts and AI enthusiasts with the advent of libraries like TensorFlow, PyTorch, and now fastai.
With the mission of democratizing deep learning, fastai is a research institute dedicated to helping everyone, from beginner-level coders to proficient deep learning practitioners, achieve world-class results with state-of-the-art models and techniques from the latest research in the field.
This blog post will walk you through the process of developing a dog classifier using fastai. The goal is to learn how easy it is to get started with deep learning models and be able to achieve near-perfect results with a limited amount of data using pre-trained models. …
This series doesn’t explain the underlying mathematics of the algorithms but only focuses on the logical implementation and reasoning for using specific algorithms with a certain set of parameters. The resources to learn the basics of neural networks and underlying mathematics are covered in my blog How I passed the TensorFlow Developer Certificate Exam.
Only fairly recently have computers been able to perform seemingly trivial tasks such as detecting objects or organisms in images, or even recognizing spoken words.
The more important question is why are these tasks so trivial to humans?
The short answer is that our consciousness lacks the ability to understand this perception that utilizes specialized visual, auditory, and other sensory modules in the brain. It is so fast that the high-level features of an image, video, or audio are already amplified by the time the sensory information meets our consciousness. …
In the Linear Algebra Series, to give you a quick recap, we’ve learned what vectors, matrices, and tensors are, how to calculate the dot product to solve systems of linear equations, and what identity and inverse matrices are.
Continuing the series, the next very important topic is Vector Norms.
Vector Norms are functions that map a vector to a non-negative value, which is the magnitude or length of the vector. Different norm functions offer us different ways to calculate that length.
That’s okay but why are we studying this and what does this vector length represent…? …
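As a quick illustration of "different functions, different lengths", here is a minimal sketch of three common norms (L1, L2, and max norm) in plain Python. The helper names and the example vector are hypothetical and chosen only for this sketch.

```python
import math

def l1_norm(v):
    """Sum of absolute values (Manhattan length)."""
    return sum(abs(x) for x in v)

def l2_norm(v):
    """Square root of the sum of squares (Euclidean length)."""
    return math.sqrt(sum(x * x for x in v))

def max_norm(v):
    """Largest absolute component."""
    return max(abs(x) for x in v)

v = [3.0, -4.0]
print(l1_norm(v))   # 7.0
print(l2_norm(v))   # 5.0
print(max_norm(v))  # 4.0
```

The same vector gets three different "lengths", and none of them is ever negative.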
Given the nature of my job, I work on a new project every week, each solving a different problem. My work requires me to parse through many different kinds of datasets to design and develop instructions for Data Science aspirants.
The blog contains a few useful datasets and data repositories categorized in different classes of problems and industries.