What is Data Science, Really?

Sanket Sarang
Published in World AI
5 min read · Apr 18, 2021

Everyone I speak with wants to be a Data Scientist. When I hear someone say "I want to become a Data Scientist", I often ask why. The answer I get is almost always the same: because it is the next big thing. Umm... is it really?

Data Science was not created as a profession today. We have been doing Data Science for as long as data has existed. Now, if you say that data came into existence with computers, and so Data Science started along with computers, then wrong again! Data existed long before computers did.

A classic example of Data Science in action is astrology. It was a field invented and mastered long before computers, and it is exactly what you would call Data Science. Now, I don’t believe in astrology, and you may not either. So the argument would be that something that potentially does not work, or is just plain bogus, is not Data Science. If you think this, then that is where you are wrong. Data Science is not about getting results. It is not about astrology making an accurate prediction. It is about the process by which you arrived at a prediction. If you predicted based on some science that you thought made sense given the data (in the case of astrology, the star positions), then you have done Data Science. Yes, your science may be shitty. It may be totally incorrect. It may not work at all. But you have done Data Science!

So what is Data Science?

The process of working on data, computer generated or not, to derive descriptive or predictive outcomes based on that data is Data Science.

This is how I would define Data Science. It’s simple. It requires data, you need to work on the data, and your work needs to derive some outcome based on the data. If you do this, then you are already doing Data Science.
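To make the definition concrete, here is a minimal sketch of both kinds of outcome on the same data. The sales figures and the function names `describe` and `predict_next` are made up for illustration; the prediction is a plain straight-line fit, chosen only because it is the simplest thing that counts as "predictive".

```python
from statistics import mean

# Hypothetical monthly sales figures, invented for this example.
monthly_sales = [100, 110, 120, 130, 140]


def describe(values):
    """Descriptive outcome: summarise what the data already shows."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def predict_next(values):
    """Predictive outcome: fit a straight line by least squares
    and extend it one step past the last observation."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * len(values)


summary = describe(monthly_sales)
forecast = predict_next(monthly_sales)
print(summary)
print(forecast)
```

Both functions take data, work on it, and return an outcome. Whether the forecast turns out to be right is a separate question, which is exactly the point made about astrology above.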

So what does that mean? Essentially, all of us are Data Scientists! Have you worked in Excel before? If so, you have done Data Science. That report you just submitted to your boss, the chart that went onto that presentation slide: it is all Data Science! Yes, simple charts are also Data Science. In fact, Data Visualisation is a subset of Data Science. Every chart presents data in a specific manner so that the viewer can derive certain insights from it. The very process of making that data more presentable is part of what we call Data Science.

Over recent years, more specifically the last 5 years, we have seen the rise of some highly capable Data Science frameworks for advanced analytics, namely Artificial Intelligence & Machine Learning.

Now keep in mind, AI & ML were not created 5 years ago. In fact, the ideas behind AI go back to Alan Turing’s work on machine intelligence in the 1940s and 1950s, long before most of our modern computers and programming languages were invented. And yes, mind you, AI & ML are both a part of Data Science. Both are processes that use data for prescriptive or predictive analytics, and that is nothing but Data Science.

So what is modern Data Science, if I may call it that? In other words, why has everyone been talking about Data Science over the last 5 years?

AI & ML are a subset of Data Science

The field of Data Science has seen tremendous improvements in its capacity for complex analysis. This growth is mainly fuelled by the availability of computing power. The CPU & GPU in an average computer, or even in a smartphone, are now enough to run many AI & ML models. This has led several companies and open source developers to create AI & ML frameworks that are more capable and yet easier to use. For many tasks, doing Data Science today can be as simple as working on an Excel document.

AI & ML are very computationally intensive. Until about 5 years ago, the cost of training an AI model was often higher than the business value you would derive from having an accurate forecast. But as the cost of computing decreased, the demand for AI increased. Today, almost every company wants to do AI. Because computing power, especially on the Cloud, is now so cheap, most companies believe that the cost of doing AI will be less than the value they gain by having it.
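The cost-versus-value argument above is simple arithmetic, and a small sketch shows how the same project flips from "not worth it" to "worth it" when only the compute cost changes. All figures and names here are hypothetical and chosen purely to illustrate the comparison.

```python
def net_gain(business_value, training_cost):
    """Value of the forecast minus what it cost to produce it."""
    return business_value - training_cost


# Hypothetical figures, invented for illustration only.
business_value = 50_000
cost_by_scenario = {
    "expensive compute": 80_000,
    "cheap cloud compute": 10_000,
}

# A project is justified only when the net gain is positive.
decisions = {
    name: net_gain(business_value, cost) > 0
    for name, cost in cost_by_scenario.items()
}
print(decisions)
```

The business value is identical in both scenarios. Only the cost of computing moved, and that alone changes the decision.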

So modern-day Data Science is all about solving complex problems. These problems are complex mainly because the data is too large, or critical data is unavailable, or the relationship between data and outcome is not simple.

Today companies want to use AI for various reasons:

  1. Banking — Fraud detection, product recommendations
  2. Manufacturing — Predictive maintenance, process optimisation
  3. eCommerce — Recommendations, dynamic competitive pricing
  4. Retail — Shelf positioning, discount structures, loyalty programs
  5. Freight & Cargo — Route optimisation, ship dock movement optimisation

Several of these examples require several years of data to be analysed in order to solve the specific business problem. Most business problems come down to more sales, reduced costs, or process optimisation. These are the 3 things for which businesses mainly use AI, and it is nothing other than Data Science.

So to summarise, we have been doing Data Science for as long as we have had data. Over the last 5 years, the Data Science industry has boomed mainly because the cost of computing decreased significantly. Using Data Science to improve every business decision is now justified, as the cost of doing Data Science is less than the business gains from its output.

So in short, we have always been doing Data Science; we are just doing more of it now than ever before. Whether you use tools like Python or R, Neural Network frameworks like TensorFlow, or get into Natural Language Processing or Deep Learning, it does not matter. Use any framework you like, as long as you get the desired outcome.

One thing is for sure: more and more companies are doing Data Science now. I only see this increasing over the next decade. So whether you are a manager, a techie, a CXO or the CEO of the company, learning Data Science is a must. It is what is going to help you make better business decisions.

Sanket Sarang

Founder, BlobCity.com | Creator of BlobCity AutoAI, BlobCity AI Cloud & BlobCity DB