What Is the Definition of Data Science, and Why Data Science?

Minati biswal
3 min read · Aug 17, 2019

Before we go any further, let’s look at some basic definitions that we will use throughout this book. The great/awful thing about this field is that it is so young that these definitions can differ from textbook to newspaper to whitepaper.

Data science definition

The definitions that follow are general enough to be used in daily conversations, and they serve the purpose of the book: an introduction to the principles of data science.

Let’s start by defining what data is. This might seem like a silly first definition to have, but it is very important.

Whenever we use the word “data”, we refer to a collection of information in either an organized or unorganized format:

• Organized data: This refers to data that is sorted into a row/column structure, where every row represents a single observation and the columns represent the characteristics of that observation.

• Unorganized data: This is data in free form, usually text or raw audio/signals, that must be parsed further to become organized. Whenever you open Excel (or any other spreadsheet program), you are looking at a blank row/column structure waiting for organized data. These programs don’t do well with unorganized data.
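To make the distinction concrete, here is a minimal sketch in Python of parsing unorganized text into organized rows. The sample sentences, the field names (`name`, `age`, `city`), and the helper `parse_note` are all made up for illustration; real free-form data is rarely this regular.

```python
import re

# Hypothetical free-form notes; the text is invented for this example.
raw_notes = [
    "Alice, 34 years old, lives in Boston",
    "Bob, 27 years old, lives in Denver",
]

# One assumed pattern that pulls three characteristics out of each sentence.
pattern = re.compile(
    r"(?P<name>\w+), (?P<age>\d+) years old, lives in (?P<city>\w+)"
)

def parse_note(note):
    """Turn one free-form sentence into a row (a dict), or None if it does not fit."""
    match = pattern.match(note)
    if match is None:
        return None
    row = match.groupdict()
    row["age"] = int(row["age"])  # store the age as a number, not as text
    return row

# Each dict is one observation (a row); each key is a characteristic (a column).
rows = [parse_note(note) for note in raw_notes]
for row in rows:
    print(row)
```

Once the text is in this row/column shape, it can be loaded into a spreadsheet or any tabular tool.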

For the most part, we will deal with organized data, as it is the easiest to glean insight from, but we will not shy away from looking at raw text and methods of processing unorganized forms of data.

Data science is the art and science of acquiring knowledge through data. What a small definition for such a big topic, and rightfully so! Data science covers so many things that it would take pages to list them all.

Data science is all about how we take data, use it to acquire knowledge, and then use that knowledge to do the following:

• Make decisions

• Predict the future

• Understand the past/present

• Create new industries/products

We will look at the methods of data science, including how to process data, gather insights, and use those insights to make informed decisions and predictions. Data science is about using data to gain new insights that you would otherwise have missed.

As an example, imagine you are sitting around a table with three other people. The four of you have to make a decision based on some data. There are four opinions to consider. You would use data science to bring a fifth, sixth, and even seventh opinion to the table.

That’s why data science won’t replace the human brain but will complement it and work alongside it. Data science should not be thought of as an end-all solution to our data woes; it is merely an opinion, a very informed opinion, but an opinion nonetheless. It deserves a seat at the table.

Why data science?

In this data age, it’s clear that we have a surplus of data. But why should that necessitate an entirely new set of vocabulary? What was wrong with our previous forms of analysis? For one, the sheer volume of data makes it impossible for a human to parse it in a reasonable time. Data is collected in various forms and from different sources, and it often arrives very unorganized.

Data can be missing, incomplete, or just flat out wrong. Often, we have data on very different scales, which makes it tough to compare. Suppose we are looking at data on the pricing of used cars. One characteristic of a car is the year it was made, and another might be the number of miles on that car. The years span only a few decades, while the mileage can run into the hundreds of thousands, so the two columns cannot be compared directly.
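The scale problem in the used-car example can be sketched in a few lines of Python. The car records are invented, and min-max scaling (the helper `min_max_scale`) is just one common way to put columns on a comparable footing; it is an assumption of this sketch, not a method the text prescribes.

```python
# Made-up used-car records (year built, miles driven), for illustration only.
cars = [
    {"year": 2008, "miles": 120000},
    {"year": 2014, "miles": 60000},
    {"year": 2018, "miles": 20000},
]

def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every value is identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled_year = min_max_scale([car["year"] for car in cars])
scaled_miles = min_max_scale([car["miles"] for car in cars])

# Both columns now lie between 0 and 1 and can be compared side by side.
for car, year, miles in zip(cars, scaled_year, scaled_miles):
    print(car, round(year, 2), round(miles, 2))
```

After scaling, the oldest car has a year value of 0.0 and a miles value of 1.0, which makes the relationship between age and mileage easier to see than in the raw numbers.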

Once we clean our data (which we spend a great deal of time looking at in this book), the relationships between the data become more obvious, and the knowledge that was once buried deep in millions of rows of data simply pops out. One of the main goals of data science is to make explicit practices and procedures to discover and apply these relationships in the data.
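As a rough illustration of what cleaning can mean in practice, the sketch below drops records that are missing a value or hold an impossible one. The records, the helper `is_clean`, and the validity thresholds are assumptions chosen for the example; real cleaning rules depend on the data set.

```python
# Made-up raw records; None marks a missing value and -500 is an impossible mileage.
raw_cars = [
    {"year": 2012, "miles": 85000},
    {"year": None, "miles": 40000},
    {"year": 2016, "miles": -500},
    {"year": 2019, "miles": 15000},
]

def is_clean(record):
    """Return True only if both fields are present and within plausible bounds."""
    year = record.get("year")
    miles = record.get("miles")
    if year is None or miles is None:
        return False
    # Assumed bounds for this sketch: a sensible model year and non-negative miles.
    return 1900 <= year <= 2100 and miles >= 0

clean_cars = [record for record in raw_cars if is_clean(record)]
print(len(raw_cars), "raw records,", len(clean_cars), "clean records")
```

Dropping bad rows is the simplest option; filling in missing values is another, and which one is appropriate is a judgment call about the data at hand.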

Minati biswal

Tableau trainer with more than 3 years of experience, from Cynix IT Technology.