What is Data Science

PK Banks
data-science-machine-learning-101
2 min read · Dec 23, 2016

I was enjoying dinner with friends when someone asked me, “What is data science? What do you actually do?” These are questions worth asking, so let’s discuss them.

What is Data Science and why does it matter?

Data science asks questions and seeks to answer them by observing the world and finding meaningful relationships between conditions and elements that contribute to outcomes.

Results from our study enable us to make more informed decisions with greater degrees of confidence.

Data science can be applied to any circumstance that involves observing and gathering information for the purpose of making better choices or learning about how things work together. Whether we want to predict the effects of climate change or predict outcomes in financial markets, data science provides the tools and frameworks for using the past to better predict the future.
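To make "using the past to better predict the future" concrete, here is a minimal sketch that fits a straight line to a handful of past observations and extends it one step forward. The numbers are made up purely for illustration, and the helper name `fit_line` is hypothetical. Real problems call for far more care than this.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up "past" observations, for illustration only.
past_x = [1, 2, 3, 4, 5]
past_y = [2.0, 4.1, 5.9, 8.2, 9.8]

slope, intercept = fit_line(past_x, past_y)

# Use the fitted line to guess the next, not-yet-observed value.
forecast = slope * 6 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast at x=6: {forecast:.2f}")
```

The fit only summarizes what we have already seen. Whether the forecast is any good depends on whether the future keeps behaving like the past.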

Statistics is our domain, data is our raw material, and the computer is our toolbox. The finished products are predictions about the future and decisions that prepare us for it.

There is a vast sea of technical jargon. Common elements of data science include data gathering, hypothesis testing, data visualization, linear regression, optimization, simulations, and algorithms. Data science is often referred to by other names, such as data analytics, big data, bots, business intelligence, machine learning, and artificial intelligence. While these terms are distinct, they are often used interchangeably. The abundance of terminology can obstruct, intimidate, or obfuscate. When we lose our way, it’s often helpful to come back to the beginning: we use data to learn something about the world and make well-informed choices.

What does a data scientist actually do?

Data science requires collecting, examining, and accessing data. A data scientist needs to feel comfortable with data. They need to know where the data comes from, how it was collected, and the extent to which the data is reliable.

Data science requires computations. Learning from the data can be computationally intensive. The operations tend to be basic arithmetic, but there tend to be an immense number of them, because data science pays a lot of attention to combinations of inputs that contribute to an outcome. The number of possible combinations gets very big, very fast. If we cannot make computations quickly enough, we lose the ability to act on what we learn in time.
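A quick sketch shows how fast those combinations pile up. It counts two things for a given number of inputs: the number of distinct pairs, and the number of non-empty groups of any size. The function names are hypothetical and the input counts are arbitrary.

```python
from math import comb


def count_pairs(n_inputs):
    """Number of distinct pairs that can be formed from n_inputs inputs."""
    return comb(n_inputs, 2)


def count_input_combinations(n_inputs):
    """Number of non-empty subsets of n_inputs inputs, i.e. 2**n - 1."""
    return 2 ** n_inputs - 1


# Arbitrary input counts, chosen only to show the growth.
for n in (5, 10, 20, 40):
    print(f"{n} inputs: {count_pairs(n)} pairs, "
          f"{count_input_combinations(n)} possible groups")
```

Pairs grow roughly with the square of the number of inputs, while groups of any size double with every input added, which is why even simple arithmetic becomes a heavy workload.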

Now that we have discussed what data science is and what a data scientist does, let’s proceed with some of the details.

In the next episode, we’ll start with the domain of data science: statistics.
