Solving a Data Science Problem

Sung won (Chris) Lee
2 min read · Feb 20, 2019


There are many ways, and many algorithms, to solve a data science problem, but almost every data scientist follows a common workflow. In this post, I will walk through that process.

1. Define a problem

Data scientists translate a real-world problem into a data science problem. For example, if a company asks for recommendations on how to improve its sales, a data scientist reframes the request as data science questions: “Is there a relationship between sales and other factors?”, “How are sales and season related?”, “How big is the impact of seasonality?”

2. Gather data

After redefining the real-world problem as a data science problem, data scientists gather data. If data has already been provided, they also check whether there are other useful data sources.

3. Explore data

Before diving into machine learning, data scientists do exploratory data analysis (EDA). The data itself may yield important information and meaningful insights without any machine learning! Checking the distribution of each feature and the correlations between features helps them understand the data.

4. Model with data

Now it’s time to model the data. Data scientists check whether the data has a target variable to decide between supervised and unsupervised learning. If it is supervised learning, they determine whether it is a regression or a classification problem and choose a model that suits the given data.

If it is unsupervised learning, they try clustering to see how groups of observations are related and separated.
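The decision described in this step can be sketched as a small function. `choose_task` is a hypothetical helper, and its rule (numeric target means regression, anything else means classification) is a simplifying assumption: integer class labels, for example, would need a closer look by a human.

```python
def choose_task(target):
    """Suggest a learning task from the target column.

    `target` is a list of target values, or None when the data has no target.
    Assumption for this sketch: a purely numeric target means regression,
    and any other kind of target means classification.
    """
    if target is None:
        # No target at all: unsupervised learning, e.g. clustering.
        return "clustering"
    is_numeric = all(
        isinstance(y, (int, float)) and not isinstance(y, bool) for y in target
    )
    return "regression" if is_numeric else "classification"


print(choose_task(None))
print(choose_task([120.5, 98.0, 143.2]))
print(choose_task(["spam", "ham", "spam"]))
```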

5. Evaluate model

Data scientists evaluate the model with metrics appropriate to the type of model they chose. There are several metrics to choose from: accuracy, mean squared error, ROC, etc. If the model performs poorly, they adjust some parameters, or go back, choose a different model, and evaluate again.
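As a rough illustration, two of these metrics can be written in a few lines of plain Python. The labels and predictions below are invented for the example; libraries such as scikit-learn provide ready-made versions of both metrics.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (classification)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values (regression)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up classification results: 3 of the 4 predictions are correct.
acc = accuracy(["spam", "ham", "spam", "ham"], ["spam", "ham", "ham", "ham"])

# Made-up regression results: squared errors are 1, 0 and 4.
mse = mean_squared_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])

print("accuracy:", acc)
print("mean squared error:", mse)
```

Higher accuracy is better, while lower mean squared error is better, so it helps to keep in mind which direction counts as an improvement when comparing models.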

6. Answer problem

Data scientists interpret the results in light of the performance of the model they chose and evaluated, and translate them back into a real-world solution that answers the original problem.

Even though I laid out the steps of solving a data science problem as if they were something ambitious, the process is not much different from fixing a light bulb!

Face the problem, gather some information, find the best way to solve it, and test whether it works, and you too can be a problem solver! :)
