Solving a Data Science Problem
There are many different ways and a lot of algorithms to solve a data science problem, but almost every data scientist goes through a common workflow. I will introduce the process of solving a data science problem.
1. Define a problem
Data scientists translate a real-world problem into a data science problem. For example, if a company asks how to improve its sales, a data scientist reframes the request as data science questions: “Is there a relationship between sales and other factors?”, “How are sales and season related?”, “How big is the impact of seasonality?”
2. Gather data
After redefining the real-world problem as a data science problem, data scientists gather data. If some data is already given, they also check whether there are other useful data sources.
3. Explore data
Before diving into machine learning, data scientists do exploratory data analysis (EDA). The data itself may yield important information and meaningful analysis without any machine learning! Checking the correlations between features and the distribution of each feature helps them understand the data.
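Here is a minimal sketch of those two checks using only the Python standard library. The numbers are made up for illustration, and the names `ad_spend`, `sales`, `summarize`, and `pearson` are my own assumptions, not part of any real dataset or library. In practice you would more likely reach for a data analysis library.

```python
from statistics import mean, stdev

# Hypothetical toy data (invented for this example): monthly ad spend and sales.
ad_spend = [10, 12, 15, 17, 20, 22]
sales = [100, 110, 130, 135, 160, 170]


def summarize(values):
    """Return a few basic statistics that describe a feature's distribution."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


print(summarize(sales))
print(round(pearson(ad_spend, sales), 3))
```

A correlation close to 1 here only says that the two made-up columns move together. It does not say that one causes the other.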
4. Model with data
Now it’s time to model the data. Data scientists check whether the data has a target to decide between supervised and unsupervised learning. If it is supervised learning, they check whether it is a regression or a classification problem and choose a model that suits the given data.
If it is unsupervised learning, they try clustering to see how the groups of observations are related and separated.
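The decision described above can be sketched roughly as follows. This is a crude heuristic of my own, not a standard rule: the function names `choose_task` and `fit_line` and the `max_classes` threshold are assumptions made for illustration. The regression part is an ordinary least-squares line through one feature.

```python
def choose_task(target, max_classes=10):
    """Guess the kind of learning problem from the target column.

    Rough heuristic (an assumption for this sketch):
    - no target at all         -> unsupervised
    - many distinct numbers    -> regression
    - anything else            -> classification
    """
    if target is None:
        return "unsupervised"
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in target
    )
    if is_numeric and len(set(target)) > max_classes:
        return "regression"
    return "classification"


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


print(choose_task(None))
print(choose_task(["spam", "ham", "spam"]))
print(fit_line([0, 1, 2], [1, 3, 5]))
```

A real project would also look at what the target means, not only at its values. For example, a numeric column with few distinct values could still be a regression target.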
5. Evaluate model
They evaluate the model with metrics that suit the kind of model they chose. There are several metrics to choose from: accuracy, mean squared error, ROC, etc. If the model performs poorly, data scientists adjust some parameters, or go back, choose a different model, and evaluate again.
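To make these metrics concrete, here is a small standard-library sketch. The function names are my own, and the inputs are toy values invented for the example. For ROC, the sketch computes the area under the ROC curve (AUC) as the fraction of positive/negative pairs that the model ranks correctly, which is one common way to summarize the curve in a single number.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (classification)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between truth and prediction (regression)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve for binary labels (0 or 1).

    Computed as the share of (positive, negative) pairs where the positive
    example gets the higher score; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))               # 0.75
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))         # 2.5
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))       # 0.75
```

The pairwise AUC is easy to read but slow on large datasets, since it compares every positive with every negative.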
6. Answer problem
Data scientists interpret the results in light of the performance of the model they chose and evaluated, and translate them back into a real-world solution that answers the problem.
Even though I enumerated the steps of solving a data science problem as if they were something ambitious, the process is not much different from fixing a light bulb!
Face the problem, gather some information, find the best way to solve it, and test whether it works, and you can be a problem solver too! :)