CRISP-DM lays out a set of phases that are carried out iteratively to get a final model into production. Within the modeling phase there are further steps that are repeated back and forth until the model is ready. Only once the model is ready can it move on to evaluation.
Phases of CRISP-DM
1. Business Understanding
As the name suggests, we should know the problem statement. We should also have a subject-matter expert (SME) for the problem to guide us.
2. Data Understanding
Once the problem statement is clear, the most important phase starts with collecting data. As we collect data, we have to check whether it actually relates to our problem statement.
3. Data Preparation
Data preparation includes cleansing and transforming the data. You apply various techniques, such as PCA or SVD, to finalize the features. To understand PCA, you can go through the article below:
PCA — Eigenvalue and Eigenvector
PCA Stands for principal component analysis. It helps in dimensionality reduction or feature engineering. Below is the…
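To make the idea of dimensionality reduction a little more concrete, here is a minimal sketch of finding the first principal component with plain Python. It is only an illustration under a few assumptions: the function name `leading_component` and the toy data are made up for this example, power iteration is just one simple way to get the leading eigenvector of the covariance matrix, and in practice you would most likely reach for a library instead.

```python
import math


def leading_component(rows, iterations=100):
    """Estimate the first principal component by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix of the centered data (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # The Rayleigh quotient v^T cov v estimates the matching eigenvalue,
    # i.e. the variance captured along this component.
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))

    # Project every centered row onto the component: d features become 1.
    scores = [sum(a * b for a, b in zip(c, v)) for c in centered]
    return v, eigenvalue, scores


# Toy data (hypothetical): the second feature is exactly twice the first.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, eigenvalue, scores = leading_component(rows)
print(component, eigenvalue, scores)
```

Because the two features in the toy data are perfectly correlated, a single component is enough to describe them, which is the kind of redundancy PCA helps remove.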
4. Modeling
Once our data is prepared, it is time to use it to train the model. Here we train the model and save the resulting model for later use.
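Saving a model can be as simple as writing its learned weights to a file. The sketch below assumes the model is nothing more than a list of weights and stores it as JSON; the helper names `save_model` and `load_model`, the file name and the weight values are all hypothetical, and real frameworks ship their own save formats.

```python
import json
import os
import tempfile


def save_model(weights, path):
    """Write the learned weights to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"weights": weights}, f)


def load_model(path):
    """Read the weights back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)["weights"]


weights = [0.25, -1.5, 0.75]  # hypothetical trained weights

# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.json")
    save_model(weights, path)
    restored = load_model(path)

print(restored == weights)
```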
5. Evaluation
Evaluating the model is a very important step. Here we decide whether the trained model is good enough to go to production. We use different evaluation techniques to judge this. The confusion matrix is one of them; the article below covers the confusion matrix and some of the important formulas for evaluating a model:
Confusion Matrix : Simplified
Confusion matrix is one of the tables generated to visualise performance of algorithm or model. Learning confusion…
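As a quick illustration, here is a minimal sketch of a binary confusion matrix and a few common metrics derived from it. The function names and the sample labels are invented for this example, and it assumes a two-class problem where `1` is the positive class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluation_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for eight test samples.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
metrics = evaluation_metrics(tp, fp, fn, tn)
print(tp, fp, fn, tn)
print(metrics)
```

Which of these numbers matters most depends on the problem statement from phase 1, so it is worth agreeing on the metric with your SME before deciding that a model is good enough.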
6. Deployment
Finally, deployment to the production environment and maintenance begin. Over time, we end up reiterating steps 3–6 so that a new model outperforms the old one.
Now that we have an overview of the CRISP-DM phases, it is time to go through the most important one, i.e. modeling. Put simply, CRISP-DM has a fixed set of phases, and one of the important steps inside them is training the model. In my view, data preparation and model training are together the backbone of data mining. So I would like to briefly lay out the iterative steps followed to arrive at a more mature model that generalizes better. We generally perform the steps below iteratively to get a final model ready to be used on test data:
- Initialize the weights
Any random initialization will do; training on the data takes care of arriving at the final weights needed to generalize well to test data.
- Forward Propagation
a. Sum the products: multiply the weight vector by the input vector
b. Pass the sum through an activation function, e.g. sigmoid
- Backward Propagation
a. Compute the errors, i.e. the difference between the expected outputs and the predictions
b. Multiply the errors by the derivative of the activation function to get the deltas
c. Multiply the delta vector by the inputs and sum the products
- Optimizer takes a step
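a. Multiply the learning rate by the result of backward propagation step c and use it to update the weights
- Repeat forward propagation, backward propagation and the optimizer step until the desired error is reached or the model converges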
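The steps above can be sketched for a single sigmoid neuron in a few lines of plain Python. Treat this as a rough illustration rather than a reference implementation: the function names, the OR-gate toy data, the constant bias column, the learning rate and the number of epochs are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(inputs, targets, learning_rate=0.5, epochs=5000, seed=0):
    """Train one sigmoid neuron with full-batch gradient steps."""
    rng = random.Random(seed)
    n_features = len(inputs[0])

    # Initialize the weights randomly (done once, before the loop).
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]

    for _ in range(epochs):
        # Forward propagation: weighted sum, then sigmoid activation.
        sums = [sum(w * x for w, x in zip(weights, row)) for row in inputs]
        predictions = [sigmoid(s) for s in sums]

        # Backward propagation:
        # a. error = expected output - prediction
        errors = [t - p for t, p in zip(targets, predictions)]
        # b. delta = error * sigmoid derivative, which is p * (1 - p)
        deltas = [e * p * (1.0 - p) for e, p in zip(errors, predictions)]
        # c. multiply the deltas by the inputs and sum over the samples
        gradient = [sum(d * row[j] for d, row in zip(deltas, inputs))
                    for j in range(n_features)]

        # Optimizer step: move the weights by learning_rate * gradient.
        # The sign is "+" because the error is expected minus predicted.
        weights = [w + learning_rate * g for w, g in zip(weights, gradient)]

    return weights


def predict(weights, row):
    return sigmoid(sum(w * x for w, x in zip(weights, row)))


# Toy data (hypothetical): an OR gate; the last column is a constant bias.
inputs = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
targets = [0, 1, 1, 1]

weights = train(inputs, targets)
for row in inputs:
    print(row[:2], round(predict(weights, row), 3))
```

Note that the weights are initialized only once; the loop repeats just the forward pass, the backward pass and the optimizer step.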
So, those are the phases of CRISP-DM and the steps followed to train the model until it converges. In my view, forward and backward propagation need to be covered in a little more depth to understand how a model is trained. Both are important enough to deserve separate attention, and I will come up with a blog post on them. In the meantime, put the above into practice and see whether these principles help you improve. Happy Coding!