Understanding CRISP-DM and its importance in Data Science projects
A quick overview of CRISP-DM. This is part 1 of a 7-part series summarizing openSAP's 6-week course Getting Started with Data Science (Edition 2021) by Stuart Clarke.
What is CRISP-DM?
CRISP-DM, or CRoss Industry Standard Process for Data Mining, is a process model with six phases that describes the data science life cycle. It's like a set of guardrails to help you plan, organize, and implement your data science (or machine learning) project.
Why is it important?
A good data science project needs a reliable, repeatable process that even people with little data science background can follow and understand. This is where CRISP-DM comes in: you can use the methodology as a template to ensure you have considered all of the different aspects specific to your project.
There are 6 phases in CRISP-DM:
1. Business Understanding
- Determine Business Objectives
- Assess Situation
- Determine Data Science Goals
- Produce Project Plan
2. Data Understanding
- Collect Initial Data
- Describe Data
- Explore Data
- Verify Data Quality
3. Data Preparation
- Select Data
- Clean Data
- Construct Data
- Integrate Data
- Format Data
4. Modeling
- Select Modeling Technique
- Generate Test Design
- Build Model
- Assess Model
5. Evaluation
- Evaluate Results
- Review Process
- Determine Next Steps
6. Deployment
- Plan Deployment
- Plan Monitoring & Maintenance
- Produce Final Report
- Review Project
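To make the "template" idea concrete, here is a minimal Python sketch that stores the six phases and their tasks as a checklist and reports which tasks a project has not yet addressed. The names `CRISP_DM` and `remaining_tasks` are hypothetical and chosen only for illustration; they are not part of the course or of any library. The first three phase names follow the standard CRISP-DM naming.

```python
# Illustrative checklist only: CRISP_DM and remaining_tasks are hypothetical names.
CRISP_DM = {
    "Business Understanding": [
        "Determine Business Objectives",
        "Assess Situation",
        "Determine Data Science Goals",
        "Produce Project Plan",
    ],
    "Data Understanding": [
        "Collect Initial Data",
        "Describe Data",
        "Explore Data",
        "Verify Data Quality",
    ],
    "Data Preparation": [
        "Select Data",
        "Clean Data",
        "Construct Data",
        "Integrate Data",
        "Format Data",
    ],
    "Modeling": [
        "Select Modeling Technique",
        "Generate Test Design",
        "Build Model",
        "Assess Model",
    ],
    "Evaluation": [
        "Evaluate Results",
        "Review Process",
        "Determine Next Steps",
    ],
    "Deployment": [
        "Plan Deployment",
        "Plan Monitoring & Maintenance",
        "Produce Final Report",
        "Review Project",
    ],
}


def remaining_tasks(completed):
    """Return {phase: [open tasks]} for every phase that still has open tasks."""
    done = set(completed)
    report = {}
    for phase, tasks in CRISP_DM.items():
        open_tasks = [task for task in tasks if task not in done]
        if open_tasks:
            report[phase] = open_tasks
    return report


if __name__ == "__main__":
    # Example: a project that has only worked on the first two tasks so far.
    finished = ["Determine Business Objectives", "Assess Situation"]
    for phase, tasks in remaining_tasks(finished).items():
        print(f"{phase}: {', '.join(tasks)}")
```

Because the checklist is a plain dictionary, a project that does not need a given task can simply mark it as completed (or remove it), which matches the idea that the methodology is a template rather than a rigid sequence.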
In the next few weeks, I will provide a summary explanation of each of the phases. Each phase has its own tasks and its own expected outputs. I will also explain how each phase is applied, so that you can see why it is important to follow a project methodology when working on a data science project.
You do not have to follow the CRISP-DM methodology step by step, as different data science projects have different requirements. Treat it as a template and adapt it, keeping the phases and tasks that are relevant to your project.
For a detailed explanation of the full material, enroll in the 6-week course at https://open.sap.com/courses/ds3.