The most powerful car engine in the world on the floor of a garage has a top speed of 0 mph. Data science projects are no different.
Take a churn prediction engine, a core pillar of data science that, when built correctly, has the potential to deliver millions of pounds in revenue and cost savings.
To unlock this bounty, two things need to happen:
- The engine itself has to be well designed and engineered
- It needs to be placed inside a product that perfectly complements the output of the engine
The first point is well documented: there are hundreds of excellent blog posts, courses, and tools that build machine learning models to predict a given response.
The second point unlocks the value. And incredibly, it’s hardly ever talked about.
Here are the five rules for delivering true product-driven data science:
1. Think backwards
The day after the project finishes, somebody will be using the output from your model. Who is using it? What does that look like? When do they use it? Why are they using it?
Never lose sight of the answers to these four questions: they drive everything you do. Before you start doing anything with the data, map out the ‘endgame’ scenario where the solution is deployed and fully operational. Once this target is fixed, you can start to work backwards to understand the tools, technologies and people that need to be involved to deliver the project.
2. Build a data pipeline before a model
A model without a data pipeline delivers no value. A data pipeline without a model delivers a benchmark value that can be built upon.
Don’t spend months tuning a model locally before thinking about how the output is surfaced to your internal teams. It’s amazing how often something simple can deliver real value, even before applying game-changing machine learning techniques.
Taking a product-driven approach ensures you can start delivering value straight away. Build a flow of data from source to insight first, so that end users are involved from the start of the project and see a product that is growing and improving.
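To make the pipeline-first idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the Customer fields, the baseline_scorer rule of thumb and the 0.5 threshold are illustrative assumptions, not part of any real project. The point is only that the flow from source to report works on day one with a simple rule, and a trained model can later be swapped in as the scorer without touching the rest.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Customer:
    # Hypothetical fields, chosen purely for illustration.
    customer_id: str
    days_since_last_login: int
    support_tickets: int


# Any function that maps a customer to a risk score between 0 and 1.
Scorer = Callable[[Customer], float]


def baseline_scorer(customer: Customer) -> float:
    """Assumed rule of thumb: the longer a customer is inactive, the higher the risk."""
    return min(customer.days_since_last_login / 30.0, 1.0)


def run_pipeline(
    customers: Iterable[Customer],
    scorer: Scorer,
    threshold: float = 0.5,
) -> List[Dict[str, object]]:
    """Score every customer and return the at-risk ones, highest risk first."""
    report = []
    for customer in customers:
        risk = scorer(customer)
        if risk >= threshold:
            report.append({"customer_id": customer.customer_id, "risk": round(risk, 2)})
    return sorted(report, key=lambda row: row["risk"], reverse=True)


if __name__ == "__main__":
    customers = [
        Customer("a", days_since_last_login=45, support_tickets=0),
        Customer("b", days_since_last_login=3, support_tickets=1),
        Customer("c", days_since_last_login=20, support_tickets=2),
    ]
    for row in run_pipeline(customers, baseline_scorer):
        print(row)
```

Because run_pipeline only depends on the Scorer signature, replacing baseline_scorer with a machine learning model later is a one-line change, and end users keep seeing the same report throughout.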
3. Deliver actions over accuracy
The success of a model is defined solely by how it affects the actions taken by your company. Even if the model is almost perfectly accurate, it only adds value if it has a ‘so-what’ outcome.
Here’s an example. Two data scientists show you two different models for churn prediction. The first is 100% accurate: it tells you exactly who will churn next week, but it is so complex that there is no way to tell why each predicted customer will churn. The second is less accurate, but is immediately actionable by your marketing team because it tells you the factors that are influencing each customer’s decision to leave.
Which would you choose?
A product-driven approach to data science says the second model wins every time. A prediction is just a number — on its own, it delivers no value. Only actions delivered through data products deliver value.
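As a rough illustration of what the second kind of model can hand to a marketing team, the sketch below uses a simple logistic score whose per-feature contributions can be read off directly. The feature names, WEIGHTS and BIAS are made-up assumptions for the example, not fitted values from real data.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical coefficients; in practice these would be fitted to your own data.
WEIGHTS = {
    "days_since_last_login": 0.08,
    "support_tickets": 0.5,
    "months_as_customer": -0.05,
}
BIAS = -2.0


def explain_churn(features: Dict[str, float]) -> Tuple[float, List[str]]:
    """Return the churn probability and the factors pushing it up, strongest first."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Only factors with a positive contribution increase the churn risk.
    drivers = sorted(
        (name for name, value in contributions.items() if value > 0),
        key=lambda name: contributions[name],
        reverse=True,
    )
    return probability, drivers


if __name__ == "__main__":
    probability, drivers = explain_churn(
        {"days_since_last_login": 30, "support_tickets": 2, "months_as_customer": 10}
    )
    print(f"churn probability: {probability:.2f}, main drivers: {drivers}")
```

The list of drivers is the ‘so-what’: it tells the marketing team which lever to pull for each customer, which a bare probability cannot do.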
4. Modularise and abstract
The best products have well defined user-flows, modules and messaging to ensure the best possible experience for the user. Data science projects should be delivered in the same way.
Ensure one day a week is spent refactoring and tidying code, in the same way that your designers would spend time tweaking the product to improve user experience.
Write code and documentation in modules so that each function only performs one job. Nobody likes products that contain superfluous buttons and the same is true for code with unnecessary complexity.
Create a technical README describing how the solution works at a high level. Automate everything. Log everything. You’ll thank yourselves in a year’s time when you come to upgrade the codebase.
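One possible way to combine ‘one job per function’ with ‘log everything’ is sketched below in Python. The step names, the row fields and the 30-day inactivity cut-off are hypothetical; the logged_step decorator is just one pattern among many, built only on the standard logging and functools modules.

```python
import functools
import logging
from typing import Callable, Dict, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("churn_pipeline")

Rows = List[Dict[str, object]]


def logged_step(func: Callable[[Rows], Rows]) -> Callable[[Rows], Rows]:
    """Log how many rows go into and come out of a pipeline step."""

    @functools.wraps(func)
    def wrapper(rows: Rows) -> Rows:
        result = func(rows)
        logger.info("%s: %d rows in, %d rows out", func.__name__, len(rows), len(result))
        return result

    return wrapper


@logged_step
def drop_missing_ids(rows: Rows) -> Rows:
    """One job: remove rows that have no customer_id."""
    return [row for row in rows if row.get("customer_id")]


@logged_step
def add_inactive_flag(rows: Rows) -> Rows:
    """One job: flag customers inactive for more than 30 days (assumed cut-off)."""
    return [{**row, "inactive": row["days_since_last_login"] > 30} for row in rows]


if __name__ == "__main__":
    raw = [
        {"customer_id": "a", "days_since_last_login": 45},
        {"customer_id": None, "days_since_last_login": 2},
    ]
    print(add_inactive_flag(drop_missing_ids(raw)))
```

Each step can be tested, replaced or reordered on its own, and the log lines give you a trail to follow when a number in the final product looks wrong.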
5. Brand the solution
People don’t just buy trainers, they buy Nikes. Products that are branded correctly stick around. With data science, your model is a product with a user base and a purpose and therefore it also deserves its own branding.
This can be something simple like a memorable name and logo. Or if it’s a suite of dashboards, make sure they’re all styled in the same way and navigation between them is seamless.
It’s imperative that data scientists work closely with designers and the whole project team to ensure the model is embedded into business-as-usual processes. It takes time to sell a project into the business after completion. But if it’s branded correctly, it will sell itself.
These are the five rules that will ensure your data science project is product-driven. Get these right and your data science project will flourish every time.
This is the blog of Applied Data Science, a consultancy that develops innovative data science solutions for businesses. To learn more, feel free to get in touch through our website.