Machine Learning in the business context — how to find a viable project

CRISP-DM methodology

Business and Data Understanding

The first step in any predictive analytics project is to understand what you are trying to solve.

Whether your business problem is solvable depends heavily on the data at your disposal. Understanding the strengths and limitations of that data is important because it rarely matches the problem exactly.

The stored data is only a condensed version of the business’ reality. Much of the context is impossible to store in a database, and therefore not all problems are suited for predictive analytics. Remember that you are trying to create a model, i.e. a mathematical representation of the “business reality”. It is therefore best to look for problems that the available data can actually answer.

Ideally, the business problem should be so well defined that you can deliver your technical team a model input file, describing the target and all contributing data to that target, along with examples of them.

Example of the level at which data needs to be structured in order to do machine learning and predict a target.
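As a rough illustration of such a model input file, here is a minimal sketch in Python. The churn setting is borrowed from the Telco example in this article, but every column name and value below is hypothetical and only meant to show the shape: one row per customer, one column per input, and one column for the target.

```python
import csv
import io

# Hypothetical model input file for a churn problem: one row per customer,
# one column per input, and a final column holding the target.
MODEL_INPUT_CSV = """customer_id,months_as_customer,monthly_fee,support_calls_last_90d,churned
1001,34,299,0,0
1002,2,499,4,1
1003,18,199,1,0
"""

TARGET = "churned"  # assumed target name: 1 = customer left, 0 = customer stayed
ID_COLUMN = "customer_id"  # identifies the row, not used as an input

rows = list(csv.DictReader(io.StringIO(MODEL_INPUT_CSV)))

# Split every row into its inputs (features) and the value to predict (target).
features = [
    {name: float(value) for name, value in row.items() if name not in (ID_COLUMN, TARGET)}
    for row in rows
]
targets = [int(row[TARGET]) for row in rows]

for row, x, y in zip(rows, features, targets):
    print(row[ID_COLUMN], x, "->", y)
```

If a business problem cannot be written down in this shape, with a clear target column and concrete example rows, that is usually a sign that it needs to be decomposed further before the technical team can assess it.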

Data preparation and Modeling

The data preparation and modeling steps are most likely where most of the project’s time will be spent. It is essential to understand that these steps resemble research and development more closely than engineering.

  • Is the data drawn from a population similar to the one the model will be applied to? How does this impact the business statement?
  • Which type of modeling technique is being used? Does it meet the other requirements of the project: generalization performance, comprehensibility, speed of the application, amount of data required?
  • Should various models be tested and compared?
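The first question in the list above can be given a rough numerical check. The sketch below is one simple way to do that, not a standard test: it compares how a single categorical input is distributed in the training data and in the population the model will be applied to. The function names, the subscription types and the counts are all hypothetical.

```python
from collections import Counter


def category_shares(values):
    """Return the share of each category in a non-empty list of values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def largest_share_gap(training_values, application_values):
    """Largest absolute difference in category share between two samples."""
    train = category_shares(training_values)
    live = category_shares(application_values)
    categories = set(train) | set(live)
    return max(abs(train.get(c, 0.0) - live.get(c, 0.0)) for c in categories)


# Hypothetical data: subscription type in the training data versus
# the customers the model will actually score.
training_plan = ["prepaid"] * 70 + ["postpaid"] * 30
application_plan = ["prepaid"] * 40 + ["postpaid"] * 60

gap = largest_share_gap(training_plan, application_plan)
print(f"largest share gap: {gap:.2f}")  # largest share gap: 0.30
```

A large gap does not prove the model will fail, but it is a reason to revisit the business statement: the model may have learned mostly from customers who are unlike the ones it will be used on.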

Evaluation

There are many ways a model’s performance can be evaluated; often the metric accuracy is used when describing how good a model is.
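Accuracy on its own can flatter a model when the outcome of interest is rare. The sketch below assumes 1,000 customers of whom 3% churn (the rate used in the Telco example in this article) and scores a hypothetical "model" that simply predicts that nobody churns. The helper functions are written out by hand for illustration.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual outcome."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def recall(actual, predicted):
    """Share of actual positives (here: churners) that the predictions catch."""
    positives = sum(actual)
    caught = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    return caught / positives


# Assumed data: 1,000 customers, 3% of whom churn (1 = churned, 0 = stayed).
actual = [1] * 30 + [0] * 970

# A "model" that predicts that nobody churns.
predicts_nobody = [0] * 1000

print(accuracy(actual, predicts_nobody))  # 0.97
print(recall(actual, predicts_nobody))    # 0.0
```

The 97% accuracy looks impressive, yet this model does not identify a single churner, which is why the evaluation metric should be chosen with the business action in mind.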

Even if your business doesn’t use an ML model for a given problem, there is still a model in effect, albeit a very naive one.

From the Telco example above, the naive model could be that they believe the 3% of customers who churn are spread evenly across the customer base, i.e. the churners do not have anything in common. The action would then be to randomly incentivise customers to remain. It is this model, and its costs, that the ML model needs to be compared and evaluated against.
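The comparison against the naive model can be sketched as simple arithmetic. In the example below only the 3% churn rate comes from the Telco example; the incentive cost, the value of a retained customer, the success rate of the incentive and the 20% hit rate of the ML model are hypothetical numbers chosen purely for illustration.

```python
def expected_net_value(n_targeted, churner_share, incentive_cost, retained_value, success_rate):
    """Expected net value of offering an incentive to n_targeted customers.

    churner_share: share of targeted customers who would otherwise churn
    success_rate:  share of targeted churners the incentive persuades to stay
    """
    churners_reached = n_targeted * churner_share
    value = churners_reached * success_rate * retained_value
    cost = n_targeted * incentive_cost
    return value - cost


# All numbers below are hypothetical, except the 3% churn rate.
N_TARGETED = 1_000
INCENTIVE_COST = 10    # cost per customer contacted
RETAINED_VALUE = 500   # value of one customer who stays
SUCCESS_RATE = 0.5     # share of contacted churners who stay

# Naive model: customers are chosen at random, so 3% of them are churners.
naive = expected_net_value(N_TARGETED, 0.03, INCENTIVE_COST, RETAINED_VALUE, SUCCESS_RATE)

# ML model: assume 20% of the customers it flags are actual churners.
ml = expected_net_value(N_TARGETED, 0.20, INCENTIVE_COST, RETAINED_VALUE, SUCCESS_RATE)

print(f"naive: {naive:.0f}, ML: {ml:.0f}, uplift: {ml - naive:.0f}")
```

Under these assumed numbers the random campaign costs more than it brings in, while the targeted campaign pays for itself. The point is not the specific figures but that the ML model is judged by its uplift over the naive model, not in isolation.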

Deployment

When a model has passed all necessary evaluations, is deployed in your business application and makes predictions, the important thing is to act on those predictions.

In summary

When finding a problem suitable for an ML project, the most important thing is to frame the problem at row level. Describe the data needed to solve for the target, ideally providing examples of the data points needed. It is then easier to discuss with the technical team how viable the project is, what data is in place and whether something needs to be collected.

Problem specification guidelines

  • Precision — what is the question?
    - Does it stand on its own?
    - Decompose until one question remains
    - Can the problem be operationalized?
  • Value — Does this problem matter?
    - What’s the baseline?
    - What’s the result of the status quo?
  • Viability — If you find anything, can you act on it?
    - Are there potential solutions?
    - Can you execute or argue for implementation?

Reflection questions

  • Target — what is operationalized and modeled?
  • Input data — what data is used?
  • Business action — what’s done with the target?
  • Business objective — how can the action help the business?
  • Workflow change — how does the organization use the prediction; what changes?
  • Potential risks — are there risks involved with collecting data, modeling the target, or using the prediction?
    - Ethical / Trust — how is the business affected by poor predictions?
  • Results — what actually happens once the model is deployed?

Thanks for Reading!

Interested in seeing how a real problem has been identified, modeled and evaluated? Look no further than Vegvarsel.no, an ML approach to avoid mountain road closures.
