Why is cost-benefit analysis important before starting any data science project?

Shivanshu Aggarwal
4 min read · Aug 23, 2022

Why is cost-benefit analysis important in any data science project?

Let’s look at a case study.

A device remanufacturing factory has a new ML model that lets it skip tests on the device’s main component, the motherboard.

If the model predicts that a motherboard is faulty, the board can be scrapped directly, saving the time that would have been spent testing it. The expected benefits include higher throughput and lower labour costs per device, among many others.

With a precision of 78% and a recall of 70%, the client gave the go-ahead to deploy the model. Deployment meant altering the factory setup: changing the process steps, rerouting conveyor belts, and retraining the handling staff.

After two months in production, the model was performing as anticipated, yet the client was upset, worried, and stressed. The model was expected to bring more dollars to the table, but that didn’t happen. The client was losing money.

So the question arises: why was this happening?

The client had been confident in the model before deployment, and now he was having sleepless nights.

After two months, the client realised that no proper cost-benefit analysis had been done beforehand. Every correct prediction saved $100, but every incorrect prediction cost $400.

In other words, it takes four correct predictions to offset a single incorrect one and reach the break-even point.
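The break-even arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the client's actual analysis: the function name break_even_precision is hypothetical, and it assumes that only "faulty" predictions carry a saving or a loss.

```python
def break_even_precision(saving_per_correct: float, loss_per_incorrect: float) -> float:
    """Precision at which savings from correct predictions equal losses from incorrect ones."""
    # Break-even: saving * TP == loss * FP, so TP / FP == loss / saving.
    # Precision = TP / (TP + FP) = loss / (saving + loss).
    return loss_per_incorrect / (saving_per_correct + loss_per_incorrect)


# Figures from the case study: $100 saved per correct, $400 lost per incorrect prediction.
threshold = break_even_precision(100, 400)
print(f"Break-even precision: {threshold:.0%}")  # prints "Break-even precision: 80%"
```

Four correct predictions out of every five is a precision of 80%, so the deployed model's 78% sits just below the line.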

Why such a large difference between the two?

The motherboard is an expensive component. If the model wrongly flags a good motherboard as faulty and it is scrapped, it has to be replaced with another one, which costs far more than the model saves on a device.

For this problem, a precision of 80% is the bare minimum requirement: four correct predictions out of every five. If precision is below 80%, recall doesn’t even matter; there will be a definite loss. In fact, the greater the recall, the greater the loss.

This may sound counterintuitive.
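One way to see why is to write net savings as a function of precision and recall. The notation below is an assumption introduced for illustration, not taken from the original analysis: TP is the number of faulty boards correctly scrapped, FP the number of good boards wrongly scrapped, N the number of truly faulty boards, s = 100 the saving per correct prediction, c = 400 the loss per incorrect one, p the precision, and r the recall. It also assumes that boards predicted as good simply go through the normal tests, with no saving or loss.

```latex
\begin{align*}
\text{Net} &= s \cdot TP - c \cdot FP \\
p = \frac{TP}{TP + FP} &\;\Rightarrow\; FP = TP \cdot \frac{1 - p}{p} \\
r = \frac{TP}{N} &\;\Rightarrow\; TP = rN \\
\text{Net} &= rN \left( s - c \cdot \frac{1 - p}{p} \right) \\
\text{Net} = 0 &\iff p = \frac{c}{s + c} = \frac{400}{500} = 0.8
\end{align*}
```

The term in brackets depends only on precision. Below 80% it is negative, and recall merely scales it, so a higher recall multiplies a per-prediction loss.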

Let’s take an example to understand this better.

Let’s assume there are 1,000 faulty motherboards. The table below shows various combinations of recall and precision to illustrate the different scenarios.

1. When precision is below the break-even value

With the same precision and higher recall, there are more correct predictions. But because precision is unchanged, the ratio of correct to incorrect predictions stays the same, so incorrect predictions rise in proportion and the loss grows.

Conversely, with the same recall but lower precision, the number of correct predictions stays the same while the number of incorrect predictions rises. The ratio of correct to incorrect predictions falls, and the loss grows even larger.

2. When precision is above the break-even value

With the same precision, the ratio of correct to incorrect predictions stays the same, so any percentage increase in recall brings the same percentage increase in net savings.

Conversely, any increase in precision raises the ratio of correct to incorrect predictions, which reduces the loss from incorrect predictions and increases net savings further.
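Both scenarios can be reproduced with a short Python sketch. The function net_savings and the precision and recall values in the grid are illustrative assumptions, not the figures from the original table. The sketch assumes 1,000 faulty motherboards, $100 saved per correct prediction, $400 lost per incorrect one, and no saving or loss on boards predicted as good.

```python
def net_savings(n_faulty: int, precision: float, recall: float,
                saving: float = 100.0, loss: float = 400.0) -> float:
    """Net dollars saved, given precision and recall on the 'faulty' class."""
    # Correct predictions: faulty boards the model catches.
    true_pos = recall * n_faulty
    # Incorrect predictions: good boards scrapped, derived from precision = TP / (TP + FP).
    false_pos = true_pos * (1 - precision) / precision
    return saving * true_pos - loss * false_pos


N_FAULTY = 1000  # assumption taken from the example above

# Illustrative grid: two precisions below break-even, one at it, one above it.
for precision in (0.70, 0.78, 0.80, 0.90):
    for recall in (0.50, 0.70, 0.90):
        net = net_savings(N_FAULTY, precision, recall)
        print(f"precision={precision:.0%}  recall={recall:.0%}  net=${net:,.0f}")
```

Under these assumptions, the deployed model (78% precision, 70% recall) loses roughly $9,000 per 1,000 faulty boards. Raising recall at that precision only deepens the loss, while at 90% precision the same increase in recall raises net savings.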

A cost-benefit analysis is vital before putting any idea into practice.

The first steps of a cost-benefit analysis could be:

1. Identify the goals and objectives you’re trying to address with the proposal. What do you need to accomplish to consider the endeavour a success?

This can help you identify and understand your costs and benefits, and will be critical in interpreting the results of your analysis.

2. Determine costs and benefits

The next step is to create two lists: one with the projected costs and the other with the expected benefits of the proposed project.

Costs include not just direct costs, which are expenses directly related to producing or developing the solution, but also indirect costs, intangible costs, and opportunity costs.

Once those individual costs are identified, it’s equally important to understand the possible benefits of the proposed decision or project. Some of those benefits include:

Direct: Increased throughput and revenue

Indirect: Increased customer satisfaction and loyalty to your brand

Intangible: Improved employee morale

Though it is almost impossible to quantify every variable, a cost-benefit analysis helps you make better decisions, uncover hidden costs and benefits, and, more importantly, avoid human bias.

Cost-benefit analyses are guides to good decisions. But good decisions don’t come from blindly following numbers. They come from careful thinking.

In the end, cost-benefit analysis shouldn’t be the only tool you use to decide how to move your organization into the future. It is one of several types of economic analysis available for assessing your business, not the only one.

#datascience #machinelearning #costbenefit #artificialintelligence #ai
