Introduction to Data Envelopment Analysis in R

Bhaskarjit Sarmah
Published in Analytics Vidhya
5 min read · Sep 15, 2018

As a data scientist who helps businesses make data-driven decisions, I have always been curious about how to analyze the performance of comparable units within an organization and advise the business on how its less efficient units can improve. While researching this topic, I came across an Operations Research technique called Data Envelopment Analysis (DEA).

Data Envelopment Analysis is a performance measurement technique used to compare the performance of similar units of an organization. The units being analyzed are called Decision Making Units (DMUs). For example, we can compare all the McDonald’s outlets operating in the Delhi NCR region to find out which outlets are performing well and which are not, and then recommend actions to help the underperforming ones improve. DEA is applied in many sectors, including hospitals, banks, and universities. The technique calculates the efficiency of every DMU from a set of input and output variables (generally the most important business metrics of the organization) and then sets a benchmark. Its most important advantage is that it can handle multiple input and output variables that are generally not directly comparable to each other. DEA is very popular in Operations Research, and it uses Linear Programming to formulate and solve the problem at hand.

Let’s take an example

Now that we know what DEA is, let us solve a problem to understand the concept behind it. Imagine you are the owner of ABC Stores, a chain of lifestyle retail stores in India with six outlets (our DMUs) in Delhi, Mumbai, Bangalore, Chennai, Kolkata, and Hyderabad. You want to find out which outlets are efficient and which are not, and then use the most efficient ones as benchmarks to recommend improvements for the inefficient outlets. To keep things simple, consider a problem with two inputs and two outputs: the number of employees and management time per week as inputs, and the number of dresses sold per week and the number of accessories sold per week as outputs.

The following table shows the values for the above-mentioned input and output factors:

Fig 1: Dataset

Now the efficiency of each DMU can be calculated as the ratio of its weighted outputs to its weighted inputs.
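For this two-input, two-output case the ratio can be sketched as below. The symbols y1, y2 (dresses and accessories sold per week) and x1, x2 (number of employees and management time per week) are notation assumed here for illustration:

```latex
\[
\text{Efficiency} = \frac{u_1 y_1 + u_2 y_2}{v_1 x_1 + v_2 x_2}
\]
```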

Here, u1 and u2 are the weights of the output factors, and v1 and v2 are the weights of the input factors. But how do we calculate these weights? We need to solve a Linear Programming problem for each DMU, so with n DMUs we solve n Linear Programming problems.

Linear Programming Formulation

Since we have six DMUs in this case, we need to solve six different LP problems. I will show how to formulate the LP for one DMU; let’s take DMU1 (Delhi) as an example.

Objective Function:
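A sketch of the fractional problem for DMU1, assuming the usual DEA ratio form. The indexed symbols are notation introduced here: y_rj is output r of DMU j and x_ij is input i of DMU j, with j = 1 for Delhi. Every DMU's efficiency is limited to at most 1 under the chosen weights:

```latex
\[
\begin{aligned}
\max_{u_1, u_2, v_1, v_2} \quad & \frac{u_1 y_{11} + u_2 y_{21}}{v_1 x_{11} + v_2 x_{21}} \\
\text{subject to} \quad & \frac{u_1 y_{1j} + u_2 y_{2j}}{v_1 x_{1j} + v_2 x_{2j}} \le 1, \quad j = 1, \dots, 6, \\
& u_1, u_2, v_1, v_2 \ge 0.
\end{aligned}
\]
```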

Since our objective function is fractional, the problem is not yet a linear program. Multiplying all the weights by the same positive constant leaves the ratio unchanged, so we can set the denominator equal to 1 and treat it as a constraint. The modified LP problem looks like this:

Objective Function:
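A sketch of the linearized problem for DMU1, under the assumed notation that y_rj is output r of DMU j and x_ij is input i of DMU j (j = 1 for Delhi). With DMU1's weighted inputs fixed at 1, the objective is just its weighted outputs, and each "efficiency at most 1" condition becomes a linear inequality:

```latex
\[
\begin{aligned}
\max_{u_1, u_2, v_1, v_2} \quad & u_1 y_{11} + u_2 y_{21} \\
\text{subject to} \quad & v_1 x_{11} + v_2 x_{21} = 1, \\
& u_1 y_{1j} + u_2 y_{2j} - v_1 x_{1j} - v_2 x_{2j} \le 0, \quad j = 1, \dots, 6, \\
& u_1, u_2, v_1, v_2 \ge 0.
\end{aligned}
\]
```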

This is the final LP formulation for DMU1 (Delhi). Similarly, we need to formulate the LP for the other DMUs as well.
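To make the formulation concrete, here is a minimal sketch that builds and solves this kind of LP for every outlet with the lpSolve package. The numbers are placeholder values invented for illustration, not the Fig 1 data, and the column names and the helper ccr_efficiency are hypothetical:

```r
library(lpSolve)

# Placeholder data, NOT the Fig 1 values: one row per outlet (DMU).
outlets <- c("Delhi", "Mumbai", "Bangalore", "Chennai", "Kolkata", "Hyderabad")
X <- matrix(c(12, 40,
              10, 30,
               8, 35,
              11, 28,
               9, 32,
              13, 45),
            ncol = 2, byrow = TRUE,
            dimnames = list(outlets, c("employees", "mgmt_time")))
Y <- matrix(c(100, 60,
              120, 70,
               90, 80,
              110, 50,
              105, 65,
               95, 55),
            ncol = 2, byrow = TRUE,
            dimnames = list(outlets, c("dresses", "accessories")))

# Hypothetical helper: solves the LP for DMU k.
# Decision variables are c(u1, u2, v1, v2); lpSolve treats all of them as >= 0.
ccr_efficiency <- function(k, X, Y) {
  n <- nrow(X)
  objective <- c(Y[k, ], rep(0, ncol(X)))   # maximise weighted outputs of DMU k
  normalise <- c(rep(0, ncol(Y)), X[k, ])   # weighted inputs of DMU k equal 1
  ratios    <- cbind(Y, -X)                 # weighted outputs - weighted inputs <= 0
  sol <- lp(direction    = "max",
            objective.in = unname(objective),
            const.mat    = unname(rbind(normalise, ratios)),
            const.dir    = c("=", rep("<=", n)),
            const.rhs    = c(1, rep(0, n)))
  stopifnot(sol$status == 0)                # status 0 means an optimum was found
  sol$objval
}

efficiency <- sapply(seq_along(outlets), ccr_efficiency, X = X, Y = Y)
names(efficiency) <- outlets
print(round(efficiency, 4))
```

An outlet with a score of 1 lies on the efficient frontier; scores below 1 indicate inefficiency relative to the others.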

Data Envelopment Analysis Implementation in R

There are numerous packages in R, such as lpSolve, Benchmarking, and FEAR, that can be used for DEA. In this example, I am using the rDEA package.

Please note that I have used the same dataset (dea) in the code below as shown in Fig 1.
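As a rough sketch of what such a call can look like with rDEA: the data frame below is filled with placeholder values (not the Fig 1 data), its column names are hypothetical, and the choice of an input-oriented model with constant returns to scale is an assumption made to match the LP formulation described earlier.

```r
library(rDEA)

# Placeholder data, NOT the Fig 1 values; column names are hypothetical.
dea <- data.frame(
  outlet      = c("Delhi", "Mumbai", "Bangalore", "Chennai", "Kolkata", "Hyderabad"),
  employees   = c(12, 10, 8, 11, 9, 13),
  mgmt_time   = c(40, 30, 35, 28, 32, 45),
  dresses     = c(100, 120, 90, 110, 105, 95),
  accessories = c(60, 70, 80, 50, 65, 55)
)

X <- dea[c("employees", "mgmt_time")]    # inputs
Y <- dea[c("dresses", "accessories")]    # outputs

# Every outlet is scored against a frontier built from all six outlets.
# rDEA::dea is called with its namespace because the data frame is also named `dea`.
model <- rDEA::dea(XREF = X, YREF = Y, X = X, Y = Y,
                   model = "input", RTS = "constant")

scores <- data.frame(outlet = dea$outlet,
                     efficiency = round(as.numeric(model$thetaOpt), 4))
print(scores)
print(round(model$lambda, 4))   # one row per outlet, one column per benchmark outlet
```

Here thetaOpt holds the efficiency score of each outlet and lambda holds the weights placed on the benchmark outlets.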

Results and Interpretation

It is evident from the above table that all outlets except Delhi and Hyderabad are efficient. So what improvements should we recommend to Delhi and Hyderabad (the inefficient ones) so that they can perform on par with the efficient outlets? This can be done by using shadow prices (the lambda values from the above table; they are the variables related to the constraints limiting the efficiency of each unit to be no greater than 1). For the inefficient DMUs Delhi and Hyderabad, the benchmark DMUs are Mumbai, Bangalore, Chennai, and Kolkata; for Delhi, the corresponding shadow prices are 0.6435, 0.0730, 0, and 0 respectively. Therefore, the recommendation for Delhi is as follows:

The Delhi DMU is overusing its employees by 9.25 units, and it is spending 6.63 more hours of management time than its efficient benchmark DMUs. It should therefore reduce its number of employees and management time per week by those amounts. A similar comparison can be done for the Hyderabad DMU.
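One way to read this recommendation is as the gap between Delhi's current inputs and the lambda-weighted combination of its benchmark outlets' inputs. The sketch below illustrates that arithmetic under this assumption. The lambda values are the ones reported above, but all input figures are placeholders (not the Fig 1 data), so the printed gaps will not reproduce 9.25 and 6.63:

```r
# Shadow prices (lambdas) for Delhi, as reported in the article.
lambda <- c(Mumbai = 0.6435, Bangalore = 0.0730, Chennai = 0, Kolkata = 0)

# Placeholder weekly inputs of the benchmark outlets, NOT the Fig 1 values.
peer_inputs <- matrix(c(20, 10,
                        30, 40,
                        25, 15,
                        18, 12),
                      ncol = 2, byrow = TRUE,
                      dimnames = list(names(lambda), c("employees", "mgmt_time")))

# Placeholder current inputs for Delhi, also NOT from Fig 1.
delhi_inputs <- c(employees = 25, mgmt_time = 16)

# Target inputs: lambda-weighted sum of the benchmark outlets' inputs.
target <- drop(lambda %*% peer_inputs)

# Excess: how much of each input Delhi uses beyond its target.
excess <- delhi_inputs - target

print(round(rbind(current = delhi_inputs, target = target, excess = excess), 2))
```

Outlets with a lambda of 0 (Chennai and Kolkata here) do not contribute to Delhi's target.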

Final Notes

DEA is a very powerful technique for performance measurement and is widely used across industries. Try it to find the efficiencies of any units that pique your interest. You could even analyze the performance of IPL teams this season and recommend improvements for the next season.

Bhaskarjit Sarmah
Analytics Vidhya

A data science enthusiast who is passionate about solving real-world problems using data.