Price Elasticity of Demand with a Simple Linear Regression (Part II)

Ileana Cabada
Geek Culture
Published in
7 min readMay 23, 2021
Image 1 by the Author

“How sensitive is Sales Demand, if product price changes?” “How much my sales will increase, if I lower my price?”

Part 1 of the article provides you an easy explanation of Price elasticities by giving you detail explanation of what is price elasticity and how you can interpret the data. In case, you want to review the concepts, I would suggest to click the following link

Part 2 of the article, which it is the one you are reading now. It gives you full coding and explanation of how to build your own price elasticity model by using Linear Regression in Python.

So, let’s begin the fun part :)

Data Collection:

What data to collect?

We know that without data there is no model, the first and very important part is to know what data you would need.

This model would analyze the price elasticities of several electronic products in one sole category “Laptop, computer” for an e-commerce platform called “Bestjam.com”. Haven’t hear about the platform? no worries, nobody has ever hear about it :D, this is because the data is completely made up for this model.

In order to analyze the price elasticities, we need to have products where they have historical price and quantity sold data (sales demand) , the price need at least to have more than one price variations of the product price with their respective quantity sold (sales demand) in order to be able to observe the sales demand reaction towards the product price changes.

Python libraries used are:

We would use following python libraries: pandas, numpy, matplotlib and statsmodels.

Here, I imported pandas, numpy and matplotlib on the beginning but in the following steps you will see the most important from all, which it is statsmodels used mainly for the Linear Regression model.

Table 1

In our data frame as you see above, we have what we need to make our model. The main metrics that you need are product name, price and sales count, which reflects the quantity of sales. The other features are additional to the analysis that I personally considered important. Following, I explain the reasoning behind each feature.

Data Frame Features explained:

Date_imp: Date time impression when the data was captured

Category_name: Category name is a feature included in order to analyze prices in sole groups such as Laptops, Speakers and others

name: Name of the product

disc_price: Price of the product. In case the product has a discount, the price value included is with the price already discounted

merchant: merchant is the platform that the product is located. As a recommendation, analyze product prices from a sole platform due that consumer traffic might vary from platform to platform and the price elasticity analysis might not be very accurate

condition: product condition. As a recommendation, analyze product prices from the same condition. Otherwise, this might affect your results due that product prices with condition as “used” are usually cheaper than new product prices.

Sales_count: sales count is the sales demand of the products with their respective prices

From the data, only one merchant was selected as mentioned above in order to not have any incongruences, it might be that other merchants have more user traffic (market share) than the others and as a result our price elasticities can be altered. We choose “Bestjam.com” e-commerce platform, and condition of the product as “New”. Then we checked all category values that we have.

Figure 1

As per see, “Laptop, computer” category is the biggest category. For this reason in this occasion, we would analyze the price elasticities of this category.

Table 2

Now that we filtered our data to “Laptop, computer” category and condition as “New” , we select following columns in our data frame for analysis : Date_imp, name, disc_price and Sales_count.

Data Preparation:

For building the linear regression modeling, we set “Sales_count”(y value) as our dependent variable and as an independent variable “disc_price”(x value).

Due that the purpose of this model is to estimate the price elasticities of multiple products in one category. It captured the entire “x_values”, which are the product prices within its date impression and name, and the entire “y_values”, which are the quantity sold of the products within its date impression and name in two separated data frames, formatted as follows:

Independent variables (disc_price) as x_values:

Table 3

Dependent variable (Sales Count) as y_values:

Table 4

Now, that the data is formatted in the correct manner for our Linear Regression. Let’s build our Linear Regression model!

Price Elasticity Model using Linear Regression:

As we might know from the Part 1 of this article, price elasticity calculation is the following:

PED:

Percentage change in Quantity (Sales Demand) / percentage change in Price

which it is rewritten in the use of Linear Regression as follows:

PED in Linear Regression:

Coefficient (Slope) * Price mean / Quantity mean

As mentioned for the Linear Regression, statsmodels library is used. Below you see that in the for loop, data was transformed and appended into several data frames with x values and y values per product name. For later, fitting the Linear Regression with each product data frame and their respective x and y values.

Table 5

Further, there is an introduction to each metric calculated for you to take it into account when you analyze or build your own price analysis.

Note: I heartily suggest you to check the t-stats and p-values from your price elasticities.

Final Data Frame Metrics Explanation:

price_elasticity: price elasticity per product

t_score: equals to t-stats, which indicates the significance of the findings. The closer is to 0, the more likely there is no significance. The greater the t-stats, whether positive or negative, the greater the significance of the findings

coefficient_pvalue: tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis

slope (coefficient):is the ratio of the “price change” to the “quantity sold change” between (any) two distinct points on a line

price_mean: average price per product

quantity_mean: average quantity sold per product

intercept: equals the quantity of X when all other variables = 0

Data Visualization: Divergent Plot for Price Elasticities:

As a bonus, I am adding a data visualization function that it would help you to present your price elasticities and how you can interpret the data from your visualization :)

Above, this built-in function would help you for the visualization of price elasticities. Divergent plot is used because it gives the reader a more clear overview between negative and positive price elasticities for price analysis :).

As per see from Part.1 of Price Elasticity of Demand using Linear Regression in Python:

Elasticities between 0- 1 are considered inelastic, as a result the products from rank 9 to 15 are inelastic.

Negative Price Elasticities are located from rank 1 to 8 and this would tell us the likelihood of when price decreases how likely sales demand would increase or vice versa.

For instance, let’s analyze Lenovo Flex located in rank 5 with a negative price elasticity of -2.22:

A 10% price decrease in Lenovo Flex, it increases sales demand by 22.2% or a 10% price increase in Lenovo Flex, it decreases sales demand by 22.2%

Now you go it, congratulations to come this far! :)

Thanks for reading!

If you want to continue learning more about price strategies and how to spot your price competitors, I recommend you following article to read:

--

--