Contribution of gravity models in a business expansion plan

Marta Borrajo
Published in Geoblink Tech blog
5 min read · Jan 20, 2022

Many approaches to building predictive sales models exist. The evolution and accessibility of Machine Learning techniques across the data community have made ML-based models one of data scientists’ first choices.

However, other modelling techniques have also performed very well, not only in accuracy but also in explainability and in their ability to correctly account for effects such as cannibalization.

Gravity models are one of these alternatives. Over half a century ago, David L. Huff, a distinguished academic in geography and marketing, demonstrated the significant contribution that gravity model techniques can make to solving retail network problems.

In this post we explain the fundamentals of this model and some generalizations and improvements developed at Geoblink.

The model

This model estimates the probability that a customer patronizes a store; in other words, how likely it is that a customer will bring business to that store. This probability is simply the potential, or impact, of a store on a customer divided by the sum of the impacts of all stores in the neighborhood on that customer. How we define a store’s impact is what gives this approach its gravity name!

In the same way that a planet attracts an object with a potential that is proportional to its mass and decays with distance, a store attracts people with a potential that is proportional to its attractiveness and also decays with the distance between them. Our ability to determine this attractiveness and the distance decay function are two of the three cornerstones of the model.
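The definition above can be sketched in a few lines of Python. This is a minimal illustration, not Geoblink's implementation: the function name, the input layout and the decay power of 2 are assumptions made for the example.

```python
def huff_probabilities(attractiveness, distances, decay_power=2.0):
    """Return the patronage probability of each store for one customer.

    attractiveness: positive attractiveness value of each store
    distances: positive distance from the customer to each store
    decay_power: exponent of the distance decay (assumed value)
    """
    # Potential of each store on the customer: attractiveness / distance**power.
    potentials = [a / d ** decay_power for a, d in zip(attractiveness, distances)]
    # Each store's share of the total potential felt by the customer.
    total = sum(potentials)
    return [p / total for p in potentials]


# Two equally attractive stores; the second one is twice as far away.
probs = huff_probabilities([100.0, 100.0], [1.0, 2.0], decay_power=2.0)
print(probs)  # [0.8, 0.2]
```

With a decay power of 2, doubling the distance divides the potential by four, so the closer store captures 80% of the probability.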

Potential created by a store

Traditionally, the store size is used as the attractiveness. At Geoblink, thanks to the variety of variables we have available, we can enrich this definition and use a combination of different features instead. In this way, the attractiveness is determined not only by characteristics such as facade or size, but also by pedestrian traffic in the street or the number of nearby attractors. The best combination of features is the one that minimizes the error of the model, and it is obtained during the fitting process.
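The post does not state how the features are combined, so the sketch below assumes one common choice: a weighted product of the features, computed in log space. The feature names and weights are hypothetical; in practice the weights would come out of the fitting process.

```python
import math


def attractiveness(features, weights):
    """Combine store features into a single attractiveness score.

    Assumed form (not taken from the post): A = prod(x_k ** w_k),
    evaluated as exp(sum(w_k * log(x_k))). All features must be positive.
    """
    log_score = sum(weights[name] * math.log(value)
                    for name, value in features.items())
    return math.exp(log_score)


# Hypothetical weights and features, for illustration only.
weights = {"size_m2": 1.0, "facade_m": 0.5, "footfall": 0.5}
store_a = {"size_m2": 100.0, "facade_m": 4.0, "footfall": 900.0}
store_b = {"size_m2": 100.0, "facade_m": 4.0, "footfall": 100.0}

score_a = attractiveness(store_a, weights)
score_b = attractiveness(store_b, weights)
print(score_a / score_b)  # approximately 3: nine times the footfall, weight 0.5
```

A multiplicative form keeps the score positive and makes each weight read as an elasticity, but an additive or learned combination would fit the same description.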

The decay function is modeled as a power law (the inverse of the distance raised to a power). There are a huge number of examples of power law distributions both in nature and in human interactions, from epidemic or tumor growth to information dissemination in social media or financial market fluctuations. This behavior is also found in human mobility patterns and observed in our training dataset, which allows us to choose the best value for the decay power.
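As an illustration of how a decay power could be read off observed data, the sketch below fits a straight line to the logarithm of a visit rate against the logarithm of distance. The post does not describe the actual fitting procedure, so this method, the function name and the synthetic data are all assumptions.

```python
import math


def fit_decay_power(distances, visit_rates):
    """Estimate b in rate ~ c / distance**b by ordinary least squares
    on log(rate) versus log(distance). Returns b (positive for a decay)."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(r) for r in visit_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    # The slope of the log-log line is -b for a power-law decay.
    return -cov / var


# Synthetic, noise-free data generated with a decay power of 1.5.
distances = [0.5, 1.0, 2.0, 4.0, 8.0]
visit_rates = [10.0 / d ** 1.5 for d in distances]
power = fit_decay_power(distances, visit_rates)
print(round(power, 6))  # 1.5
```

On real, noisy data a log-log regression is only a rough estimate; fitting the exponent jointly with the rest of the model, as the post suggests, is the more robust option.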

Moreover, in our implementation we generalize this decay function to be store type and customer type dependent instead of just distance dependent. The underlying reasons are:

  • The influence area, or range of action, of a store in a shopping mall is not the same as that of a street store. The former is much bigger and therefore requires a smaller value for the power.
  • The mobility patterns in rural areas are usually not the same as the ones we find in big cities. In general, people living in places with fewer services are willing to travel farther.

A good way to include this effect is by making the decay function dependent on the store type and on the population density around the customer. With those new degrees of freedom we are able to model different types of stores and very different human mobility patterns together. This means we are able to include interactions between all stores and all customers without assuming a common behavior.
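A minimal sketch of such a generalized decay could look as follows. The base exponents, the density threshold and the size of the rural adjustment are hypothetical values chosen only to show the mechanism; in a real model they would be fitted.

```python
# Hypothetical base exponents: a mall store reaches farther,
# so it gets a smaller exponent than a street store.
BASE_POWER = {"street": 2.0, "mall": 1.5}


def decay_power(store_type, population_density):
    """Return the decay exponent for a store type and the population
    density (inhabitants per km2) around the customer."""
    power = BASE_POWER[store_type]
    if population_density < 500.0:  # hypothetical rural threshold
        power -= 0.25  # rural customers travel farther: slower decay
    return power


def potential(attractiveness, distance, store_type, population_density):
    """Store potential with a store- and customer-dependent decay."""
    exponent = decay_power(store_type, population_density)
    return attractiveness / distance ** exponent


# Same attractiveness and distance, different store and customer contexts.
street_urban = potential(100.0, 4.0, "street", 5000.0)  # 100 / 4**2.0
mall_urban = potential(100.0, 4.0, "mall", 5000.0)      # 100 / 4**1.5
street_rural = potential(100.0, 4.0, "street", 100.0)   # 100 / 4**1.75
print(street_urban, mall_urban, street_rural)
```

At the same distance, the mall store and the rural customer both end up with a higher potential than the urban street case, which is the behavior the two bullet points describe.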

Moreover, mobile traffic data allows us to use the real flow of people from one area to another, without needing to assume a fixed function. Adding this information lets us better describe areas with different mobility patterns (due to orography, for example) and, more importantly, describe the sales of stores where a large proportion of buyers visit the area not because of proximity but for other reasons (attractors, trading areas, etc.).

The third cornerstone of the gravity model is the estimation of the total consumption of the goods our stores sell. To continue the analogy with gravity, up to now we have talked about the potential created by the planet. The force experienced by a nearby spaceship would be proportional to that potential multiplied by the spaceship’s mass. In the Huff model, the interaction force between the store and the customer is simply the store potential explained above multiplied by the customer’s consumption capacity. At Geoblink, the exclusive licensing of BBVA data, together with other data sources, allows us to estimate this available market with a significant degree of precision.

In a way, we can see the Huff model as one that splits customers’ total consumption of a good among all the stores that sell that good. The more potential a store has for the buyers, the bigger the slice of the cake it takes.
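The split described above can be sketched as follows: each customer area's consumption is distributed among the stores in proportion to their patronage probabilities. The function name, the data layout and the figures are assumptions made for the example.

```python
def expected_sales(consumption, attractiveness, distances, decay_power=2.0):
    """Split each area's consumption among stores by Huff probability.

    consumption[i]: total spend of customer area i on the good
    attractiveness[j]: attractiveness of store j
    distances[i][j]: distance from area i to store j
    """
    sales = [0.0] * len(attractiveness)
    for spend, row in zip(consumption, distances):
        potentials = [a / d ** decay_power for a, d in zip(attractiveness, row)]
        total = sum(potentials)
        for j, p in enumerate(potentials):
            # Store j receives its probability share of this area's spend.
            sales[j] += spend * p / total
    return sales


# Two customer areas, two equally attractive stores (hypothetical figures).
consumption = [1000.0, 500.0]
attractiveness = [100.0, 100.0]
distances = [[1.0, 2.0], [2.0, 1.0]]
sales = expected_sales(consumption, attractiveness, distances)
print(sales)  # [900.0, 600.0]
```

The store sales always add up to the total consumption, which is what makes the model a split of the available market rather than an independent forecast per store.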

Consumption split between points of sale

One of the benefits of this approach is that it allows us to simulate nearly any scenario we can think of, such as the opening of a new store, the opening of several stores (either at the same time or consecutively) or the acquisition of a competitor, and to estimate the impact of these actions on our network sales.
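As a sketch of such a simulation, the example below compares expected sales before and after opening a third store and reads the cannibalization off the difference. All names and figures are hypothetical, and the fixed decay power of 2 is an assumption.

```python
def expected_sales(consumption, attractiveness, distances, decay_power=2.0):
    """Split each area's consumption among stores by Huff probability."""
    sales = [0.0] * len(attractiveness)
    for spend, row in zip(consumption, distances):
        potentials = [a / d ** decay_power for a, d in zip(attractiveness, row)]
        total = sum(potentials)
        for j, p in enumerate(potentials):
            sales[j] += spend * p / total
    return sales


consumption = [1000.0, 500.0]  # spend of two customer areas

# Scenario A: the current network with two stores.
before = expected_sales(consumption, [100.0, 100.0],
                        [[1.0, 2.0], [2.0, 1.0]])

# Scenario B: a third store opens close to the first customer area.
after = expected_sales(consumption, [100.0, 100.0, 100.0],
                       [[1.0, 2.0, 1.0], [2.0, 1.0, 2.0]])

# Sales lost by each existing store to the new one.
cannibalized = [b - a for b, a in zip(before, after)]
new_store_sales = after[2]
print(cannibalized, new_store_sales)
```

Because total consumption is fixed in this sketch, everything the new store sells is taken from the existing stores; if some of those stores belong to competitors, only part of that amount is cannibalization of our own network.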

Summary

The Huff Model is a spatial interaction model which calculates probabilities of consumers going to stores using a gravity-like interaction. This model is able to correctly describe the intimate interplay between the buyers, our stores and the competitor stores.

The key to a successful model is to correctly describe the attractiveness of the store, the available market and the mobility patterns of the buyers.
