# Part 2: Greater London

In the first part of this series, which can be found here, I looked at the store and population distribution at the UK national level.

In part 2, I have concentrated the analysis on London, UK, principally because I live and work there. I have also enhanced the spatial distribution analysis by moving from a cumulative analysis to a density analysis, using either stores per square kilometre or population per square kilometre.

In part 3, I will analyse the Kingston and Wimbledon area of London, aiming in particular to account for the success of the Tesco Extra store at New Malden.

The theoretical expectation for these distributions, according to the Huff model and the law of retail gravity, is that they should have power-law forms.
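For reference, the Huff model is commonly written in the form below. The symbols are not taken from this analysis and are only the usual textbook notation: $P_{ij}$ is the probability that a consumer at location $i$ shops at store $j$, $S_j$ is a measure of the store's attractiveness (often floor area), $d_{ij}$ is the distance between them, and $\alpha$ and $\beta$ are fitted exponents.

$$
P_{ij} = \frac{S_j^{\alpha} \, d_{ij}^{-\beta}}{\sum_{k} S_k^{\alpha} \, d_{ik}^{-\beta}}
$$

The $d_{ij}^{-\beta}$ term is the distance decay, and it is this power-law dependence on distance that motivates looking for power-law forms in the measured distributions.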

I found QGIS software fantastic for analysing and displaying GIS data and inspiring further exploration.

By opening the dataset in Tableau, I can obtain lovely, clear maps of both the distribution of stores and the population distribution around London.

Using matplotlib, geopandas and basemap has the advantage that I can now bring to bear the tools offered by Python, including haversine, numpy, scipy and scikit-learn, to analyse the distributions.

Using shapefiles for Greater London postcodes I’m able to use sjoin to select only those stores in the Greater London area. Then I can use the Haversine function to measure one-kilometre increments from central London and subsequently measure the density of the stores at increasing distances.
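As a rough illustration of the density measurement, the sketch below counts points in one-kilometre rings around a centre and divides each count by the ring's area. It assumes the stores have already been filtered to Greater London. The hand-written `haversine_km` function stands in for the haversine package, and the centre point and store coordinates are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ring_densities(points, centre, max_km):
    """Count points per 1 km ring around centre and divide by each ring's area."""
    counts = [0] * max_km
    for lat, lon in points:
        d = haversine_km(centre[0], centre[1], lat, lon)
        ring = int(d)  # ring k covers distances in [k, k + 1) km
        if ring < max_km:
            counts[ring] += 1
    # Area of the ring between radii k and k + 1 is pi * ((k + 1)^2 - k^2) = pi * (2k + 1)
    return [c / (math.pi * (2 * k + 1)) for k, c in enumerate(counts)]


# Hypothetical centre (roughly Charing Cross) and made-up store coordinates.
CENTRE = (51.5074, -0.1278)
stores = [(51.5080, -0.1280), (51.5200, -0.1000), (51.4500, -0.2000)]
densities = ring_densities(stores, CENTRE, max_km=10)  # stores per square km, ring by ring
```

The same function works for population if each point is repeated or weighted by its head count; that weighting is left out here to keep the sketch short.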

Using geopandas and shapefiles I can plot a choropleth map of the population of Greater London. The haversine technique again enables me to measure how the population distribution depends on the distance from central London.

By plotting the choropleth map of population and the retail locations for Greater London in the same figure we get an idea of how the two distributions are related.

Using statsmodels and seaborn I can calculate and plot the correlation between the retail and population distributions. The two distributions are strongly correlated, with an R-squared of 0.935.
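To make the R-squared figure concrete, here is a minimal sketch of how it can be computed by hand for a straight-line fit with an intercept, where it equals the squared Pearson correlation. This is not the statsmodels code used for the analysis, and the input values in the example are invented.

```python
def r_squared(x, y):
    """Coefficient of determination for the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    # For simple linear regression, R-squared is the squared correlation coefficient.
    return sxy ** 2 / (sxx * syy)


# Invented per-ring densities, purely for illustration.
population_density = [1.0, 2.0, 3.0, 4.0]
store_density = [1.0, 2.0, 1.0, 2.0]
score = r_squared(population_density, store_density)
```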

A power law trend line fits well to both the population spatial distribution and the retail store spatial distribution, confirming the utility of the Huff model and the retail gravity law for the actual measured store and population distributions.
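One common way to fit a power-law trend line is a straight-line least-squares fit on log-log values, sketched below. This is an assumption about the method rather than a record of how the trend lines here were produced, and the function name `fit_power_law` and the sample data are hypothetical.

```python
import math


def fit_power_law(distances, densities):
    """Fit density = c * distance ** (-k) by least squares on log-log values.

    Returns (c, k). All inputs must be strictly positive, so rings with
    zero distance or zero density have to be dropped beforehand.
    """
    lx = [math.log(d) for d in distances]
    ly = [math.log(v) for v in densities]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly)) / sum(
        (a - mean_x) ** 2 for a in lx
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data generated from an exact power law, to check the fit recovers it.
sample_distances = [1.0, 2.0, 4.0, 8.0]
sample_densities = [100.0 * d ** -1.5 for d in sample_distances]
c, k = fit_power_law(sample_distances, sample_densities)
```

Fitting in log space weights relative rather than absolute errors, which may or may not suit a given dataset; a non-linear fit such as `scipy.optimize.curve_fit` is an alternative.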

Written by

## David Horgan

#### I am a theoretical physicist with a data science background. At present, I am developing a UK retail market using ABM, ML and computational econometrics.
