How to do an interactive map using Leaflet in R: an example with the Marginalization Index Level by municipality in Sonora, Mexico (2010–2020)

Santiago Robles Tamayo
MCD-UNISON
Published in
13 min readDec 1, 2022

The code in this article is based on the work of Luis Armando Moreno Preciado, a senior classmate at the Graduate Data Science Program of the University of Sonora. He plotted an interactive map showing the Marginalization Index in Sonora, in an urban and municipal level, in 2020; here we focus on one interactive map on a municipal level, for the same variable, in 2010 and 2020. I’d also like to thank Luis for his key roll as editor for this text. For more examples about data visualization in Sonora and other regions of Mexico done in R, visit his Medium https://medium.com/@dogomoreno or webpage https://www.luisarmandomoreno.com/.

Overview

After describing what is a Marginalization Index and what is Leaflet, we go through the steps necessary to create and interactive map on R using Leaflet. First, we install and activate the libraries needed, respectively. Then we download the data from Mexican Government URL sources for both 2010 and 2020. After that, we tidy the data, filtering the regions and variables of our interest, as well as creating a shape object when merging the data with a shape file of the State of Sonora. Later, we create an object that contains the color palette we will use to visualize the different levels of marginalization in our maps. Before plotting, we code the lines that make the popups when clicking on each municipality’s polygon. Finally, we plot the map adding the required layers, and make a simple descriptive analysis of the marginalization level by municipality between the two years. The last two parts consists of recommended online courses on the topic and a series of references.

What is the Marginalization Index?

According to the National Population Council of Mexico (CONAPO, by its acronym in Spanish), it is a number that can contrast regions, in different geographical units — state, municipality, urban area — according to the global impact of the lack of services, such as education, inadequate housing conditions, households with a minimum wage and lack of goods. Other similar indexes are the Social Exclusion Index and the Multidimensional Deprivation Index, this one usually used for urban areas.

What is Leaflet?

Popular for making maps in multiple applications, Leaflet is “the leading open-source JavaScript library for mobile-friendly interactive maps. Weighing just about 42 KB of JS, it has all the mapping features most developers ever need”. (Leaflet, 2022). It was developed by Volodymyr Agafonkin on May 13, 2011. Though initially used in JavaScript, the library attracted attention on both R and Python users. In the case of R, it can customize maps installing and setting the “leaflet” package and library.

Installation of libraries

After setting the directory or creating a new RProject, we need to install the following libraries. Consider that some of them only work on the most recent version of R, so be sure to check your R version on Tools > Global Options…

```{r install libraries}

install.packages("magrittr")
install.packages("readxl")
install.packages("tidyverse")
install.packages("htmltools")
install.packages("htmlwidgets")
install.packages("rgdal")
install.packages("rgeos")
install.packages("rcartocolor")
install.packages("leaflet")

```

After successfully installed, all packages must be activated using the command library().

```{r Activate libraries}

library(magrittr)
library(readxl)
library(tidyverse)
library(htmltools)
library(htmlwidgets)
library(rgdal)
library(rgeos)
library(rcartocolor)
library(leaflet)

```

Every library has an specific function: magrittr is used to apply the “pipe” function, better known as %>%, to simplify when adding new features to an object; the command readxl will helps us get data out of an .xls or .xlsx file; the tidyverse library is necessary for data manipulation; we will use htmltools to add html features to our interactive map; htmlwidgets is used when converting our map to an html document; when using geographical files, we will execute both rgdal and rgeos; rcartocolor is for coloring our map; and leaflet is to create interactive maps.

Data download

There are multiples useful sources of data online to create maps. At an international level, there are the World Bank, the Organization for Economic Cooperation and Development (OECD), the International Monetary Fund, etc. At a national or regional scale, there are national institutions, such as the USA National Census Bureau, the Bureau of Economic Analysis, the National Institute of Geography, Statistics and Information of Mexico, etc. The data can be found in different formats, but the most common are Excel files, as it is in our case.

Since the data in our case, for both years — 2010 and 2020 — , is arranged in Excel files located at the web page of the National Population Council of Mexico, we use the read_excel command to extract the data from the second sheet of both files in the URL.

```{r Data Download}

url_2010 = "http://www.conapo.gob.mx/work/models/CONAPO/Marginacion/Datos_Abiertos/Municipio/IMM_DP2_2010.xlsx"
p1f <- tempfile()
download.file(url_2010, p1f, mode="wb")
data2010 <- read_excel(path = p1f, sheet = 2)
View(data2010)

url_2020 = "http://www.conapo.gob.mx/work/models/CONAPO/Marginacion/Datos_Abiertos/Municipio/IMM_2020.xls"
p1f <- tempfile()
download.file(url_2020, p1f, mode="wb")
data2020 <- read_excel(path = p1f, sheet = 2)
View(data2020)

```

Once we saved our data in two data frames, data_2010 and data_2020, we can use the command View() to look at its columns and structure. Both are data frames of size 2456 rows by 17 columns, since they carry the data from all municipalities of Mexico in different variables, which are:

· CVE_ENT: Id by Mexican State in alphabetical order, for all 32 states, where 01 stand for Aguascalientes and 32 for Zacatecas.

· NOM_ENT: Name of the Mexican State

· CVE_MUN: Id by Mexican Municipality in alphabetical order, where the two first digits correspond to the Id by Mexican State (CVE_ENT).

· NOM_MUN: Municipality name

· POB_TOT: Total population

· ANALF: illiterate percentage of people 15-year-old or older.

· SBASC: percentage of people 15-year-old or older without basic education

· OVSDE: percentage of occupants in inhabited private homes without drainage or toilets.

· OVSEE: percentage of occupants in inhabited private homes without electricity

· OVSAE: percentage of occupants in inhabited private homes without piped water

· OVPT: percentage of occupants in inhabited private homes with a dirt floor

· VHAC: percentage of private homes with overcrowding

· PL.5000: percentage of population living in localities with less than 5,000 inhabitants

· PO2SM: percentage of employed population with income up to 2 minimum wages (one minimum wage in Mexico is $185.56 pesos = $9.57 dollars, as of 2022).

· IM_2010: Marginalization Index, 2010

· GM_2010: Level of Marginalization, 2010

· IMN_2010: Normalized Marginalization Index, 2010. The lower its value, the greater is the degree of multidimensional depravation.

· IM_2020: Marginalization Index, 2020. The lower its value, the greater is the degree of multidimensional depravation.

· GM_2020: Level de Marginalization, 2020, which ranges are “Very low”, “Low”, “Medium”, “High” and “Very high”.

· IMN_2020: Normalized Marginalization Index, 2010. The lower its value, the greater is the degree of multidimensional depravation.

Still, only some of these columns (variables) are of our interest since we’ll plot the level of Marginalization and create a popup with information of each municipality.

Tidying data

Despite having already a data frame on a regional scale, we are only interested in the municipalities of the State of Sonora and, particularly, to measure the level of Marginalization. To do so, first we need to create a vector with the levels of Marginalization. Before that, to have coherency in the language used in our map, we change the names of the levels of Marginalization from Spanish to English.

```{r Changing categorical values from Spanish to English}

data2010$GM_2010[data2010$GM_2010 == 'Muy bajo'] <- 'Very low'
data2010$GM_2010[data2010$GM_2010 == 'Bajo'] <- 'Low'
data2010$GM_2010[data2010$GM_2010 == 'Medio'] <- 'Medium'
data2010$GM_2010[data2010$GM_2010 == 'Alto'] <- 'High'
data2010$GM_2010[data2010$GM_2010 == 'Muy alto'] <- 'Very high'


data2020$GM_2020[data2020$GM_2020 == 'Muy bajo'] <- 'Very low'
data2020$GM_2020[data2020$GM_2020 == 'Bajo'] <- 'Low'
data2020$GM_2020[data2020$GM_2020 == 'Medio'] <- 'Medium'
data2020$GM_2020[data2020$GM_2020 == 'Alto'] <- 'High'
data2020$GM_2020[data2020$GM_2020 == 'Muy alto'] <- 'Very high'

```

And we create a vector named levels, which will help us arrange our data according to the level of Marginalization in columns GM_2010 and GM_2020.

```{r Level’s vector}
levels=c("Very low", "Low", "Medium", "High", "Very high")
```

Now that we changed the values in columns GM_2010 and GM_2020 for both of our data frames and created the vector levels, we filter the municipalities of Sonora in function of levels of marginalization, creating two new objects: municipal_sonora_2010 and municipal_sonora_2020.

```{r filtered data according to SEI level}

municipal_sonora_2010 <- data2010 %>%
filter(NOM_ENT=="Sonora") %>% mutate(GM_2010=factor(GM_2010,levels))
municipal_sonora_2010 %>% group_by(GM_2010) %>% summarise(n())
municipal_sonora_2010

municipal_sonora_2020 <- data2020 %>%
filter(NOM_ENT=="Sonora") %>% mutate(GM_2020=factor(GM_2020,levels))
municipal_sonora_2020 %>% group_by(GM_2020) %>% summarise(n())
municipal_sonora_2020

```

This is the data we’ll plot in one interactive map. It is necessary to assure we can merge the shape file of the municipalities of Sonora with our filtered data. To do so, we rename the column CVE_MUN to CVEGEO. The motive of this indication will be clarified when managing our shape file for the map.

```{r Data merge}

municipal_sonora_2010 <- rename(municipal_sonora_2010, CVEGEO = CVE_MUN)
municipal_sonora_2020 <- rename(municipal_sonora_2020, CVEGEO = CVE_MUN)

```

Download and set out of shape file

Next, we need to download the shape file of Mexico and the State of Sonora in the following link. Remember to save them in your local directory or RProject carpet.

https://www.inegi.org.mx/contenidos/productos/prod_serv/contenidos/espanol/bvinegi/productos/geografia/marcogeo/889463807469/26_sonora.zip

The national shape file is needed to create the national cartography, from which we’ll identify the State of Sonora using the first two values of the geographical id, also known as CVGEO (this variable will be used to merge the shape file with the municipal_sonora_2010 and municipal_sonora_2020 data frames).

```{r Setting of shape files}

mun_base<- readOGR(dsn = "00mun.dbf", layer = "00mun")
ent <- readOGR(dsn = "00ent.dbf", layer = "00ent")
mun_base@data$id_ent <- substr(mun_base@data$CVEGEO,1,2)

```

Subsequently, we create the shape object for the two periods, specifying the corresponding longitude and latitude of Sonora with the command CRS.

```{r Creating the shape object}

# 2010
mun_Son_2010 <- mun_base[mun_base@data$id_ent=="26",]
mun_Son_2010 <- spTransform(mun_Son_2010,
CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))

# 2020
mun_Son_2020 <- mun_base[mun_base@data$id_ent=="26",]
mun_Son_2020<- spTransform(mun_Son_2020,
CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))

```

Next, we merge the filtered data with the shape objects.

```{r Merging data with shape files}

# 2010
mun_Son_2010 <- mun_Son_2010 %>% merge(municipal_sonora_2010)
# 2020
mun_Son_2020 <- mun_Son_2020 %>% merge(municipal_sonora_2020)

```

Once the shape files are merge with our data, we can move on to the map customizations.

Map colors and popups

A diverging color palette, from lighter to darker, is usually used to visualize different degrees of a variable on a map: “if your story emphasizes the highest (=darkest) values, go for a sequential color scale; if your story is about the lowest and highest values, go for a diverging scale” (Muth, 2021). This is the case of the Marginalization Index Levels. To create a color palette with such characteristics, linked to the vector levels created previously, we run the following two lines of code.

```{r Color palette}

Colors <- carto_pal(5, "TealRose")
margpal <- colorFactor(Colors, levels=c("Very low", "Low", "Medium", "High", "Very high"), na.color =alpha("#e8e6e6", 0))

```

After the color palette is set, we customize the popups. These are sets of information that pop up of a figure or polygon when clicking it; it’s one of the main features that make a map interactive. The following lines of code reflect the popups that will appear when clicking on the municipalities of Sonora, showing some of the features mentioned earlier (Municipality name, total population, Marginalization Level, etc.). The code is written using HTML, specifying the name of each feature that the pop up will show, in function of the variables selected from our data frames data2010 and data2020.

```{r popup 2010 y 2020}

popup_2010 <- paste0(
"<b>", "Municipality: ", "</b>", as.character(mun_Son_2010$NOM_MUN),"<br>",
"<b>", "Marginalization level: ", "</b>", as.character(mun_Son_2010$GM_2010), "<br>",
"<b>", "Total population: ", "</b>", prettyNum(as.numeric(mun_Son_2010$POB_TOT), big.mark=",", preserve.width="none"), "<br>",
"<b>", "illiterate percentage of people 15-year-old or older: ", "</b>", round(mun_Son_2010$ANALF,1),"%", "<br>",
"<b>", "percentage of people 15-year-old or older without basic education: ", "</b>", round(mun_Son_2010$SBASC,1),"%","<br>",
"<b>", "percentage of occupants in inhabited private homes without drainage or toilets: ", "</b>", round(mun_Son_2010$OVSDE,1), "%","<br>",
"<b>", "percentage of occupants in inhabited private homes without electricity: ", "</b>", round(mun_Son_2010$OVSEE,1), "%", "<br>",
"<b>", "percentage of occupants in inhabited private homes without piped water: ", "</b>", round(mun_Son_2010$OVSAE,1), "%","<br>",
"<b>", "percentage of occupants in inhabited private homes with a dirt floor: ", "</b>", round(mun_Son_2010$OVPT,1), "%","<br>",
"<b>", "percentage of private homes with overcrowding: ", "</b>", round(mun_Son_2010$VHAC,1), "%","<br>") %>%
lapply(htmltools::HTML)


popup_2020 <- paste0(
"<b>", "Municipality: ", "</b>", as.character(mun_Son_2020$NOM_MUN),"<br>",
"<b>", "Marginalization level: ", "</b>", as.character(mun_Son_2020$GM_2020), "<br>",
"<b>", "Total population: ", "</b>", prettyNum(as.numeric(mun_Son_2020$POB_TOT), big.mark=",", preserve.width="none"), "<br>",
"<b>", "illiterate percentage of people 15-year-old or older: ", "</b>", round(mun_Son_2020$ANALF,1),"%", "<br>",
"<b>", "percentage of people 15-year-old or older without basic education: ", "</b>", round(mun_Son_2020$SBASC,1),"%","<br>",
"<b>", "percentage of occupants in inhabited private homes without drainage or toilets: ", "</b>", round(mun_Son_2020$OVSDE,1), "%","<br>",
"<b>", "percentage of occupants in inhabited private homes without electricity: ", "</b>", round(mun_Son_2020$OVSEE,1), "%", "<br>",
"<b>", "percentage of occupants in inhabited private homes without piped water: ", "</b>", round(mun_Son_2020$OVSAE,1), "%","<br>",
"<b>", "percentage of occupants in inhabited private homes with a dirt floor: ", "</b>", round(mun_Son_2020$OVPT,1), "%","<br>",
"<b>", "percentage of private homes with overcrowding: ", "</b>", round(mun_Son_2020$VHAC,1), "%","<br>") %>%
lapply(htmltools::HTML)

```

Now that we installed and set the necessary packages, downloaded and arranged our tidy data on a shape object, and set the color palette and pop ups, we can finally move on to the main section of this article: plotting an interactive map.

Interactive map plotting

Creating a map on RStudio can range from simply show a shape file or polygon to add as many layers as desired. Depending on the phenomenon of study, this decision is made by the author. However, because of its features, an interactive map needs more lines of code than an average one. Below is the code we’ll use to create the maps for both periods. For motives of length, instead of describing on a paragraph the function of every layer, they are shown at the end of each line of code as a comment. Notice how the core of these lines is the two addPolygons layers, which add the mun_Son_2010 and mun_Son_2010 shape objects.

```{r Marginalization index level map by municipality in Sonora 2010-2020}

map_mun_marg_2010_2020 <- leaflet(mun_Son_2010) %>%
addProviderTiles(providers$CartoDB.Voyager) %>% # map backgradoun color
addPolygons(data= mun_Son_2010, # data layer
stroke= TRUE, # Border lines
weight=0.5, # Border lines thicness
opacity=1, # Opacity of border lines
color= "black", # Boder line color
fillColor = ~margpal(mun_Son_2010$GM_2010), # Use of color palette
fillOpacity = 0.6, # Background opacity
smoothFactor = 1, # Smooth factor for border lines
highlightOptions = highlightOptions(color = "black", # Highlight the polygons where mouse is set
weight = 1.2,
bringToFront = TRUE),
popup = popup_2010, # Popup
popupOptions = labelOptions(noHide = F, direction = "auto", closeOnClick = TRUE, # popup options
style = list( # popup characteristics
"color" = "black",
"font-family" = "Arial",
"font-style" = "regular",
"box-shadow" = "2px 2px rgba(0,0,0,0.25)",
"font-size" = "8px",
"border-color" = "rgba(0,0,0,0.5)"
)),
group= "2010") %>% # Name of the menu option. This would be more usefeull when adding more data layers.
addPolygons(data= mun_Son_2020, # Layer of with data from 2020
stroke= TRUE, # Border lines
weight=0.5, # Border lines thicness
opacity=1, # Opacity of border lines
color= "black", # Boder line color
fillColor = ~margpal(mun_Son_2020$GM_2020), # Use of color palette
fillOpacity = 0.6, # Background opacity
smoothFactor = 1, # Smooth factor for border lines
highlightOptions = highlightOptions(color = "black", # Highlight the polygons where mouse is set
weight = 1.2,
bringToFront = TRUE),
popup = popup_2020, # 2020 Popup
popupOptions = labelOptions(noHide = F, direction = "auto", closeOnClick = TRUE, # popup options
style = list( # popup characteristics
"color" = "black",
"font-family" = "Arial",
"font-style" = "regular",
"box-shadow" = "2px 2px rgba(0,0,0,0.25)",
"font-size" = "8px",
"border-color" = "rgba(0,0,0,0.5)"
)),
group= "2020") %>% # Menu option for 2020
addLegend(position = "bottomleft", pal = margpal, values = ~mun_Son_2010$GM_2010, opacity=1, group= "Social Exlusion Index", # Legend
title = "MARGINALIZATION INDEX LEVEL IN SONORA; <br>CONAPO, 2010-2020<br>(click on the area of interest for more info.)", na.label = "No aplica") %>%
addLayersControl(
baseGroups = c("2010", "2020"),
options = layersControlOptions(collapsed = FALSE, position = "bottomleft"))

map_mun_marg_2010_2020

```

As an extra, we add this next line at the end of the previous map code to save them as an HTML file in our local directory or RProject carpet.


```{r HTML file save}

saveWidget(map_mun_marg_2010_2020,"marg_Son_2010_2020.html", title= "Marginalization Index in Sonora: 2010-2020", selfcontained = T, libdir = "lib")

```

Our map should look as shown below.

Figure 1: Marginalization Index Level by municipality in Sonora, 2010–2020. View when selecting the “2010” group. Own elaboration with R.
Figure 2: Marginalization Index Level by municipality in Sonora, 2010–2020. View when selecting the “2020” group. Own elaboration with R.

And, when clicking on a municipality’s polygon, it should look like this:

Figure 3: Marginalization Index Level by municipality in Sonora, 2020. Example when clicking over a municipality polygon to show a popup. Own elaboration with R.

Despite few, there are changes over time on Marginalization Levels along the municipalities of Sonora. For example, in the north, Santa Cruz, Arizpe and Bacoachi transcended from a low to a very low level of marginalization, but Trincheras, Tubutama and Saric maintained their position on the low level. Also, on the centered of the State, while Cururpe’s level passed from medium to low, San Miguel de Horcacitas level increased from low to medium. And, at the south-east, Yécora, Rosario and Alamos kept their respective levels on medium, as well as Quiriego on high.

Even though the previous paragraph was just a descriptive analysis, this kind of maps can help to see the changes not only on matters of marginalization, but education, basic services, natural resources, commerce, highway maintenance, natural area preservation and even violence or drug trafficking. Mapping over time this kind of events can develop better understanding of a region to apply national and subnational public policy to increase the well-being of the population.

If interested on data mapping and visualization using R, I’d encourage the reader to seek the following courses online: Data Science: Visualization on Edx by Harvardx (https://learning.edx.org/course/course-v1:HarvardX+PH125.2x+3T2022/home) and Advance Data Visualization with R on Coursera, offered by John Hopkings University (https://www.coursera.org/learn/jhu-advanced-data-visualization-r).

References

· Armando Moreno, L. (n.d.). GitHub — dogomoreno/SonoraMarginacion: Mapa coroplético interactivo de las agebs urbanas de Sonora según su grado de marginación. Calculado por CONAPO a partir de la información del Censo 2020 de INEGI. GitHub. Retrieved November 28, 2022, from https://github.com/dogomoreno/SonoraMarginacion

· Color Use Guidelines. (n.d.). Faculty of Science — Charles University. https://web.natur.cuni.cz/%7Elanghamr/lectures/vtfg1/mapinfo_2/barvy/colors.html

· CONAPO. (2012). Índice de marginación por localidad. In CONAPO (№978–607–427–128–7). Comisión Nacional de Población. http://www.conapo.gob.mx/work/models/CONAPO/indices_margina/2010/documentoprincipal/Capitulo01.pdf

· Leaflet — an open-source JavaScript library for interactive maps. (n.d.). Leafletjs. from https://leafletjs.com/

· Leaflet for R — Introduction. (n.d.). https://rstudio.github.io/leaflet/

· Muth, L. C. (2021, March 22). When to use sequential and when to use diverging color scales. Datawrapper Blog. https://blog.datawrapper.de/diverging-vs-sequential-color-scales/

· National Center for Ecological Analysis and Synthesis. (n.d.). Overview of Coordinate Reference Systems (CRS) in R. https://www.nceas.ucsb.edu/sites/default/files/202004/OverviewCoordinateReferenceSystems.pdf

· Read Excel Files. (n.d.). https://readxl.tidyverse.org/

· Tidyverse. (n.d.). https://www.tidyverse.org/

· Tools for HTML. (n.d.). https://rstudio.github.io/htmltools/

--

--

Santiago Robles Tamayo
MCD-UNISON

Economist | Enrolled student at the Master of Data Science Program of the University of Sonora