Interactive Map using Leaflet in R

Ajay Rao
5 min readSep 30, 2022

--

Description

This article takes us through an example of plotting Canada census population data by Forward Sortation Area (FSA). A FSA is the first 3-digit of postal codes. Refer to FSA Description for further information.

We need 3 sets of info for generating these plots:

You will notice that the shapefile and population data is from 2016. The most recent data available from Stats Canada at Forward Sortation Area level is from 2016.

Import Necessary Libraries

if (!require(pacman)) install.packages(“pacman”) library(pacman) p_load(data.table, leaflet, rgdal, sp, htmltools, pdftools, tidyverse, scales)

Using the p_loadfunction from the pacman library to load packages installs the libaries if they are not from CRAN repository.

The leaflet package plots the interactive map. rgdal is required to read the .shp file into R environment. sp converts data.frame or data.table into a spatial polygons data frame which is required by leaflet to create the plots.

Will introduce other packages when they are called into action.

Import the ShapeFile

## Download the zip file if (!”tFile” %in% list.files("Data/")) 
{
download.file(“https://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/files-fichiers/2016/lfsa000b16a_e.zip", “Data/tFile”)
}
## Extract the .shp file if (!”lfsa000b16a_e.shp” %in% list.files("Data/"))
{
unzip(“Data/tFile”)
}
## Load Shape File into R
shpDF <- readOGR(dsn=”Data/lfsa000b16a_e.shp”)

We check if the .zip file from Stats Canada containing .shp file is already downloaded. If not, we download the file and unzip the .shp file.

Once unzipped, the .shp file is read into R environment using the readOGR function from rgdal package.

Load Population Data

## Import .csv file using “fread” 
popDT <- fread(“Data/T1201EN.CSV”, nrows = 1642)
## We drop the frist row → this corresponds to Total Canada
## and the fourth column → this column doesn’t have any data
popDT <- popDT[-1,-4]
## Give easier to manage column names
colnames(popDT) <- c(“FSA”, “GEOName”, “Province”, “Population”, “Dwellings”, “DwellingsOccupied”)

The population data (“T1201EN.CSV”) is sourced from Statistics Canada, Catalogue no. 98–402-X2016001.

Import FSA Names from Canada Post PDF

## Download the pdf file 
if (!”fsa-list-april-2022.pdf” %in% list.files("Data/"))
{
download.file(“https://www.canadapost-postescanada.ca/cpc/doc/en/support/kb/nps/fsa-list-april-2022.pdf", “Data/fsa-list-april-2022.pdf”)
}
## Read the PDF file
pdfPageList <- pdf_text(“Data/fsa-list-april-2022.pdf”) %>% str_split(“\n”)
## The first page doesn’t contain any FSAs so we skip it
## The top 11 lines are headers
## The last two lines are footers
for(i in 2:15)
{
pdfPageList[[i]] <- pdfPageList[[i]][11:(length(pdfPageList[[i]]) — 2)]
}
rm(i)
## Convert the List to Table
myFunc <- function(x)
{
## This function converts list to data.table
return(data.table(x))
}
pdfPageTable <- do.call(“rbind”,lapply(pdfPageList[2:15], myFunc)) rm(pdfPageList) ## Split text into individual columns
### If there are 4 consecutive white space, we consider that as a separator
pdfPageTable[,paste0(“c”, seq(1, 4)):=tstrsplit(x, “\\s{4, }”)]
### Drop the original column
pdfPageTable <- pdfPageTable[,!”x”,with=F]
## Extract FSA codes and FSA Names
myFunc2 <- function(x, DT)
{
# For each column passed to this function
# First 3 characters are considered as FSA code
# The text starting from 4 character onwards is considered FSA Name
# The FSA codes and Names are stored as a data.table and returned
DT2 <- DT[,x,with=F]
DT2 <- DT2[!is.na(get(x)) & !(get(x) == “SUCC BUREAU-CHEF”) & !get(x)==””,]
DT2$FSA <- substr(trimws(DT2[[x]]), 1, 3)
DT2$FSAName <- trimws(substr(trimws(DT2[[x]]), 4, nchar(DT2[[x]])))
DT2 <- DT2[,!x,with=F] return(DT2)
}
modTable <- do.call(“rbind”, lapply(colnames(pdfPageTable), myFunc2, DT = pdfPageTable))
rm(pdfPageTable)

A person reading the map should be able to meaningfully identify a Forward Sortation Area on the map. The 3 digit FSA codes in the shapefile and the population data file don’t help with this.

Fortunately, Canada Post has supplied names to each 3 digit FSA code in the “fsa-list-april-2022.pdf”. The tables are read into R using the pdftools package. As the columns are not aligned, some additionally processing is required to get the FSA codes and names in a tabular format.

Merge Population Data with Shape File and Canada Post FSA names

## Merge the Shape File with Population Data 
shpPopDT <- merge(
x = shpDF[shpDF$PRNAME == “Ontario”,],
y = popDT,
by.x = “CFSAUID”,
by.y = “FSA”,
all.x = T
)
## Delete the shpDF to reduce memory space
rm(shpDF, popDT)
gc()
## Merge the FSA Names
shpPopDT <- merge(
x = shpPopDT,
y = modTable,
by.x = "CFSAUID",
by.y = "FSA",
all.x = T
)
rm(modTable)
## Convert the Merged File into a Shape File format
shpPopDT <- spTransform(shpPopDT, CRS("+init=epsg:4326"))
## Store a copy as a data.table to create labels
popDT <- data.table(shpPopDT@data)
popDT$Population <- comma(popDT$Population)
popDT$Dwellings <- comma(popDT$Dwellings)

We bring all the information together by merging the datasets.

Points to note:

  • We are keeping data for “Ontario” province only. This is done to reduce the file size while publishing the document to Quarto. However, the shiny app (link towards the end of this article) has all provinces.
  • The merged data frame has to be converted to spatial polygons data frame for leaflet to be able to plot the map. This is achieved using the spTransform function from the sp package.

Plot an Interactive Map Using Leaflet

## labels 
labs <- lapply(seq(nrow(popDT)), function(i)
{
paste0( ‘<p><b>’, popDT[i, “FSAName”], ‘</b></p><p>’, “<i>Population: </i>”, popDT[i, “Population”],’</p><p>’, “<i>Dwellings: </i>”, popDT[i, “Dwellings”], ‘</p>’ )
})
## Palette
pal <- colorNumeric(
palette = “Reds”,
domain = shpPopDT$Population)
## Plotting the map leaflet(shpPopDT) %>%
addTiles() %>%
setView(-84.81, 49.74, zoom = 5) %>%
addPolygons( data = shpPopDT,
fillColor = ~pal(Population), weight = 2, opacity = 1,
color = “white”, dashArray = “3”, fillOpacity = 0.7,
highlightOptions = highlightOptions( weight = 5,
color = “#666”, dashArray = “”, fillOpacity = 0.7,
bringToFront = TRUE), label = lapply(labs, HTML),
labelOptions = labelOptions(
style = list(“font-weight” = “normal”,
padding = “3px 8px”),
textsize = “15px”, direction = “auto”) ) %>%
addLegend(pal = pal, values = ~Population,
opacity = 0.7, title = “Population”, position = “bottomleft”)

The labels are formatted using html codes and are rendered using the HTML function from the htmltools package.

I have used “Red” color palette. There are other palettes available to use. Refer to https://www.nceas.ucsb.edu/sites/default/files/2020-04/colorPaletteCheatsheet.pdf for available options.

With few additions to the above, you can also create a Shiny App — https://ajaypratham.shinyapps.io/interactivefsageomapping.

Data files and associated codes, including the one to create Shiny App, can be accessed on Github — https://github.com/ajayrao1983/CanadaGeoMapping.

Hope you found this article helpful and informative. Please leave your thoughts, comments, feedbacks.

--

--

Ajay Rao

Data science enthusiast. Active practioner of continuous improvement and lean six sigma ideology. Firm believer in the concept of learning and sharing.