2023 Government AI Readiness Index: data visualization using choropleth maps in ggplot2.

Kavengik
4 min read · May 3, 2024

While doing some AI research, I came across the 2023 Government AI Readiness Index by Oxford Insights. I found it quite informative for understanding the global AI landscape. The report has some useful visualizations accompanying the narrative. One visualization that I felt would have been a wonderful addition to the report is a choropleth map. Therefore, the objective of this article is to create a choropleth map that visualizes the AI readiness scores in the report using ggplot2 in R.

Photo by Artem Beliaikin on Unsplash

1. Load libraries

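For this tutorial, we shall be using the following libraries in R: dplyr, ggplot2, countrycode and readxl. The map_data() function used in step 4 comes from ggplot2, but it also needs the maps package to be installed.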

library(dplyr)        # data manipulation
library(ggplot2)      # choropleth visualization
library(countrycode)  # getting country codes for country data
library(readxl)       # read Excel data file into R

2. Load data

We shall use the 2023 Government AI Readiness Index data from Oxford Insights, which can be accessed here. I used the code below to read the dataset into my Kaggle notebook.

data <- read_excel('/kaggle/input/government-ai-readiness-index/2023-Government-AI-Readiness-Index-Public-Indicator-Data.xlsx')
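
When no sheet is specified, read_excel() reads the first sheet of a workbook. If the report's workbook has more than one sheet, it is worth listing them before reading. Below is a minimal sketch of that check. It uses the example workbook bundled with readxl as a stand-in, so the path and the 'mtcars' sheet name are purely illustrative and are not the sheets of the Oxford Insights file.

```r
library(readxl)

# Stand-in workbook bundled with readxl; swap in the path to the report's file
path <- readxl_example("datasets.xlsx")

# List the sheet names before deciding which one to read
sheets <- excel_sheets(path)
print(sheets)

# Read a specific sheet by name instead of relying on the first-sheet default
example_sheet <- read_excel(path, sheet = "mtcars")
dim(example_sheet)
```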

3. Data Overview

We begin with an overview of the dataset: the first six observations (the default for head()), the dimensions and the data structure.

head(data)  # first six observations
dim(data)   # data dimensions
str(data)   # data structure
data overview

4. Load global map data

We load the global map data within R. This data frame contains the global geospatial data that is essential for plotting a choropleth map of the world. It holds the latitude and longitude coordinates that outline the regions of the world, as seen below.

mapData <- map_data('world')  # coordinates of countries
head(mapData)                 # first six observations of the map data
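
Each region in this data frame is stored as one or more polygons, and the group column identifies which polygon a coordinate belongs to. That is why group is mapped when the map is drawn. Here is a quick sketch for inspecting this; the names world and polygon_counts are only illustrative.

```r
library(dplyr)
library(ggplot2)   # map_data() also needs the maps package installed

world <- map_data("world")

# Columns returned by map_data(): long, lat, group, order, region, subregion
names(world)

# Count the polygons and coordinate points that make up each region
polygon_counts <- world %>%
  group_by(region) %>%
  summarise(n_polygons = n_distinct(group), n_points = n()) %>%
  arrange(desc(n_polygons))

head(polygon_counts)
```

Regions made of many islands have many polygons. Without the group aesthetic, ggplot2 would connect their points as if they formed a single outline.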

5. Creating a merging variable

To create the choropleth map, we first need to merge the two data frames, so that each country in the AI readiness report is paired with its respective geospatial data. The two data frames do not always spell country names the same way (the map data, for example, labels the United States as 'USA' and the United Kingdom as 'UK'), so merging on country name alone can leave countries unmatched. A more consistent variable to merge on is the country code.

Hence, the first step is to create a country code variable in each of the data frames. To do this we use the countryname() function from the countrycode package and create a variable ‘country_code’ by converting the country name to its three-letter country code (ISO 3166-1 alpha-3).

# i. Create a country code variable (country_code)
data <- data %>% mutate(country_code = countryname(Country, destination = 'iso3c'))
mapData <- mapData %>% mutate(country_code = countryname(region, destination = 'iso3c'))

#ii.Confirm creation of country code variable in each of the data frames
head(data)
head(mapData)
data: first 6 observations
mapData data frame: first 6 observations
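
Names that countryname() cannot match are returned as NA, so it is worth checking for them before merging. The sketch below shows that check on a few illustrative names. The vector example_names, including the made-up entry 'Atlantis', is an assumption for demonstration and does not come from the report.

```r
library(countrycode)

# Illustrative names only: two spellings of the same countries plus a made-up one
example_names <- c("United States of America", "USA",
                   "United Kingdom", "UK", "Atlantis")

# Convert to ISO3 codes; names that cannot be matched come back as NA
example_codes <- suppressWarnings(
  countryname(example_names, destination = "iso3c")
)

# Pair each name with its code to inspect the conversion
data.frame(name = example_names, code = example_codes)

# Names that would still need fixing by hand before the merge
unmatched_names <- example_names[is.na(example_codes)]
unmatched_names
```

Applied to the tutorial's data, the same check would be data$Country[is.na(data$country_code)].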

6. Merging the two data frames

We then merge the two data frames using the country code variable. The order of the data frames matters in a left join. As our interest is plotting AI index scores, we want each country in the report’s data frame to be paired with its respective geospatial data. The left join keeps every country in the report and attaches the matching rows of mapData, so the merged data frame has one row per map coordinate. Any country in the mapData frame with no observations in the report is omitted.

ai_index_scores <- left_join(data, mapData, by = 'country_code')  # merged data frame
head(ai_index_scores)  # first six observations
dim(ai_index_scores)   # dimensions of the data frame
ai_index_scores data frame: first 6 observations
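
To see what the left join keeps and drops, here is a small sketch on toy data. The data frames scores and shapes and all their values are hypothetical stand-ins for the report and the map data. It also drops rows without a country code first, because dplyr joins match NA to NA by default.

```r
library(dplyr)

# Toy stand-ins for the report data and the map data (hypothetical values)
scores <- data.frame(country_code = c("KEN", "FRA", NA),
                     Total = c(40, 75, 50))
shapes <- data.frame(country_code = c("KEN", "KEN", "BRA", NA),
                     long = c(36, 37, -50, 0),
                     lat = c(-1, 0, -10, 0))

# Drop rows without a code so that NA is not matched to NA in the join
scores_ok <- filter(scores, !is.na(country_code))

# Left join: every scored country is kept, once per matching coordinate row
joined <- left_join(scores_ok, shapes, by = "country_code")
joined

# Scored countries with no geospatial match (they would not be drawn)
unmatched <- anti_join(scores_ok, shapes, by = "country_code")
unmatched
```

In this toy example, 'BRA' is absent from the result because it has no score, while 'FRA' is kept with missing coordinates because it has no shape.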

7. Creating the choropleth map

We are now ready to create the choropleth map. We use ggplot2 to achieve this, with the code below.

# Basic map
ggplot(ai_index_scores, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Total))

8. Next, we make some modifications

We make some modifications to the following attributes: color, title, subtitle, legend appearance, axis text, axis title and panel background appearance. The explanations for the additional code sections are given as comments, which are preceded by the hash symbol (#). Below is the code.

# Modified map
modified <- ggplot(ai_index_scores, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Total)) +
  labs(title = '2023 Government AI Readiness Index',
       subtitle = 'Scores') +
  scale_fill_gradient(low = 'yellow',            # yellow = low scores
                      high = 'red') +            # red = high scores
  theme(axis.title = element_text(size = 20),
        axis.title.x = element_blank(),          # remove x-axis title (long)
        axis.title.y = element_blank(),          # remove y-axis title (lat)
        axis.text.x = element_blank(),           # remove x-axis text
        axis.text.y = element_blank(),           # remove y-axis text
        panel.background = element_blank(),      # make background blank
        legend.position = 'bottom',              # position legend at the bottom
        legend.key.width = unit(5, 'cm'),        # adjust width of legend
        legend.key.height = unit(1, 'cm'))       # adjust height of legend
modified                                         # display the map
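
Because the merge keeps only countries that appear in the report, any country without a score is left out of the map entirely. One possible variation, not the approach used above, is to put the map data on the left of the join so that every polygon is kept, and to draw unscored countries in grey. The sketch below assumes a small hypothetical toy_scores data frame in place of the report data.

```r
library(dplyr)
library(ggplot2)      # map_data() also needs the maps package installed
library(countrycode)

# Hypothetical scores for three countries, standing in for the report data
toy_scores <- data.frame(country_code = c("KEN", "FRA", "BRA"),
                         Total = c(40, 75, 60))

# Map data with ISO3 codes; unmatched regions get NA and only raise warnings
world <- suppressWarnings(
  map_data("world") %>%
    mutate(country_code = countryname(region, destination = "iso3c"))
)

# Map data on the left keeps every polygon, scored or not
world_scores <- left_join(world, toy_scores, by = "country_code")

full_map <- ggplot(world_scores, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Total)) +
  scale_fill_gradient(low = "yellow", high = "red",
                      na.value = "grey85") +   # grey for countries without a score
  coord_quickmap()                             # keep a sensible aspect ratio
full_map
```

The theme() and labs() settings from the modified map can be added to this plot in the same way.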

9. Link: Kaggle notebook

End of tutorial. Happy coding!
