A Dive into Economic Data Using World Bank Databases in R

Dima Diachkov
Published in Data And Beyond
5 min read · Aug 13, 2023

A couple of days ago I noticed that everyone in the news was talking about inflation this year and next, which made me wonder whether I should show my readers how to extract macro data from the World Bank API in R. So today we’re going to embark on an exciting journey into the world of economic data again. Sounds thrilling, right? 🚀

This is part #33 of the “R for Applied Economics” guide, where we collectively explore various depths of R, data science, and financial and economic analysis.

Credits: Unsplash | Ben White

Introduction to World Bank Data 🏦

You have probably heard of the World Bank (if you have not, no worries, check this out: https://www.worldbank.org/en/home).

In short, the World Bank is not a bank in the conventional sense. It is more like a fund, or a huge and kind sponsor. The World Bank offers financial support, guidance, and additional resources to developing nations, focusing on sectors such as education, public safety, health, and other essential needs, all in the name of development.

Because of this, the World Bank has accumulated a lot of data about different countries. It has also made a kind donation to us economists, and to society at large, by providing open access to a plethora of economic and financial indicators that are vital for researchers, policymakers, and data analysts. From GDP to inflation rates, this data is a treasure trove for anyone looking to understand the global economy.

Just for reference, here is my earlier article about working with the ECB SDW database — https://medium.com/@the_lord_of_the_R/how-to-fetch-multiple-economic-datasets-from-the-ecb-database-with-r-feba4004a4c5

The WDI Package: your magic orb🧙‍♂️

The WDI (World Development Indicators) package in R is like a magic wand for working with World Bank data. It lets you search for and fetch that data with ease, ready to be manipulated and visualized with your usual R tools. Let’s get started by installing it:

install.packages("WDI")
library(WDI)

You will also need the World Bank Data website to browse datasets that are new to you: https://data.worldbank.org/

Fetching your first dataset

You probably know that GDP represents the total value of goods and services produced within a country, while “per capita” means “per person”. Let’s fetch the GDP per capita data:

# first try
data.gdp = WDI(indicator="NY.GDP.PCAP.CD")
data.gdp
Output for the code above

Simple, right? We have ISO country codes, country names, and the value of the indicator for each available year.
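One thing to watch for in a full download like this: besides individual countries, the World Bank also reports regional and income-group aggregates. Here is a minimal sketch of one way to filter them out. It assumes the `extra = TRUE` argument of `WDI()` adds a `region` column in which aggregates are labelled "Aggregates"; the object names are mine and purely illustrative.

```r
library(WDI)

# extra = TRUE asks WDI to append country metadata
# (region, income group, and so on) to every row.
gdp_extra <- WDI(indicator = "NY.GDP.PCAP.CD", extra = TRUE)

# Keep individual countries with a reported value.
# Note: subset() also drops rows where region itself is missing.
gdp_countries <- subset(
  gdp_extra,
  region != "Aggregates" & !is.na(NY.GDP.PCAP.CD)
)

# Compare the number of rows before and after filtering
nrow(gdp_extra)
nrow(gdp_countries)
```

If the `region` column looks different in your version of the package, inspect `unique(gdp_extra$region)` first and adjust the filter accordingly.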

But how did I know what to request, namely “NY.GDP.PCAP.CD”? All you need to know is which tickers (indicator IDs) to request for your purpose. To get them, just browse the World Bank Data website and go to the details sheet for the dataset you need. Here is an example for the GDP per capita data.

Step 1. Find the dataset you need

Step 2. Open “Details” and get the ticker name (field “ID”)

Just copy this or any other similar ID to your R code and WDI will deliver it to you.
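If you prefer to stay inside R, the same lookup can be sketched with `WDIsearch()`, which searches the indicator list that ships with the package. The search string is treated as a regular expression; the object names below are only illustrative, and the bundled list may lag behind the website.

```r
library(WDI)

# Search indicator names for "GDP per capita".
# as.data.frame() covers older package versions that return a matrix.
hits <- as.data.frame(WDIsearch(string = "GDP per capita", field = "name"))
head(hits)

# Look up one specific ID to confirm what it stands for.
# Dots are escaped because the search string is a regular expression.
exact <- as.data.frame(
  WDIsearch(string = "^NY\\.GDP\\.PCAP\\.CD$", field = "indicator")
)
exact
```

The `indicator` column of the result holds the IDs you can pass straight to `WDI()`.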

Fetching a subset of data

Time filtering

Now we have the whole dataset for GDP per capita by country throughout the available history. That is amazing and we can work with it if we need to. But let me also show you how to specify what years you need to extract.

If we open the WDI documentation, we will see how it works.

WDI description

So we have “start” and “end” arguments that we can use. Let’s just try to extract data for 2022 only.

# time filtering
data.2022 = WDI(indicator="NY.GDP.PCAP.CD", start=2022, end = 2022)
data.2022
Output for the code above

Okay. Now we can filter data right in the API request to minimize execution time. But we still have data points for many countries, and some of them may not be of interest to us…
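The same two arguments also work for a window of several years. Here is a minimal sketch; the 2015 to 2022 window and the object name are arbitrary choices of mine.

```r
library(WDI)

# Request a window of years instead of a single one
gdp_recent <- WDI(indicator = "NY.GDP.PCAP.CD", start = 2015, end = 2022)

# Check which years actually came back
range(gdp_recent$year)

# Years without a reported value come back as NA; drop them if needed
gdp_recent <- gdp_recent[!is.na(gdp_recent$NY.GDP.PCAP.CD), ]
head(gdp_recent)
```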

Country filters

Again, according to the documentation, we can specify country codes. Let’s pretend that we want to focus on EU countries. I have created a separate vector to select only EU countries.

# country filtering
# ISO-2 codes of the 27 EU member states (Greece is "GR" in ISO-2)
eu_country_codes <- c("BE","BG","CZ","DK","DE",
                      "EE","IE","GR","ES","FR",
                      "HR","IT","CY","LV","LT",
                      "LU","HU","MT","NL","AT",
                      "PL","PT","RO","SI","SK",
                      "FI","SE")

data.EU_countries = WDI(indicator="NY.GDP.PCAP.CD",
country=eu_country_codes)
data.EU_countries
Output for the code above

And here we go. We now have only 27 countries left. This is how the main filters work. Let’s quickly plot the result to get a feel for the data.

library(ggplot2)

# one line per EU country: GDP per capita over time
ggplot(data.EU_countries, aes(x = year, y = NY.GDP.PCAP.CD, color = country)) +
  geom_line() +
  ggtitle("GDP per Capita in EU Countries Over Time")
Output for the code above
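The time and country filters can also be combined in a single request. Here is a minimal sketch under a few assumptions: the three-country vector and the 2010 to 2022 window are arbitrary, and the named indicator vector (which renames the value column) relies on behaviour I expect only from reasonably recent versions of the package.

```r
library(WDI)

# A small subset of EU members, ISO-2 codes, for illustration only
eu_sample <- c("DE", "FR", "IT")

# Country and time filters in one call; naming the indicator
# renames the value column from NY.GDP.PCAP.CD to gdp_per_capita
gdp_sample <- WDI(
  indicator = c(gdp_per_capita = "NY.GDP.PCAP.CD"),
  country   = eu_sample,
  start     = 2010,
  end       = 2022
)

head(gdp_sample)
```

A friendlier column name also makes the plotting code easier to read, since you can write `y = gdp_per_capita` instead of the raw ID.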

Conclusion: The World at Your Fingertips 🌐

Basically, that is all you need to start working with the WDI package. You can search, extract, and filter datasets. Whether you’re a researcher, a student, or just curious about the world, World Bank data can provide invaluable insights into the world economy.

Feel free to share your thoughts, questions, or your own experiences with World Bank Data in the comments below. Let’s keep the conversation going.

Please clap 👏 and subscribe if you want to support me. Thanks! ❤️‍🔥

