OECD Stats Website: Your Go-To for Comprehensive Statistics datasets on OECD Countries

Akshay Bagal
4 min readOct 6, 2023

--

Introduction

The OECD Stats website is a comprehensive repository of statistical data on OECD countries and selected non-member economies. It provides users with access to a wide range of data on economic, social, and environmental indicators, covering topics such as GDP, employment, education, health, and more. The website also includes a data warehouse, which allows users to create their own queries and download data in a variety of formats.

Data Warehouse

The OECD Stats data warehouse is a powerful tool that allows users to access and analyze data on OECD countries and selected non-member economies in a variety of ways. The data warehouse contains data from over 200 different OECD databases, covering a wide range of topics. Users can create their own queries to select the data they need, and they can download the data in a variety of formats, including CSV, Excel, and JSON.

Kind of Data Available

The OECD Stats data warehouse contains data on a wide range of economic, social, and environmental indicators. Some of the most popular data sets include:

  • GDP and national accounts
  • Employment and labor markets
  • Education and skills
  • Health and well-being
  • Environment and sustainable development
  • Trade and investment
  • Innovation and science

Export Option Using API

The OECD Stats website also provides an API that allows users to export data from the data warehouse. The API is a simple and easy-to-use tool that allows users to download data in a variety of formats, including CSV, Excel, and JSON.

Use Cases for the OECD Stats Data Warehouse

The OECD Stats data warehouse can be used for a variety of purposes, such as:

  • Conducting research on economic, social, and environmental issues
  • Benchmarking performance against other OECD countries
  • Developing and implementing policies
  • Creating data visualizations and presentations

Let’s take annual inflation data export for an example:

We get a couple of options before

  1. Customize: You can customize the selection (frequency (yearly- monthly- quarterly), subject and countries) and Layout properties.
  2. Export: Once you decide the frequency of the data and list of countries, you can click on export to see the available options. (Excel, CSV, PC-Axis, Developer API, XML.
  3. My Queries: You can save the queries/filters that you use frequently in this section.

When you click on Developer API, you’ll get a pop-up to generate the query to run via REST API.

When you click on Generate API queries, you will get 2 links. First one for Data query and another for Structure query.

Copy the Data Query and use it in the Python script for OECD API as follows:

import requests

url = "https://stats.oecd.org/SDMX-JSON/data/PRICES_CPI/AUS+AUT+BEL+CAN+CHL+COL+CRI+CZE+DNK+EST+FIN+FRA+DEU+GRC+HUN+ISL+IRL+ISR+ITA+JPN+KOR+LVA+LTU+LUX+MEX+NLD+NZL+NOR+POL+PRT+SVK+SVN+ESP+SWE+CHE+TUR+GBR+USA+EU27_2020+EA20+G-7+OECDE+G-20+OECD+NMEC+ARG+BRA+CHN+IND+IDN+RUS+SAU+ZAF.CPALTT01+CP18ALTT.GY.M/all?startTime=2015-01&endTime=2023-09&dimensionAtObservation=allDimensions"

response = requests.request("GET", url)

We get a json response for the data request call. It’s more tricky to flatten this data into a flat-table. We have to see the schema of the json response and create a separate script to flatten the json response.

for ex.

data = response.json()
data_values = data['dataSets'][0]['observations']
dimensions = data['structure']['dimensions']['observation']

id_to_name_mappings = {
dim['name']: {item['id']: item['name'] for item in dim['values']}
for dim in dimensions
}

dimension_values = [dim['values'] for dim in dimensions]

def get_id_from_index(dim_index, index):
return dimension_values[dim_index][index]['id']

def map_id_to_name(dim_name, id):
return id_to_name_mappings[dim_name].get(id, id)

rows = []
for key, value in data_values.items():
indices = key.split(':') # Split keys into separate dimension indices
country = map_id_to_name('Country', get_id_from_index(0, int(indices[0])))
subject = map_id_to_name('Subject', get_id_from_index(1, int(indices[1])))
measure = map_id_to_name('Measure', get_id_from_index(2, int(indices[2]))) # Corrected typo here
frequency = map_id_to_name('Frequency', get_id_from_index(3, int(indices[3])))
time = map_id_to_name('Time', get_id_from_index(4, int(indices[4])))
data_value = value[0] # Extract the data value
rows.append([country, subject, measure, frequency, time, data_value]) # Append the row to the list of rows

df = pd.DataFrame(rows, columns=['Country', 'Subject', 'Measure', 'Frequency', 'Time', 'Value'])
display(df)

FYI: The script for flattening might differ for different datasets depending on the layout of data export.

The OECD Stats API documentation provides more information on how to use the API to export data.

Conclusion

The OECD Stats website is a valuable resource for anyone who needs access to comprehensive statistics on OECD countries and selected non-member economies. The website’s data warehouse is a powerful tool that allows users to access and analyze data in a variety of ways. The API also makes it easy to export data from the data warehouse.

--

--

Akshay Bagal

SSDE at Coforge, love talking about python and almost everything around that :-)