PostgreSQL Analysis of the World’s Oldest Businesses

Favour Igwezeke
9 min readJan 15, 2024

--

Street Walkthrough in a business district

INTRODUCTION

In the dynamic landscape of global commerce, the longevity of a business stands as a testament to its resilience, adaptability, and strategic foresight. Some enterprises have not only weathered the storms of changing market conditions but have thrived for centuries. BusinessFinancing.co.uk, in a meticulous endeavor, has delved into the archives of economic history, researching and identifying the oldest companies still operational in almost every country. The findings of this comprehensive investigation have been distilled into a dataset that now invites exploration.

The tool used for this analysis was PostgreSQL.

Business Questions We’ll Unravel

  • What are the unique business categories in the dataset?
  • How many businesses are there in each category?
  • Which category has the highest number of businesses?
  • How many unique countries are covered in the dataset?
  • Which continent has the highest number of businesses?
  • What is the distribution of businesses across continents?
  • What is the range of founding years for businesses in the dataset?
  • How many businesses were founded in each decade?
  • What is the average founding year of businesses?
  • What are the 10 oldest businesses in the dataset, and what are their categories and countries?
  • Which categories have the highest percentage of businesses that have survived for over 200 years?
  • How has the average age of businesses in each category changed over the past 100 years?
  • What are the most common business categories within each country?
  • Which countries have the most diverse representation of business categories?
  • Identify any countries where a single category dominates the oldest businesses.
  • Have any business categories consistently produced long-lasting businesses across multiple countries?
  • Which categories have the most significant variation in business longevity across different countries?
  • Which countries have the most businesses represented in the dataset?
  • What are the top 5 countries with the most diverse range of business categories?
  • Identify any categories that have seen a significant increase or decrease in representation over time.
  • Analyze the distribution of businesses within each category by continent.
  • Explore the relationship between business age and category size.
  • Identify any trends in the founding dates of businesses within each country.
  • Investigate the evolution of dominant business categories across different country groupings.
  • Use window functions to calculate the “rolling age” of each business category within each country.

Problem Statement and Objectives

Our primary aim is to dissect the patterns and characteristics that have enabled certain businesses to endure for centuries. The specific objectives of this analysis include:

  • Investigating the commonalities among the world’s oldest businesses.
  • Analyzing the founding years and distribution of these businesses across categories and countries.
  • Uncovering any discernible trends or anomalies that may provide insights into the factors contributing to long-term success.

Data Structure and SQL Environment

The business dataset, meticulously compiled for exploration, has been seamlessly integrated into a PostgreSQL database. The database structure is designed with clarity, featuring three key tables:

  1. Categories:
  • category_code: Code for the category of the business.
  • category: Description of the business category.

2. Countries:

  • country_code: ISO 3166–1 3-letter country code.
  • country: Name of the country.
  • continent: Name of the continent that the country exists in.

3. Businesses:

  • business: Name of the business.
  • year_founded: Year the business was founded.
  • category_code: Code for the category of the business.
  • country_code: ISO 3166–1 3-letter country code.

This dataset for the study was sourced from Kaggle and is available for download here

A database schema diagram
Business Database Schema Diagram

Data Cleaning

The Oldest Businesses dataset has been thoroughly examined, comprising 3 tables and I’m pleased to report that the data is clean and well-suited for analysis. During our investigation, we found no significant issues such as duplicates, missing values, or incorrect data types.

In terms of consistency and accuracy, the tables show no outliers or anomalies that could impact our study. The data appears well organized, and there are clear relationships between the tables, contributing to its reliability.

Fortunately, the data didn’t require any cleaning or transformation efforts, as all columns contained the correct data types and values. We made a few minor adjustments to enhance the structure and organization of the tables for improved readability.

Overall, I have high confidence in the accuracy of the information, and I believe it will be a valuable asset to our study.

Questions and Answers

What are the unique business categories in the dataset?

We use DISTINCT to find the unique business categories in the dataset.

Unique business categories in the dataset — Image by the Author

How many businesses are there in each category?

We use COUNT and GROUP BY to see how many businesses are in each category.

Business count per category — Image by Author

Which category has the highest number of businesses?

We use COUNT, GROUP BY, and ORDER BY to find the category with the highest number of businesses.

Category with Most Businesses — Image by Author

How many unique countries are covered in the dataset?

We use DISTINCT to find the unique countries covered in the dataset.

Unique countries in the dataset — Image by Author

Which continent has the highest number of businesses?

We use COUNT, GROUP BY, and ORDER BY to find the continent with the highest number of businesses.

Continent with Most Businesses — Image by Author

What is the distribution of businesses across continents?

We use COUNT, GROUP BY, and ORDER BY to see the distribution of businesses across continents.

Business distribution across continents — Image by Author

What is the range of founding years for businesses in the dataset?

We use MIN and MAX to find the earliest and latest founding years, defining the range.

Range of Founding Years — Image by Author

How many businesses were founded in each decade?

We use COUNT, EXTRACT, and GROUP BY to find how many businesses were founded in each decade.

Businesses founded each Decade — Image by Author

What is the average founding year of businesses?

We use AVG and GROUP BY to find the average founding year of businesses.

Average Founding Year — Image by Author

What are the 10 oldest businesses in the dataset, and what are their categories and countries?

We use ORDER BY, LIMIT, and JOIN to find the 10 oldest businesses and their respective categories and countries.

Oldest Businesses — Image by Author

Which categories have the highest percentage of businesses that have survived for over 200 years?

We use COUNT, FILTER, and Percentages to determine which categories have the highest percentage of businesses that have survived for over 200 years.

Businesses that have survived for over 200 years — Image by Author

How has the average age of businesses in each category changed over the past 100 years?

We use AVG, EXTRACT, GROUP BY, and WHERE to analyze how the average age of businesses in each category has changed over the past 100 years.

Average age of businesses over the last 100 years — Image by Author

What are the most common business categories within each country?

We use COUNT, GROUP BY, ORDER BY, and INNER JOIN to find the most common business categories within each country.

Most common business category within each country — Image by Author

Which countries have the most diverse representation of business categories?

We use COUNT, GROUP BY, ORDER BY, and INNER JOINS to find countries with diverse representation across various business categories.

Unique representation of business categories across various countries — Image by Author

Identify any countries where a single category dominates the oldest businesses.

We use COUNT, GROUP BY, ORDER BY, and INNER JOIN to identify countries where a single category dominates the oldest businesses.

County where a single category dominates — Image by Author

Have any business categories consistently produced long-lasting businesses across multiple countries?

We use COUNT, GROUP BY, JOIN, and WHERE filtering conditions to determine if certain business categories consistently produce long-lasting businesses across multiple countries.

Business categories consistently producing long-lasting businesses — Image by Author

Which categories have the most significant variation in business longevity across different countries?

We use STDDEV, GROUP BY to identify categories with the most significant variation in business longevity across different countries.

Longevity Variation — Image by Author

Which countries have the most businesses represented in the dataset?

We use COUNT, GROUP BY, ORDER BY to find countries with the most businesses.

Countries with Most Businesses — Image by Author

What are the top 5 countries with the most diverse range of business categories?

We use COUNT, GROUP BY, ORDER BY to find the top 5 countries with a diverse range of business categories.

Top 5 Countries with Diverse Business Categories — Image by Author

Identify any categories that have seen a significant increase or decrease in representation over time.

We use COUNT, GROUP BY to identify categories that have seen a significant increase or decrease in representation over time.

Categories with Increase or Decrease in Representation Over Time — Image by Author

Analyze the distribution of businesses within each category by continent

We use COUNT, GROUP BY, JOIN to analyze how businesses are distributed within each category across different continents.

Distribution of Businesses within Each Category by Continent — Image by Author

Explore the relationship between business age and category size.

We use AVG, COUNT, GROUP BY to explore the relationship between the age of businesses and the number of businesses in each category.

Relationship Between Business Age and Category Size — Image by Author

Identify any trends in the founding dates of businesses within each country.

We use COUNT, GROUP BY to identify patterns and trends in the founding dates of businesses within each country.

Trends in Founding Dates of Businesses Within Each Country — Image by Author

Investigate the evolution of dominant business categories across different country groupings

We use COUNT, GROUP BY, JOIN to investigate how dominant business categories have evolved across different country groupings (e.g., continents, economic regions).

Evolution of Dominant Business Categories Across Country Groupings — Image by Author

Use window functions to calculate the “rolling age” of each business category within each country.

We use window functions, AVG to calculate the “rolling age” of each business category within each country, identifying periods of accelerated or decelerated growth in average age

Rolling Age of Business Categories — Image by Author

Thank you for coming to my Analysis of the World’s Oldest Businesses.

--

--

Favour Igwezeke

I help build the crossroads of data science, health informatics and art. And enjoy this thing called Life.