Photo by amol sonar on Unsplash

Connectivity for All! Assessment of the Digital Divide and Its Impact on Human Development

Applying machine learning to better understand the impact of the digital divide.

Ahmed Jyad
Published in
6 min read · Feb 20, 2021


Authors: Ahmed Jyad, Gian Atmaja, Alla Sailer, Nishrin Kachwala

Image Source: Nishrin Kachwala and Pixabay

During the COVID-19 pandemic, connectivity became more critical than ever for people around the world. Many companies switched to working from home, and many schools switched to homeschooling over different platforms. People fortunate enough to have good digital connectivity continued to work or attend classes online as if nothing had happened. Those without adequate connectivity were left behind, further widening the gap. This gap between people with internet access and people without it is called the digital divide.

Several factors contribute to the digital divide, including technical limitations, such as electricity access and internet speed, and government restrictions on online information.

In this AI challenge with UNDP (United Nations Development Programme), 50 technology changemakers built machine learning models to identify the impact of technical restrictions and censorship on the Human Development Index (HDI).

The Problem

The Human Development Index measures the well-being of countries. This index consists of three dimensions: a long and healthy life, knowledge, and a decent standard of living. We think every country’s goal should be to raise its HDI and improve the well-being of its citizens.
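For context, since 2010 UNDP has aggregated the HDI as the geometric mean of three dimension indices (health, education, and income), each scaled to the range 0 to 1. The sketch below illustrates only that aggregation step. The function name and the example index values are made up for illustration and are not taken from the challenge data.

```python
def hdi_from_indices(health: float, education: float, income: float) -> float:
    """Geometric mean of three dimension indices, each expected in [0, 1]."""
    for value in (health, education, income):
        if not 0.0 <= value <= 1.0:
            raise ValueError("dimension indices must lie in [0, 1]")
    return (health * education * income) ** (1.0 / 3.0)

# Hypothetical index values, for illustration only.
example = hdi_from_indices(health=0.80, education=0.60, income=0.70)
print(round(example, 3))

# A geometric mean penalizes imbalance: one weak dimension drags the
# composite below the simple average of the three indices.
unbalanced = hdi_from_indices(health=0.90, education=0.90, income=0.30)
print(unbalanced < (0.90 + 0.90 + 0.30) / 3)
```

Because of the geometric mean, a country cannot fully offset a very low score in one dimension with high scores in the others.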

But what role does connectivity play in all that?

Connectivity is access to information and communication technologies. Connectivity can improve people’s lives through access to information and education, but insufficient quality, breadth, or infrastructure can limit its availability.

Even when people have access and a good connection to the internet, some governments still censor or block the information available on it.

We want to identify how technical restrictions and censorship relate to the HDI, and how they affect connectivity.

We aim to answer these questions using machine learning and to learn how this could contribute to the United Nations’ approach to bridging the digital divide.

Image Source: Omdena

The flowchart above shows our process for analyzing the digital divide based on internet restrictions and limitations. Our analysis was divided into two parts:

  • Technical limitations
  • Internet restrictions

Technical Limitations

Data and Model Exploration

The primary data source was the Inclusive Internet Index (3i), a publicly available dataset published by The Economist Intelligence Unit. The 3i covers 100 countries and mainly includes indicators of internet availability, affordability, relevance, and readiness in these countries.

After preprocessing the chosen data, we investigated several regression models, including Lasso, Ridge, Decision Trees, Random Forest, and XGBoost, to predict HDI values. Eventually, we chose the Lasso regression model for its interpretability and its strong performance on a held-out test set. This model predicts HDI values with an R-squared of 0.94 and a mean absolute error (MAE) of 0.02.
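The two metrics quoted above are standard. R-squared is the share of variance in the true HDI values that the predictions explain, and MAE is the average absolute prediction error in HDI units. Here is a minimal sketch of both, using made-up values rather than the challenge’s test set. In practice a library such as scikit-learn provides equivalents (`r2_score` and `mean_absolute_error`).

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical HDI values and predictions, for illustration only.
y_true = [0.50, 0.60, 0.70, 0.80, 0.90]
y_pred = [0.52, 0.58, 0.71, 0.83, 0.88]

mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MAE = {mae:.3f}, R-squared = {r2:.3f}")
```

Since HDI ranges from 0 to 1, an MAE of 0.02 means predictions are off by about two hundredths of the scale on average.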

Feature Importance

Image Source: Omdena

After trimming the features to just the most important ones, those with the highest predictive power were:

  • Internet users (% of the population).
  • Fixed-line broadband subscribers (per 100 inhabitants).
  • Total electricity access (% of the population).

The graph above shows these relationships.

The visual indicates that HDI values are higher in countries with a higher percentage of internet users, more electricity access, and more fixed-line broadband subscribers. High values on all three tend to coincide with a high HDI.
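Part of what makes Lasso interpretable is that its L1 penalty shrinks weak coefficients to exactly zero, which trims the feature set automatically. The toy sketch below shows that effect with a plain coordinate-descent solver on synthetic, standardized data. It is an illustration under those assumptions, not the team’s pipeline. The data, the `alpha` value, and the function names are hypothetical, and a real project would more likely use scikit-learn’s `Lasso`.

```python
from itertools import product

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=50):
    """Minimize (1/(2n)) * ||y - mean(y) - Xw||^2 + alpha * ||w||_1.

    Assumes every column of X is already centered (mean zero) and not all zero.
    """
    n, d = len(X), len(X[0])
    y_mean = sum(y) / n
    centered_y = [value - y_mean for value in y]
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (centered_y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w

# Synthetic, standardized features (hypothetical, not the 3i data):
# two features with a real effect and one with a very weak effect.
X = [list(row) for row in product([-1.0, 1.0], repeat=3)]
y = [0.70 + 0.10 * a + 0.05 * b + 0.01 * c for a, b, c in X]

weights = lasso_coordinate_descent(X, y, alpha=0.02)
print([round(w, 3) for w in weights])
```

In this toy setup the two stronger coefficients are shrunk by `alpha`, while the weakest one is set to exactly zero and drops out of the model.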

Can the data show us how lower-HDI countries can narrow their digital divide?

Image Source: Omdena

For lower-HDI countries, we believe that the first step is to improve electricity access, since electricity opens the door to other technologies, such as the internet. Rural areas may be affected by the lack of electricity more than urban areas.

As the plot above shows, many countries have lower rural electricity access than total electricity access.

The plot shows the percentage of the rural population with electricity access (on the y-axis) versus the percentage of the urban population with access (on the x-axis). Countries that lie below the grey line have lower access in rural populations than in urban ones. Nearly all countries lie below this line, meaning that electricity access is higher in urban areas than in rural regions for most nations.
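The reading of the plot can be expressed as a simple check. Assuming the grey line is the parity line where rural access equals urban access, a country falls below it when its rural percentage is smaller than its urban percentage. The country names and numbers below are placeholders, not values from the dataset.

```python
def below_parity(countries):
    """Return the share and names of countries where rural access < urban access.

    `countries` maps a country name to (urban_pct, rural_pct).
    """
    below = sorted(
        name for name, (urban, rural) in countries.items() if rural < urban
    )
    return len(below) / len(countries), below

# Hypothetical electricity access figures (% of population), for illustration.
access = {
    "Country A": (99.0, 62.0),
    "Country B": (100.0, 100.0),
    "Country C": (85.0, 40.0),
    "Country D": (97.0, 91.0),
}

share, names = below_parity(access)
print(f"{share:.0%} of countries fall below the parity line: {names}")
```

The size of the gap (urban minus rural) would be a natural next quantity to rank countries by when prioritizing rural electrification.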

Estimating the cost of closing the gap in rural electricity is tricky because several factors influence electricity access. Low access may be due to low incomes or simply to a lack of adequate infrastructure. The cost of building that infrastructure depends heavily on an area’s terrain, accessibility, and other factors.

Internet Restrictions

For this analysis, data was sourced from Freedom House’s Internet Freedom Scores, which include information on internet censorship and restrictions across different countries. It turns out that internet censorship provides some predictive power for HDI, but these features are better used to complement the infrastructure and technical limitation features noted earlier than in a standalone model.

Insights

Although the model didn’t perform as desired, we could still generate some valuable insights from the data. As seen in the graph below, countries with fewer internet limitations averaged a higher HDI than those with more limitations.

Image Source: Omdena

Further, we can observe that countries that restrict online activities less (lighter colors, scores 5–6) were associated with higher HDI values than those that restrict them more.

Image Source: Omdena

Another example is restrictions on communication and encryption. As in the case above, countries that place the fewest restrictions on communication and encryption (score 4) were associated with higher HDI values.

Image Source: Omdena
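The comparisons in this section amount to averaging HDI within each restriction score. Here is a minimal sketch of that grouping, assuming (as in the charts) that a higher score means fewer restrictions. The records are made up for illustration and do not come from the Internet Freedom Scores.

```python
from collections import defaultdict
from statistics import mean

def mean_hdi_by_score(records):
    """Average HDI for each restriction score, ordered by score."""
    groups = defaultdict(list)
    for score, hdi in records:
        groups[score].append(hdi)
    return {score: mean(values) for score, values in sorted(groups.items())}

# Hypothetical (score, HDI) pairs; higher score = fewer restrictions.
records = [
    (1, 0.55), (1, 0.61),
    (3, 0.70), (3, 0.74),
    (6, 0.88), (6, 0.92),
]

averages = mean_hdi_by_score(records)
for score, avg in averages.items():
    print(f"score {score}: mean HDI {avg:.2f}")
```

A grouped average like this shows an association only; it does not by itself say whether fewer restrictions lead to higher development or the reverse.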

Solution

The two models above fulfill the main objective of this task: to provide evidence that technical limitations and internet censorship are related to the HDI, and that reducing either factor could result in increased development in a country.

The digital world has created a platform to expedite access to education, healthcare, transport, food, and more. Providing access to the internet could make these essentials more accessible, resulting in advances across the factors that drive high development. Greater access to internet content enables people to make better-informed decisions, which can be a driving force for a developing country. Sources such as political news and social media are essential for insight into current affairs, and denying access to them would be detrimental to society.

Outlook

A significant factor to keep in mind is the current state of the world. The digital world has helped keep the world afloat during the COVID-19 pandemic by ensuring that education, commerce, health care, food, essential items, and entertainment remained available while people could not leave their homes. It’s hard to imagine not having these essentials; however, the truth is that many people do not have access to the internet. Given this lack of availability, it is reasonable to assume that the drop in development was considerably larger in countries with low access to internet services and information. This pandemic has reminded us that providing access to the internet should be a priority in every country.

References:

https://theinclusiveinternet.eiu.com/explore/countries/performance

https://freedomhouse.org/countries/freedom-net/scores

More about Omdena

Source: Omdena homepage

Want to work with Omdena as an organization or AI engineer?

Learn more at www.omdena.com


Ahmed Jyad, writer for Omdena

Data Scientist | Basketball Enthusiast | Comic Book Geek