Creating Data Visualizations — Expected Years of Schooling Per Gender in 2005

Dania Yamout
Published in Data and Society
7 min read · Mar 14, 2018

Retrieved from https://www.slideshare.net/saikiranhrinarthini/gender-inequality-in-india-70202655

I produced three data visualizations with Infogram to take a deeper look at how the legal equality of women is compromised in the world today.

The topic I decided to tackle is how gender equality is compromised for women, specifically in education and in the number of years that girls attend school. I took all of my data from the World Bank’s Gender Data Portal (http://datatopics.worldbank.org/gender/). To take a global view of the issue, I divided my sample into seven aggregate world regions: North America, Middle East & North Africa, Sub-Saharan Africa, South Asia, Latin America & Caribbean, East Asia & Pacific, and Europe & Central Asia. To keep things simple, I limited the time sample to a single year: 2005.

As a starting point, I compared the expected years of schooling for females with those for males across these world regions, to see the differences and possible discrepancies. Putting the two side by side also gives a better comparison for visual purposes and makes the differences easier to see.

The Process:

I used the World Bank website as the primary source for my data. Setting a series of search parameters gave me a downloadable spreadsheet with the expected years of schooling for females and for males, per region, for the year 2005.

The data collated provided the following spreadsheet:

Retrieved from http://datatopics.worldbank.org/gender/

This in turn was revised and cleaned up:

Then it was further revised so that the data could be entered into Infogram to create the data visualizations:
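For readers who would rather script this clean-up than do it by hand, here is a minimal Python sketch. It is not how the spreadsheet above was prepared. The column names ("Series Name", "Country Name", "2005 [YR2005]"), the ".." marker for missing values, the footer line, and the region labels are assumptions about what such an export might look like, and every number is a placeholder rather than a World Bank figure.

```python
import csv
import io

# Hypothetical excerpt of a World Bank-style export. Column names, the footer
# line, and all values are placeholders, not the figures used in this article.
RAW_EXPORT = """Series Name,Country Name,2005 [YR2005]
"Expected years of schooling, female",Region A,14.0
"Expected years of schooling, male",Region A,13.5
"Expected years of schooling, female",Region B,9.0
"Expected years of schooling, male",Region B,10.5
"Expected years of schooling, female",Region C,..
,,
Data from database: Gender Statistics,,
"""


def tidy_export(raw_text, year_column="2005 [YR2005]"):
    """Return {region: {"Female": years, "Male": years}} from a long-format export."""
    table = {}
    for row in csv.DictReader(io.StringIO(raw_text)):
        series = (row.get("Series Name") or "").lower()
        region = (row.get("Country Name") or "").strip()
        value = (row.get(year_column) or "").strip()
        # Skip footer lines, blank rows, and missing values (assumed to be "..").
        if not region or value in ("", ".."):
            continue
        # Test for "female" first, because "male" is a substring of "female".
        if "female" in series:
            gender = "Female"
        elif "male" in series:
            gender = "Male"
        else:
            continue
        table.setdefault(region, {})[gender] = float(value)
    # Keep only regions that have both values, so every row can be charted.
    return {region: years for region, years in table.items() if len(years) == 2}


cleaned = tidy_export(RAW_EXPORT)
for region, years in cleaned.items():
    print(f"{region}: Female={years['Female']}, Male={years['Male']}")
```

The result has one row per region with a Female and a Male value, which is the kind of compact two-column layout that is easy to paste into a charting tool.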

As the data shows, there is a rather happy surprise in North America and in Europe & Central Asia, where the gender gap is actually skewed towards females rather than males. Latin America & Caribbean and East Asia & Pacific appear to have perfect gender equality in the expected years of schooling. At the other end of the scale, the Middle East & North Africa, Sub-Saharan Africa, and South Asia still have some way to go to ensure better gender equality in the amount of education provided to females.
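The reading above comes down to subtracting the male figure from the female figure for each region and looking at the sign. A small Python sketch of that arithmetic follows. The function name, the region labels, and the numbers are all hypothetical placeholders chosen only to show the three possible outcomes; they are not the values behind the charts in this article.

```python
def describe_gap(female_years, male_years):
    """Return (gap, direction), where gap = female minus male expected years."""
    gap = round(female_years - male_years, 2)
    if gap > 0:
        direction = "skewed towards females"
    elif gap < 0:
        direction = "skewed towards males"
    else:
        direction = "equal"
    return gap, direction


# Placeholder values only: (female, male) expected years of schooling.
sample = {
    "Region A": (14.0, 13.5),
    "Region B": (12.0, 12.0),
    "Region C": (9.0, 10.5),
}

for region, (female, male) in sample.items():
    gap, direction = describe_gap(female, male)
    print(f"{region}: gap of {gap:+.1f} years, {direction}")
```

A positive gap means girls can expect more years of schooling than boys, a negative gap means fewer, and zero means the two are equal.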

The following metadata was used:

The following are screen captures of the metadata that the World Bank site provided alongside the data once the search parameters were set. The metadata is important because it gives much-needed background information about the data. Please note that the site offers more information than fits into these screen captures.

Metadata for Expected Years of Schooling, Female:

Retrieved from http://datatopics.worldbank.org/gender/

Metadata for Expected Years of Schooling, Male:

Retrieved from http://datatopics.worldbank.org/gender/

The above information was then used as a foundation to create the following data visualizations on Infogram:

Data Visualization 1A: Expected Years of Schooling per Gender in 2005

The above data visualization was created using the stacked bar chart on Infogram. It was created using the following data:

Data Visualization 1B: Expected Years of Schooling per Gender in 2005

The above data visualization was created using the bar chart on Infogram. It was created using the following data:

Data Visualization 1C: Expected Years of Schooling per Gender in 2005

The above data visualization was created using the line chart on Infogram. It was created using the following data:

As you can see, these data visualizations are very clear and effective in displaying the inequalities in the expected years of schooling between the genders. However, these charts are not perfect and can sometimes leave important information out.

While creating these data visualizations, I had to test a number of different charts to see which one displays my data most effectively and most accurately. Data Visualization 1D below shows where the limitations of this process can become problematic:

Data Visualization 1D: Expected Years of Schooling per Gender in 2005

At first glance, Data Visualization 1D looks great. The small dots are neat, colorful, and easy to understand. However, on closer inspection, one can see that this dot chart does not account for cases where the two numbers are the same or, to put it more bluntly, where the genders are equal.

The dots for Latin America & Caribbean and for East Asia & Pacific match neither the purple that the chart’s legend assigns to females nor the green that it assigns to males. Instead, they appear in a new greenish-grey color, created by Infogram, that effectively forms a new category, which can be called equality. This category had to be created to account for cases in which the data for females and males are equal: it shows the regions where the number of school years is the same for both.
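One way to avoid leaving this case to the chart tool is to make the tie explicit in the data before charting it. The Python sketch below is an illustration under my own assumptions, not something Infogram does or offers: the function name, the "Equal" label, and the placeholder regions and values are all hypothetical.

```python
def dot_chart_points(rows):
    """Turn (region, female, male) rows into (region, category, value) points.

    Regions where both values match get a single "Equal" point, so the chart
    legend can name that case instead of showing an unexplained color.
    """
    points = []
    for region, female, male in rows:
        if female == male:
            points.append((region, "Equal", female))
        else:
            points.append((region, "Female", female))
            points.append((region, "Male", male))
    return points


# Placeholder rows only, not the values charted in this article.
rows = [
    ("Region A", 14.0, 13.5),
    ("Region B", 12.0, 12.0),
]

for region, category, value in dot_chart_points(rows):
    print(f"{region:<10} {category:<7} {value}")
```

With a third named series in the data, the legend can show three entries (Female, Male, Equal), and a reader no longer has to guess what the extra color means.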

I created two more data visualizations based on two other factors that portray gender inequality when it comes to education.

Data Visualization 2: Progression to Secondary School (%) per Gender in 2005

As with Data Visualization 1A, the data was provided by the World Bank site, downloaded into a spreadsheet, cleaned up, and used to create the data visualization in Infogram. All this was done to depict the percentage of each gender that progressed to secondary school in 2005 across the different world regions.

This includes the following metadata for both sets of data:

Metadata for Progression to Secondary School (%), Female:

Retrieved from http://datatopics.worldbank.org/gender/

Metadata for Progression to Secondary School (%), Male:

Retrieved from http://datatopics.worldbank.org/gender/

Data that was used to create Data Visualization 2:

Data Visualization with Supporting Data in Infogram:

Data Visualization 3: Children out of Primary School Per Gender in 2005

This includes the following metadata for both sets of data:

Metadata for Children out of Primary School, Female:

Retrieved from http://datatopics.worldbank.org/gender/

Metadata for Children out of Primary School, Male:

Retrieved from http://datatopics.worldbank.org/gender/

Data that was used to create Data Visualization 3:

Data Visualization with Supporting Data in Infogram:

The three data visualizations depict gender inequality in education. By comparing the seven major world regions, we can see more clearly the gender inequality that existed in 2005 and better understand the factors that contribute to it, such as the expected years of schooling, the percentage of students progressing to secondary school, and the number of children who are out of primary school.

By focusing our attention on the specific regions of the world that need improvement, these visualizations can help in making more informed decisions to reduce this inequality. This may mean that more resources from the World Bank and other relevant institutions are needed.

As we can see, data visualizations can be extremely effective at presenting potentially complex data in a visually attractive way. However, one must also be aware that, effective as they are at showing what is visible, they have limitations and may not always tell us the full story.
