Creating a curriculum to help civil society become more data literate

In January 2017, I Hate Statistics was invited to organise an interactive workshop at the United Nations World Data Forum in Cape Town. We asked all participants which groups we should focus on and what data skills these groups should master. Here you will find a summary of the collective output of the session. Let’s work together to help everyone become data literate!

Meaningful data requires interpretation, which requires data literacy.

In January 2017, the United Nations and Statistics South Africa organised the first World Data Forum. The goal was to bring together all kinds of organisations that will collect, analyse, share and interpret data about the 17 Sustainable Development Goals (SDGs), like improving access to clean water, ensuring more gender equality and getting rid of extreme poverty by 2030.

We believe data is a great way to keep track of our progress on these SDGs. The world has selected 17 important goals to reach. Now let’s collect and share the evidence to see how we are progressing and where we all need to focus our energies.

But we believe just collecting and sharing the data is not enough. Data is valuable only if it is interpreted correctly. Making data meaningful requires interpretation, which in turn requires adequate data literacy skills.

Me (Pim Bellinga, co-founder of I Hate Statistics) presenting at the United Nations World Data Forum on working together to help everyone become more data literate

That is why we were invited by the UN to host a workshop. We decided to make it interactive, inviting everyone to contribute their input and collaborate on a collective curriculum to make everyone more data literate. To all who contributed their valuable input, thank you very much!

In our presentation below we present examples that illustrate why collecting and sharing data is not enough. They show why we need to spend our energy on ensuring adequate data literacy to interpret SDG data.

Here you can find the presentation that we gave at the first United Nations World Data Forum in Cape Town.

Who to focus on? (which groups?)

After showing the audience why data interpretation and data literacy skills are so important, we asked the room who will be interpreting the SDG data. It quickly became clear that lots of different people will be interacting with the data on the SDGs. That indicated that it was time for the first break-out.

We asked everyone: list all the different groups that you can think of that will interact with the SDG data. In just five minutes, the participants wrote out 137 post-its, listing all kinds of groups. Below I used databasic.io to create a word cloud that gives you a feel for the different groups that the participants listed.

A wordcloud of all the groups that participants mentioned on the post-its. Here is a link to the analysis. This wordcloud has been created using databasic.io

As you can see, a lot of different groups. When we asked what kind of skills these groups needed (reading, interpretation, or checking data), the results varied widely. This indicated that we needed to treat these groups differently. But which group should we focus our energies on?

The workshop participants decided that there are three big groups we should focus on first:

  1. Policy makers and policy advocates
  2. Students and teachers
  3. Journalists and citizens

You can see the number of votes per group category below:

Most participants (61% of votes) believe policy makers and policy advocates are the first ones to focus on, helping them to become more data literate.

Based on these votes, we divided the room into three sections. We asked everyone to go to the section for the group they had voted for.

Now it was time for the next question: which concepts should people in these groups master in order to correctly interpret the SDG data?

What to learn? (which concepts?)

The participants again produced an incredible number of post-its, listing 83 different concepts/skills, an average of 27 concepts/skills per group. These lists make up the curricula that the participants created collectively. We’re sharing the unfiltered lists below. As you will see, some concepts are listed for all three groups, while other concepts are deemed relevant for a particular group.

It’s a lot to take in. And to be honest: hard to summarise. I’m inviting everyone who thinks they can do a better job to please share their contribution in the comments. Here are my highlights:

  1. The groups require a different focus: policy makers/advocates on interpretation; students/teachers on having fun and working with data; journalists on critically checking interpretations made by others and finding stories in data.
  2. The learning goals can (loosely) be divided into four groups: 1) understanding how data can help you 2) knowing how to think critically about data (mastering the core concepts) 3) knowing how to visualise/tell stories with data 4) working with data.

Draft Curriculum 1: Policy makers and policy advocates

  • Statistical significance
  • Data reliability
  • Conduct + trend analysis
  • Interpret data accuracy
  • Understand data interpretation + question others
  • comparing different data
  • interrogate trends + themes
  • data analysis skills
  • design tools for capturing the required data
  • present data in clear engaging manner to public, media
  • relative value
  • link data to spatial interpretation
  • understand absolute vs relative figures
  • identify authentic data sources
  • verbal + written communication about data
  • making databased educated policy decisions
  • consider needs/gaps deriving from data / derived populations
  • question or validate interpretation by others
  • understand the need for disaggregated data (eg gender) to policy formulation and prioritisation
  • understanding issues from grassroots level, collecting and interpreting data in order to address gaps and needs in society
  • interpret public opinion on policies
  • scrutinise how data applies to different demographics
  • identify short term vs long term trends

Draft Curriculum 2: Students & teachers

  • read graphs and understand how changing the axis can visually exaggerate differences
  • how to tell a fun and interesting story with data
  • identify inconsistencies
  • statistical correlations can be artifacts
  • data collection tools/techniques
  • data analysis tools
  • ethical considerations in data collection
  • research fundamentals, sourcing + interpretation + importance of data in relation to technical strategy, skills, advancement + budgets
  • understand probabilities
  • connecting data to students everyday lives
  • understanding bias
  • data mining / interpretation
  • question sample / collection of data (size, bias etc)
  • turning question into studies > assumptions
  • data access
  • Train the trainer > train teachers and provide them with a support system to enhance the data literacy skills of students
  • critiquing data viz
  • understand the power of qualitative data
  • making data fun
  • going from data to story
  • using data to argue for change
  • making data relevant to students
  • understanding the data we generate as a byproduct everyday
  • learning data collection methodologies
  • gaps / risk analysis
  • read graphs / outliers
  • learn how to interpret the statistics from analysed data
  • importance of data sharing and proper data management
  • variation of concept of model + purpose communication of uncertainty risk (abs vs rel)
  • how data were collected + how to collect / source + types variation / visualisation + interaction / confounding factors + estimation + error

Draft Curriculum 3: Journalists and citizens

  • impact
  • sampling
  • critical thinking
  • domain expertise
  • Look beyond “sound bytes” > deep dives into complex issues
  • Basic understanding of major economic/social indicators (GDP, constant current value) unemployment, inflation, trade stats
  • ethics / confidentiality
  • importance of context, specificity of data
  • limitations of methods
  • percentages
  • changes (increase/decrease)
  • averages
  • basic statistics calculations
  • median, average, variance, indices, %variation
  • understanding sampling, regression, weighted, stats offices basic ….research thoroughly
  • read carefully, listen intensely, think progressively, write proactively
  • correlation & causation
  • Bullshit detector
  • where to access data & how to combine different datasets (e.g. demographics / census / crime / health)
  • percentage change vs percentage point (abs vs rel)
  • how to turn data into stories
  • find story in data
  • find data to backup stories
  • importance of sample size
  • data visualisation (a picture tells a thousand words)
  • referencing
  • mean / median / mode
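
To make one of the listed concepts concrete, here is a minimal Python sketch of the difference between a percentage point change (absolute) and a percentage change (relative). It was not part of the workshop output; the function names and the unemployment figures are hypothetical and chosen purely for illustration.

```python
def percentage_point_change(old_rate, new_rate):
    """Absolute difference between two rates that are both given in percent."""
    return new_rate - old_rate


def percentage_change(old_rate, new_rate):
    """Relative change of new_rate compared with old_rate, expressed in percent."""
    return (new_rate - old_rate) / old_rate * 100


# Hypothetical figures: an unemployment rate that rises from 4% to 6%.
old_rate, new_rate = 4.0, 6.0

print(percentage_point_change(old_rate, new_rate))  # 2.0 percentage points
print(percentage_change(old_rate, new_rate))        # 50.0 percent
```

The same change can therefore be reported as "up 2 percentage points" or "up 50%", which is exactly why readers need to know which of the two a headline is using.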

How to achieve data literate societies?

The big challenge is how to help everyone in these groups become data literate. Depending on the group, we are talking about thousands (policy makers) or sometimes millions (students) of people. This is a massive challenge, one that requires the commitment and energy of a lot of different organisations and people. We will also need to combine all the best practices that have been developed, both offline and online.

Have fun! One of the participants, Rahul Bhargava from the MIT Media Lab, made a notable point that the most important thing to do is to make data fun for people. We must first engage people, interest them in using data, create a culture of using data to investigate and using data to tell compelling stories. This is likely best achieved in offline workshops, tailored to the culture and situation of the target audience. The workshop by Rahul Bhargava and Natalie Shoup was a great example of how one can stimulate people to have fun and be creative with data and tell stories.

Scalable practice: in addition to fun, offline workshops, we know that data and statistics require practice in order to really master the concepts. You can hear about sampling variation, but it often takes people multiple encounters and deep immersion before they really grasp the concept. That is where we at I Hate Statistics can help. We have created an online learning platform, where we have helped thousands of students to practice working with data and statistics.
Recently we also started SnapStat, where we are creating interactive, visual explainers to help journalists and citizens better understand data and statistical concepts.
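
As a rough illustration of what sampling variation means, the following Python sketch draws many samples from one simulated population and compares how much the sample means vary for small and large samples. This is only a sketch under stated assumptions, not how the I Hate Statistics platform or SnapStat works: the population, its mean of 50 and spread of 10, the sample sizes, and the helper `sample_means` are all made up for the example.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (without replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 10,000 values scattered around 50.
population_rng = random.Random(42)
population = [population_rng.gauss(50, 10) for _ in range(10_000)]

# The same population, sampled 200 times with two different sample sizes.
small = sample_means(population, sample_size=10, n_samples=200)
large = sample_means(population, sample_size=1000, n_samples=200)

# The spread of the sample means shows how much an estimate can differ
# from one sample to the next.
print("spread of means, n=10:  ", statistics.stdev(small))
print("spread of means, n=1000:", statistics.stdev(large))
```

Running it shows that means based on small samples jump around far more than means based on large samples, even though every sample comes from the same population.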

With our new project SnapStat we help journalists and citizens to better understand data and statistical concepts. Depicted is a screenshot from our first interactive explainer that we’re launching together with a journalist about sampling variation.

Let’s collaborate to help everyone become data literate!

As people have been fighting for centuries to help everyone become literate with words, we believe we must now do the same to help everyone become data literate!

If you want to help or are already working to help more people become data literate, please email us at hello@ihatestatistics.com and let’s see how we can work together.

Together we can help our societies become data literate!
