Data Literacy — The ABCs in the Age of Data (Part I)

Word cloud of feedback for the Data Literacy ePrimer, an eLearning course we designed

Helping someone navigate and understand what Data Science and AI really are can be difficult amid the hype surrounding them.

We started out with a simple ambition: to help public officers understand more about Data Science & AI (“DSAI”). By the end of our journey, we had created an eLearning course with about 4 hours of videos, diagrams, and explanations, covering a broad range of topics: DSAI use cases, the importance of data, visual analytics principles, Machine Learning, and a framework for scoping a data project. When we received positive reviews and feedback for the Data Literacy ePrimer, we knew that the effort of more than a year was well worth it.

Here’s a look back at how we built up the content. We hope it inspires the public officers reading this to pick up the ePrimer. We believe it is important knowledge that will serve you well going into the age of data.

The Problem

The Data Science & AI Division at GovTech has been collaborating with public officers across various agencies on projects for a few years. We also have a thriving community of enthusiasts and practitioners whom we engage regularly. Over the years of project collaboration and community engagement, we slowly developed a sense of what the common misconceptions about data are, and of what baseline knowledge would really help raise the overall understanding of data across the public service.

We call that pool of knowledge data literacy. Unfortunately, this body of knowledge is not something you learn about comprehensively in most data-related courses out there.

In our framework for data training, Data Literacy and applied analytics skills are equally important.

You can probably attend half-day seminars about how cool Data Science & AI is, but you may not gain a deeper appreciation of why good data is of paramount importance.

You can attend 3-day courses on advanced analytics features in Excel to work with real data, but you won’t learn why some charts and dashboards just help people understand the presented insights better.

You can also attend 5-day courses on visual analytics principles and how to use tools to visualise data, but you will not be able to peel open the black box of how machine learning helps you predict outcomes.

How many of these courses can a person realistically attend? Going for one or two such courses is often inadequate to understand the full picture behind what makes Data Science & AI useful.

It’s no wonder management can’t understand why their “trained” staff did not come back with the right skills to perform magic with data.

The Need

Collectively, the team has delivered more than 30 talks on Data Science & AI at government agencies over the years. With such requests from agencies increasing, we knew that delivering talks in person was not sustainable in the long run.

It was easy to convince people that Data Literacy primer sessions were important. After all, demand was high and the feedback had been very positive. The harder part was convincing people that we should invest resources to build a course with our own hands. The arguments against building our own content were often variants of “but there are so many learning materials out there. Why not curate some of the good articles and videos and point people to them?” Or “do we think we can do a better job than what is already readily accessible?”

Well, there are a lot of free <insert topic here> tutorials on YouTube. Will simply pointing people to all the different videos create more experts in <topic>?

The truth is, while there is a lot of learning content out there, hopping all over the place to learn is simply a bad experience. Stitching together videos and articles to create a “Frankenstein course” is a sure way to turn people off. Technical jargon that is used differently across platforms can be confusing. Inconsistencies in the aesthetics and style of the learning content can also be annoying. Notably, the context and examples are seldom public sector-related, which makes it harder for public officers to relate to them.

So we knew we had to do something ourselves. With support from our bosses, we began our work. Along the way, we learnt about similar efforts elsewhere in the world, which reinforced our resolve to get this done quickly, and well. For example, Finland rolled out an online AI course (Elements of AI) with partners to help people understand AI better.

Photos of a textbook on AI for high school kids in China. The pages shown explain classification using machine learning.

China had AI textbooks for high school kids, and even similar books tailored for kindergarten kids! Some time later, I chanced upon one such book and saw charts and diagrams explaining how machine learning works. I thought to myself then: if kids and youths in China are already learning this, what excuse do we adults, who handle far more data in our day-to-day professional work, have?

The Goal

We quickly settled on the high-level goals for the Data Literacy ePrimer that we wanted to develop:

  1. Easy and non-intimidating. No experience required, no maths, no code.
  2. Casual and fun. No serious business talk, interesting analogies to explain complex terminology, peppered with some light-hearted jokes.
  3. Short and compartmentalised. Everyone is busy, so keep overall length to just half a day (~4 hours). Break up the course into coherent modules. Then break them up further into bite-size sections.
  4. Handy and accessible. Access the online course easily on the go. Learn when you have time, stop when you are busy.

With those lofty goals in mind, the next-order questions were these:

  1. The Frame — How do we narrow it down to what is absolutely needed, when there is so much content to deliver?
  2. The Delivery — People are busy and we’re competing for their time. How do we present the content better so they REALLY want to try it?

The Frame

With the content of the primer talks we were delivering at the various agencies, and the vast experience built up from engaging public officers on data projects and community activities, framing the delivery of the course turned out to be an easy decision.

We quickly settled on the following 5 modules, each tackling an important aspect of understanding Data Science and AI better:

  • Module 1 — Clarify the jargon and terminology (Data Science, AI, analytics). Inspire people with use cases where DSAI worked well. And don’t just show the good stuff; remind them of use cases where DSAI failed.
  • Module 2 — Highlight the importance of data. Why is clean, high-quality data so important? Why do Data Scientists spend so much time preparing the data?
  • Module 3 — Appreciate visual analytics principles and why we should all apply them to the charts and dashboards we build.
  • Module 4 — Gain an intuition about Machine Learning algorithms. Why do they work the way they do? In what situations do they not perform as well?
  • Module 5 — Inspire them to get started on data projects. Present them with a Project Scoping framework and methodology that they can use to get started immediately.

Attained bosses’ support … good to go. Resources … secured. Goals … checked. Framing of the contents … done. We seemed to be on the highway to success!

Little did we know that we had grossly underestimated the effort required to complete this…

To be continued in Data Literacy — The ABCs in the Age of Data (Part II)

Excited to find out more? See what some people had to say about our Data Literacy ePrimer before heading over to Part II:
