StoryLab Academy: Data-driven Journalism with Lam Vo

BuzzFeed Fellow Lam Thuy Vo hosted a social media mining data-journalism Masterclass in Nairobi

In late July 2017 in Nairobi, a group of Kenyan journalists, journalism students, techies, and I (the token layman) had the honour of being under the tutelage of Lam Thuy Vo, an Open Labs Fellow for BuzzFeed, in two Masterclass sessions sponsored by Code for Africa as part of the just-launched StoryLab Academy program. The program, launched in June 2017, aims, among other things, to train journalists to incorporate data tools into their storytelling as a means to steer newsrooms into the 22nd century and beyond.

Data-driven journalism is a term used to describe a journalistic process based on analyzing and filtering large data sets for the purpose of creating a news story. Main drivers for this process are newly available resources such as open source software, open access publishing and open data. [Source: Wikipedia]

The Masterclass: Day 1

When you meet Lam Vo, the first thing that strikes you about her is the sheer enthusiasm she has for whatever she’s talking about. Whether the subject is videography, storytelling, writing code, data, or her love for cat gifs, she tells every story with the passion of a natural educator.

This, along with her very helpful online tip sheets, made it easier for laymen such as myself, and for those new to data-driven journalism, to follow the lessons.

We were eased into the Masterclass with a comprehensive introduction to multi-platform storytelling, which laid the groundwork for how journalists can distill large volumes of data into bite-size pieces for their stories using methods such as quick and easy video shooting techniques, charts and maps (data visualisation), and movie-like action timelines.

Journalist Joseph Kobuthi weighs in during the session.

Lam took us through various ways to use layman-friendly formulae in Microsoft Excel and Google Sheets to clean, format, merge, and analyze data for reporting.

All this with interesting cat gifs along the way, of course! (BuzzFeed staff are slightly obsessed with cat gifs).
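
For readers who want to try the same clean, merge, and analyze steps outside a spreadsheet, here is a minimal Python sketch. It is not from Lam’s tip sheets: the rows, the party lookup table, and the function names are all made up for illustration, and the steps only roughly mirror spreadsheet functions such as TRIM, LOWER, VLOOKUP, and COUNTIF.

```python
from collections import Counter

# Hypothetical rows, as they might look when typed by hand into a spreadsheet.
raw_rows = [
    {"name": "  Jane Doe ", "handle": "JaneDoe"},
    {"name": "John Roe", "handle": "@johnroe "},
    {"name": "jane doe", "handle": "@janedoe"},  # duplicate of the first row
]

# Hypothetical lookup table to merge in, keyed by cleaned handle.
parties = {"@janedoe": "Party A", "@johnroe": "Party B"}


def clean_handle(handle):
    """Trim whitespace, lowercase, and make sure the handle starts with '@'."""
    handle = handle.strip().lower()
    return handle if handle.startswith("@") else "@" + handle


def clean_and_merge(rows, lookup):
    """Clean each row, drop duplicate handles, and attach the looked-up party."""
    seen = set()
    merged = []
    for row in rows:
        handle = clean_handle(row["handle"])
        if handle in seen:
            continue  # skip rows whose handle we have already kept
        seen.add(handle)
        merged.append({
            "name": row["name"].strip().title(),
            "handle": handle,
            "party": lookup.get(handle, "Unknown"),
        })
    return merged


politicians = clean_and_merge(raw_rows, parties)

# A simple analysis step: how many politicians per party?
party_counts = Counter(p["party"] for p in politicians)

if __name__ == "__main__":
    for person in politicians:
        print(person)
    print(dict(party_counts))
```

Deduplicating on the cleaned handle rather than the name is a design choice for this sketch: handles are unique on Twitter, while names are often typed inconsistently.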

At the end of the first day’s two-hour session, our assigned homework was to come up with a list of Kenyan politicians on Twitter and their Twitter handles. We were going to mine their Twitter timelines for data.

The Masterclass: Day 2

This is when things got technical.

We were introduced to coding and the myriad ways programming code can be used to create tools that, to quote Lam, “are relevant to journalistic storytelling.”

Data journalism is a journalism specialty reflecting the increased role that numerical data is used in the production and distribution of information in the digital era. It reflects the increased interaction between content producers (journalist) and several other fields such as design, computer science and statistics. [Source: Wikipedia]

Lam does her coding in Python, Ruby, and Java, and also works with HTML and CSS. Now, to a layman’s eyes, computer programming code looks something like this:

“The code is hidden in tumblers”

Fortunately, believe it or not, coding is not difficult to learn. As Lam explained, one can learn to code in a relatively short time, as she herself did, moving from a novice with zero programming experience to an intermediate-level sensei with her own advanced coding course available online on GitHub, the free code-hosting service.

Basic coding in Python. Simple, no?

It was at this point in the workshop that we submitted our previous day’s homework like the good students we were. Lam then selected several Kenyan politicians’ Twitter handles and fed them into a Python script that fetched a limited number of tweets* posted from those accounts and dumped them into a .csv file (think spreadsheet) for analysis.

*Twitter only allows access to a user’s most recent 3240 tweets with this method.
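
To give a feel for the “dump them into a .csv file” step, here is a minimal sketch in Python. It is not the script Lam used: the column names, the sample records, and the function name dump_tweets_to_csv are assumptions for illustration. Fetching real tweets requires Twitter API credentials and a client library, which this sketch leaves out.

```python
import csv

# Column names are an assumption for this sketch, not the workshop script's schema.
FIELDS = ["handle", "id", "created_at", "text"]


def dump_tweets_to_csv(tweets, path):
    """Write tweet records (dicts) to a CSV file; return the number of rows written."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for tweet in tweets:
            # Keep only the known columns; fill any missing ones with an empty string.
            writer.writerow({field: tweet.get(field, "") for field in FIELDS})
            count += 1
    return count


# Placeholder records standing in for whatever the Twitter API would return.
sample_tweets = [
    {"handle": "@example_politician", "id": "1",
     "created_at": "2017-07-25 09:00:00", "text": "Hello, Nairobi!"},
    {"handle": "@example_politician", "id": "2",
     "created_at": "2017-07-26 14:30:00", "text": "Town hall today."},
]

if __name__ == "__main__":
    written = dump_tweets_to_csv(sample_tweets, "tweets.csv")
    print(f"Wrote {written} tweets to tweets.csv")
```

The resulting file opens directly in Excel or Google Sheets, so the spreadsheet techniques from Day 1 apply to it unchanged.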

Our resident techie/coder Chege was in his element at this point as he shared some of the open data projects he is working on at Code for Africa, which have applications in Kenyan media houses.

At the end of the second day’s session, I felt confident that Kenya is on track to become a big player in the data journalism field. The journalists and journalism students who attended the Masterclass were buzzing with ideas for how to take their storytelling to the next level with data and visuals.

Lam with journalism students Ruby Abuor (L) and Soila Kenya (R) (Photo: Soila Kenya)

Lam Thuy Vo is a reporter and journalism educator working with BuzzFeed on their Open Labs program. She has worked with data and multimedia for publications such as The Wall Street Journal, NPR, and Al Jazeera America.

You can view some of her work here and some of her work with BuzzFeed here.

Kwasi Gachie is a Program Officer at Code for Africa, working with the pan-African Code for Africa Citizen Labs and StoryLab Academy teams in 12+ countries.

Special mention to Chege James for the technical support.

About the StoryLab Academy: Journalism is evolving so quickly that newsrooms are struggling to keep up. The StoryLab Academy seeks to help, by bringing face-to-face training into partner newsrooms across Africa, and by hosting public workshops at monthly Hacks/Hackers meetups.

The Academy offers online courses and webinars, designed to teach just one tool or technique at a time, so participants can upgrade their skills at their own pace.

The training is spearheaded by the continent’s largest digital journalism network, Code for Africa, with support from the Google News Lab and World Bank.


Code for Africa (CfA) is the continent’s largest federation of data journalism and civic technology laboratories, with labs in four countries and affiliates in a further six countries. CfA manages the $1m/year and $500,000/year, as well as key digital democracy resources such as the data portal and the election toolkit. CfA’s labs also incubate a series of trendsetting initiatives, including the PesaCheck fact-checking initiative in East Africa, the continental africanDRONE network, and the African Network of Centres for Investigative Reporting (ANCIR) that spearheaded Panama Papers probes across the continent. CfA is an initiative of the International Center for Journalists (ICFJ).

Google News Lab empowers the creation of media that improves people’s lives. Its mission is to collaborate with journalists and entrepreneurs everywhere to build the future of media with Google. It does this through product partnerships, media trainings, and programs that foster the development of the news industry as a whole. Google began its support for digital and data journalism in Africa in 2010 through intensive workshops and continues to offer newsroom-targeted trainings. It also supported innovateAFRICA’s predecessor, the African News Innovation Challenge, in 2012.

The World Bank Global Media Development Programme helps the media leverage digital technologies to strengthen its role as a driver of good governance. In Africa, this has included support for data-driven journalism training starting in 2011, as part of efforts to improve the media’s analytical capacity. The World Bank also works with African governments to help make data for decision-making on development and economic issues more easily available to citizens and the media. The World Bank’s support has included co-funding for the to build statistical capacity and data literacy amongst journalists, as well as support for the HURUmap initiative to make census and demographic data more easily available to African newsrooms.