Harnessing Natural Language Processing to Decode Community Perspectives

Published in FSN Network, Aug 1, 2023

By: Nael Jean-Baptiste (Senior FSL MEAL Advisor, Save the Children) and Meghan Pollak (Nutrition Senior Advisor, Save the Children)

In the challenging landscape of conflict-affected regions, Save the Children is making strides to improve food security and bolster community resilience in northern Mali. With that goal, Albarka, a five-year Resilience Food Security Activity (RFSA) funded by USAID’s Bureau for Humanitarian Assistance (BHA), is designed to strengthen local systems and foster community participation. Central to the program’s objectives is the enhancement of feeding practices for vulnerable populations, including infants, women, and youth, with the aim of mitigating the impact of food and nutrition security shocks. To gain valuable insights and actionable information, 120 focus group discussions (FGDs) were conducted with community members, exploring strategies to promote favorable behaviors related to dietary diversity, exclusive maternal breastfeeding, handwashing, household nutrition dialogue, and sanitation.

Navigating Challenges: Ensuring Quality Information Analysis

Confronted with an abundance of FGDs and a vast volume of resulting data, the program faced a fundamental question: How can the quality of information analysis be guaranteed? The subjective nature of FGD analysis, influenced by personal beliefs, academic backgrounds, and experiences, can introduce biases. Additionally, selective focus on certain aspects may lead to an incomplete and biased representation of the original data.

Overcoming Challenges: The Power of Natural Language Processing

To address these challenges head-on, the program adopted a hybrid approach, combining a paper-based data collection system with Natural Language Processing (NLP), a branch of artificial intelligence (AI). The chosen method of analysis was frequency-based extractive summarization of the FGDs. A facilitator guided each focus group using an interview guide, while a note taker recorded responses in a notebook. These notes were subsequently transcribed into an Excel matrix, with each focus group assigned a row and responses organized into columns. The aggregated text was then preprocessed: irrelevant words were filtered out, and an NLP algorithm split the text into sentences and words (tokenization). This generated an initial output in the form of a “bag of words,” from which a summary was crafted by selecting the most significant and frequently occurring phrases or expressions. This approach relies on the assumption that critical information is often repeated, so the most frequent content yields a concise summary capturing the essence of the original. Notably, these summaries highlighted barriers and small doable actions (SDAs) that were frequently mentioned by community members.

Graphic showing the paper-based and NLP hybrid approach process, featuring six stages: FGD transcripts, Excel data entry matrix, text, frequency-based NLP algorithm, bag of words, and text summary.
Paper-based and NLP hybrid approach (Photo Credit: Albarka)
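
The article does not publish the program's code, so the following is only a minimal sketch of what a frequency-based extractive summary can look like in Python, using the standard library alone. The stop-word list, the function names, the column name diet_barriers, and the three sample responses are all invented for illustration; they are not Albarka data or the team's actual implementation.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list; a real analysis would use a
# fuller list in the language of the transcripts.
STOP_WORDS = {"a", "an", "and", "are", "for", "is", "of", "the", "to", "we"}


def split_sentences(text):
    """Naive sentence tokenizer: split after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokenizer (letters only, accents kept), minus stop words."""
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, n_sentences=2):
    """Return (bag_of_words, summary_sentences) for the given text."""
    sentences = split_sentences(text)
    # Bag of words: how often each remaining word occurs in the whole text.
    bag = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average corpus frequency of the sentence's words.
        words = tokenize(sentence)
        return sum(bag[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])  # restore original reading order
    return bag, [sentences[i] for i in keep]


# Hypothetical stand-in for the Excel matrix: one row per focus group,
# one column per question. The responses are invented examples.
fgd_matrix = [
    {"group": "FGD 1", "diet_barriers": "We lack money to buy vegetables. Men decide what food is bought."},
    {"group": "FGD 2", "diet_barriers": "Vegetables are expensive. We lack money for vegetables and meat."},
    {"group": "FGD 3", "diet_barriers": "The market is far away."},
]

# Aggregate one column across all focus groups, then summarize it.
text = " ".join(row["diet_barriers"] for row in fgd_matrix)
bag, summary = summarize(text, n_sentences=2)
print(bag.most_common(3))
print(summary)
```

Because the summary is made of sentences copied verbatim from the notes, every line in it can be traced back to a focus group. Averaging word frequencies rather than summing them is one design choice among several; it keeps long sentences from being favored simply for their length.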

Leveraging FGD Summaries for Actionable Plans

The Albarka Nutrition Team utilized these concise summaries to validate SDAs and developed a comprehensive workplan to serve as a roadmap for implementing new activities through local structures.

For example, through the FGD summaries, it became evident that a lack of knowledge and financial resources hindered efforts to diversify diets. Among the array of SDAs mentioned by the community, two recurring themes stood out: sensitizing men to support their wives in diversifying diets and promoting vegetable gardening activities for women. These actions were deemed practical and attainable by the community. As part of the program’s planned innovative initiatives, the Groupe de Soutien Aux Activités de Nutrition (Nutrition Support Groups, GSAN) members would facilitate discussions on household diet diversity within husband schools, aiming to encourage men to actively participate in supporting women caregivers during the critical first 1,000 days of a child’s life.

Men and women participate in a validation session. Three women are holding children on their laps, one man is writing on a notepad, while another woman is conducting an interview with the group.
Validation session (Photo credit: Souleymane Arouwani)

Evidence also showed that women often do not exclusively breastfeed children under six months because of lack of time and “stress”. Both male and female focus group members suggested increasing husbands’ support with household chores to alleviate the burdens on women, which would increase their time and motivation to exclusively breastfeed. The Albarka team plans to promote this small doable action and create an enabling environment for discussion among husbands during Husband School meetings on how they can rearrange their days to support their wives and other female family members.

Lessons Learned: The Value of Text Summarization Techniques

The use of frequency-based text summarization techniques for analyzing FGDs met several quality criteria. First, extracting important sentences directly from the original FGD text ensured reliability. Second, the approach minimized bias by reducing the variation introduced during summarization, which kept the extracted information consistent. Third, the subjectivity and biases stemming from personal prejudices, academic backgrounds, experiences, and cultural influences were mitigated, promoting objectivity in the analysis. Moreover, the automated nature of NLP saved significant time and incurred no additional costs, thanks to the use of Python, an open-source programming language widely used in data science and machine learning. Finally, the clear documentation of the analysis process allowed findings to be verified and replicated, ensuring transparency and accountability.

While integrating NLP techniques for data analysis proved valuable, the experience highlighted the importance of addressing inconsistencies throughout the data collection and input phases. Inconsistencies in formatting, spelling, font size, and sentence structure were identified as potential pitfalls that required careful mitigation. To overcome these challenges, it is crucial to establish clear guidelines and provide comprehensive training to interviewers, ensuring accurate and consistent data recording in notebooks and data entry masks. Furthermore, employing an online proofreader to review the generated text before running it through the algorithm emerged as a best practice, minimizing errors and enhancing data integrity. Last but not least, it is good practice for the technical team to validate the data and discuss trends and nuances further, to ensure that the Small Doable Actions selected for promotion during community-level activities make the most sense.
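
Some of these formatting inconsistencies can also be reduced automatically before the text reaches the algorithm. The sketch below is an assumption about what such a cleanup step might look like, not a description of the team's workflow; normalize_response is a hypothetical helper. It does not correct spelling, which still calls for a proofreading pass.

```python
import re
import unicodedata


def normalize_response(raw):
    """Apply basic, mechanical cleanup to one transcribed response."""
    # Unify Unicode forms (for example, non-breaking spaces become spaces).
    text = unicodedata.normalize("NFKC", raw)
    # Replace curly quotes and apostrophes with plain ones.
    for curly, plain in (("\u2018", "'"), ("\u2019", "'"),
                         ("\u201c", '"'), ("\u201d", '"')):
        text = text.replace(curly, plain)
    # Collapse runs of whitespace and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove stray spaces before punctuation marks.
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    # End every non-empty response with sentence-final punctuation so that
    # sentence splitting does not merge it with the next response.
    if text and text[-1] not in ".!?":
        text += "."
    return text


print(normalize_response("  Les  femmes n\u2019ont pas\u00a0le temps "))
```

Ending each response with a full stop matters for frequency-based summaries in particular: without it, two unrelated responses joined together would be scored as one long sentence.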

Unlocking New Avenues: NLP for Formative Research

The use of NLP techniques for summary generation is not limited to focus group discussions. It can also serve as a valuable tool in formative research, particularly during the Refine and Implement (R&I) phase of programs funded by the USAID Bureau for Humanitarian Assistance. By leveraging extractive summarization algorithms, program implementers can process text from a wide range of sources, including web pages, HTML, PDF, Word, Excel, CSV, plain text, XML, and responses from Application Programming Interfaces (APIs). This can make information gap analysis more effective and efficient, enabling informed decision-making and the design of targeted interventions that address the specific needs of communities.
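
As a rough illustration of how different sources can be reduced to plain text before summarization, the sketch below handles two of the listed formats, HTML and CSV, with Python's standard library. The helper names html_to_text and csv_to_text and the sample inputs are hypothetical; formats such as PDF, Word, and Excel would need third-party libraries and are not covered here.

```python
import csv
import io
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string as one line."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def csv_to_text(csv_string, column):
    """Join the non-empty values of one named CSV column into one text."""
    reader = csv.DictReader(io.StringIO(csv_string))
    return " ".join(row[column].strip() for row in reader if row.get(column))


# Invented sample inputs, for illustration only.
page = "<html><style>p{}</style><p>Vegetables are expensive.</p></html>"
table = "group,response\nFGD 1,We lack money.\nFGD 2,The market is far.\n"
print(html_to_text(page))
print(csv_to_text(table, "response"))
```

Once every source has been converted to a plain string in this way, the same summarization step can be applied regardless of where the text came from.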

Get Involved

Do you have an example of how your organization has integrated community perspectives in programming or used NLP for analysis, informed decision-making, and learning? We’d love to hear about it! Send it to njean@savechildren.org.

FSN Network

We engage the food security community to share knowledge and resources to support vulnerable households worldwide. Privacy: www.fsnnetwork.org/privacy-policy