AIDA — Artificial Intelligence for Development Analytics

UNDP’s new tool can help experts better understand their data

SDGCounting
Jun 8, 2023

Can an organization like the UN Development Programme (UNDP) have too much data? Probably not. But experts within UNDP’s Independent Evaluation Office (IEO), the group tasked with conducting objective project evaluations, found that the more reports they had access to, the harder it became to extract timely and relevant information, simply because of the work required to process it all.

Their solution? AIDA, short for Artificial Intelligence for Development Analytics. This tool, developed in partnership with the UN International Computing Centre and AWS, uses machine learning to provide streamlined access to detailed information from thousands of unstructured documents.

We originally referenced this technology in our recap of innovations at the UN World Data Forum. In the following article, we break down the tool in detail, explaining where it came from, what it does, how you can use it, and what to expect from it in the future.

More Data Can Mean More Problems

The United Nations, with its 30+ affiliated organizations and tens of thousands of employees, including experts in every field imaginable, is known for its prolific ability to create documents. UNDP’s Evaluation Resource Center (ERC), which tracks progress and implementation of worldwide development projects, contains over 6,000 documents on its own in a variety of file formats and report structures. With hundreds of thousands of pages of analysis available for just a small segment of the UN’s work, it regularly takes days of research just to understand the successes and failures of similar previous projects.

The knowledge contained in myriad technical documents — many being hundreds of pages long and requiring weeks of research to produce — is only useful if it is being accessed. According to a World Bank study of its own documents, 31% of policy reports are never downloaded and 87% are never cited.

Why is all of this knowledge going untapped? The short answer is that analyzing unstructured data takes too much time. It is not uncommon for the most important conclusions of a document to be buried deep in the text of a PDF. Finding them requires detailed searching by hand, and a manual search can easily miss important findings in tangentially related reports.

Recognizing the value of the information available, along with the challenges of working with data in diverse text-based documents, the staff at IEO set out to develop a new tool to make analysis easier. To do this, they leveraged the power of machine learning to extract, categorize and label the data found in their thousands of official documents.

What is truly groundbreaking about AIDA is its capacity to intelligently search and make sense of unstructured data from over 6,000 evaluation reports. AIDA can analyze this wealth of information right down to the paragraph level, pulling out precisely what is needed from vast swathes of information in seconds.
-Oscar A Garcia, UNDP/IEO

How Does AIDA Work?

While the specifics are quite technical, the concepts driving AIDA are straightforward.

First, reports are imported and converted from PDF or DOC into plain text. These text sources are further broken down into smaller components such as paragraphs and sentences. Using a variety of machine learning algorithms, these smaller elements are classified to specifically identify if they contain any findings, conclusions, or recommendations. They are also given thematic labels to aid in future searching. Throughout this process, a “human-in-the-loop” approach is employed wherein a trained individual checks the progress and gives feedback to further refine and improve outputs.
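The steps above can be pictured with a minimal Python sketch. It is not AIDA's actual pipeline: the keyword rules below are a simple stand-in for the machine learning classifiers described above, and every name in it (`CUES`, `split_paragraphs`, `split_sentences`, `classify`, `label_document`) is hypothetical and chosen only for illustration.

```python
import re

# Hypothetical cue phrases standing in for a trained classifier (an assumption,
# not AIDA's real model). Categories are checked in the order listed here.
CUES = {
    "recommendation": ("should", "recommend", "it is advised"),
    "conclusion": ("in conclusion", "we conclude", "overall,"),
    "finding": ("found that", "evidence shows", "results indicate"),
}


def split_paragraphs(text):
    """Split plain text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def classify(sentence):
    """Return the first category whose cue appears in the sentence, else 'other'."""
    lowered = sentence.lower()
    for label, cues in CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return "other"


def label_document(text):
    """Break a document into sentences and attach a paragraph index and label."""
    records = []
    for p_idx, paragraph in enumerate(split_paragraphs(text)):
        for sentence in split_sentences(paragraph):
            records.append(
                {"paragraph": p_idx, "sentence": sentence, "label": classify(sentence)}
            )
    return records


sample = (
    "The evaluation found that cash transfers reduced poverty. Costs were low.\n\n"
    "UNDP should expand the programme."
)
for record in label_document(sample):
    print(record["paragraph"], record["label"], "|", record["sentence"])
```

In a real system, a reviewer's corrections to these labels would be fed back to retrain the classifier, which is the role the “human-in-the-loop” step plays.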

Once these thousands of unstructured documents have been ingested, analyzed and labeled, they can then be searched in more powerful ways. UNDP has developed a web portal to make this easy. Rather than having to wade through hundreds of documents that a researcher may judge to be potentially useful based on a title, a user can simply search for key themes, returning specific relevant information that can be exported and then analyzed.
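To make the search-and-export idea concrete, here is a small Python sketch of filtering labeled excerpts by theme and exporting the matches. It assumes a simple in-memory list of excerpts; the data, field names, and the `search` and `to_csv` helpers are all hypothetical and do not reflect how the AIDA portal is actually built.

```python
import csv
import io

# Hypothetical excerpts as they might look after ingestion and labeling.
# Report names, labels, themes, and text are placeholders, not real AIDA data.
EXCERPTS = [
    {"report": "Report A", "label": "recommendation",
     "themes": {"cash transfers", "poverty"},
     "text": "Placeholder recommendation about cash transfers."},
    {"report": "Report B", "label": "finding",
     "themes": {"governance"},
     "text": "Placeholder finding about governance."},
    {"report": "Report C", "label": "conclusion",
     "themes": {"cash transfers"},
     "text": "Placeholder conclusion about cash transfers."},
]


def search(excerpts, theme, label=None):
    """Return excerpts tagged with `theme`, optionally restricted to one label."""
    return [
        e for e in excerpts
        if theme in e["themes"] and (label is None or e["label"] == label)
    ]


def to_csv(rows):
    """Serialize excerpts to CSV text so they can be analyzed elsewhere."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["report", "label", "themes", "text"])
    writer.writeheader()
    for row in rows:
        # Sets are not CSV-friendly, so join themes into one sorted string.
        writer.writerow({**row, "themes": ";".join(sorted(row["themes"]))})
    return buffer.getvalue()


hits = search(EXCERPTS, "cash transfers")
print(to_csv(hits))
```

The point of the sketch is that the search runs over labeled excerpts rather than document titles, so results arrive as specific passages instead of whole reports.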

In practice, AIDA feels like a simple search engine. Under the hood, however, it provides access to a much more complicated and powerful system for identifying important topics. Its power comes not from its ability to find phrases but from its ability to identify themes and, more importantly, summative statements.

For those wishing to dive deep into the technical underpinnings of this tool, we highly recommend the article on AIDA published by AWS, one of UNDP’s partners on this project.

How Can I Utilize AIDA?

Simple. Head over to the AIDA Landing Page and give it a try yourself:

AIDA: Artificial Intelligence for Development Analytics

UNDP has created a short tutorial that will quickly bring you up to speed on the interface.

It should be noted that AIDA is still primarily a text-based tool. It is designed to help researchers find relevant information from written reports. It is not designed to produce the type of tabular data one might use to track something like HIV infection trends over time. We recommend resources like the SDG Global Database for data like that.

By efficiently locating relevant information, users can quickly draw conclusions about work that has been completed previously.

AIDA has already proven successful in facilitating both inductive and deductive analysis approaches. Through the inductive approach, the system has facilitated the identification of emerging insights from an evidence base. Through the deductive approach, the system has facilitated the identification of evidence supporting previous statements.
-IEO

What’s Next in AIDA’s Development?

The features of AIDA described so far are just the Phase 1 implementation, according to the IEO. Phase 2 development is currently underway, and there are several improvements we can expect to see in the near future.

Increased Knowledge Base

The AIDA team plans to expand beyond the current corpus of ERC materials to include information from other sources. Because ingested reports need to be vetted, this will likely begin with other UNDP documents before expanding to other UN-affiliated agencies.

Algorithmic Improvements

Over the coming months we can expect to see improvements in the data pipeline from ingesting documents to extracting and labeling information. These incremental improvements will continue to increase the usefulness of the tool and the results.

Addition of Sentiment Analysis

It may be useful to quickly extract every conclusion and recommendation related to a program such as direct cash transfers, but how can we quickly learn if these approaches have been effective? By automatically classifying these textual conclusions as positive, neutral or negative, researchers can get an immediate feel for the findings.
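As a rough illustration of what such a classification could look like, the Python sketch below scores conclusions with a tiny word list. This is a deliberately simple lexicon approach, not the method UNDP is building; the word lists, the `sentiment` and `tally` functions, and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical word lists; a production system would use a trained model instead.
POSITIVE = {"effective", "improved", "successful", "strengthened"}
NEGATIVE = {"ineffective", "failed", "delayed", "weak"}


def sentiment(text):
    """Label text positive, negative, or neutral by counting lexicon matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tally(conclusions):
    """Count how many conclusions fall into each sentiment class."""
    return Counter(sentiment(c) for c in conclusions)


# Placeholder conclusions written for this example, not taken from any report.
conclusions = [
    "The cash transfer programme was effective and improved school attendance.",
    "Implementation was delayed and monitoring was weak.",
    "The programme ran for three years.",
]
print(tally(conclusions))
```

A tally like this gives the "immediate feel" described above, though a word list cannot handle negation or context, which is one reason a learned classifier would be preferable in practice.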

Translation

Currently, 20% of UN documents are in a language other than English. UNDP is working on adding translation tools to AIDA so that non-English documents can also be searched and used.

Summarization and Insight Generation

At this point AIDA’s focus has been on information identification. It is still up to researchers to interpret the data and draw their own conclusions. The development team is exploring options to automatically summarize returned results, identify previously unknown thematic connections, and generate key insights. Given the recent impressive outputs we have seen from tools like ChatGPT, this seems like a natural next step. However, that brings us to our next point.

Continued Ethical Considerations

While tools like ChatGPT have shown themselves to be adept at tasks like summarizing documents, they are not without their faults. For instance, when an algorithm is determining what is important from a list of conclusions, the risks of bias (known or unknown) cannot be overlooked. Likewise, these tools have been known to hallucinate conclusions. This may not matter much if you are using ChatGPT to write a poem about geometry proofs, but when it comes to international development, it is a risk that cannot be ignored.

For this reason, the developers of AIDA are continuing to invest time and resources into ensuring any AI tools are being used ethically.

API Implementation

A final item of note that may be in the development pipeline is an API that would allow users to build their own tools around this technology.

Learn More

All of the Phase 2 improvements mentioned above were discussed at the 2023 UN World Data Forum. If you want to learn more about AIDA and its development, we highly recommend watching the stream of that session.

Conclusion

AIDA represents an innovative leap forward in managing the massive volumes of information the UN generates. By utilizing machine learning algorithms to effectively analyze, categorize, and label data within thousands of diverse documents, AIDA is poised to significantly streamline the process of extracting meaningful information from this expansive knowledge repository.

While already a powerful tool in its current form, AIDA’s future developments promise even greater usefulness going forward. Our team looks forward to following the progress and encouraging the development of other similar tools that help harness the power of data to advance international development.

SDGCounting is a program of StartingUpGood and tracks the progress of counting and measuring the success of the SDGs. Check us out on Twitter.
