Published in Pulse Lab Jakarta

Adapting to Data-Driven Diplomacy with Machine Learning

Developed by Pulse Lab Jakarta in collaboration with the Indonesian Ministry of Foreign Affairs and the Ministry of National Development Planning, this machine learning-based data visualisation tool enables timely processing of thousands of documents from its global outposts to prioritise issues and highlight trends for diplomatic engagement.
Screenshot showing more detailed information in the form of a word cloud and a bar chart related to a specific focus area, in this case the protection of Indonesian citizens and legal entities of Indonesia (using sample declassified documents).
  • The scanned documents consist of unstructured text in a format that machines cannot read directly, so additional conversion steps are required before classification;
  • The conversion process involves a number of steps: first, the scanned documents are converted into images, and then these images are converted into plain text. Inaccuracies often occurred in the conversion from image to text;
  • Some documents were up to 10 pages long, and the number of pages affects the time needed to convert a scanned PDF to text;
  • Manually labelling the more than 5,000 declassified documents according to the set of predefined categories was very labour-intensive; and
  • Determining the accuracy and relevance of the keywords associated with each category required human assessment, rather than relying only on how frequently a term occurred.
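
The conversion and keyword steps listed above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names (`pdf_to_text`, `tokenize`, `top_keywords`), the stop-word list, and the pluggable `rasterize` and `ocr` stages are all assumptions made for the example, since the article does not name the libraries that were used.

```python
import re
from collections import Counter
from typing import Callable, Iterable, Sequence

# Hypothetical stop-word list for illustration; a real one would be much
# longer and specific to the language of the documents.
STOP_WORDS = frozenset({"the", "of", "and", "in", "to", "a", "for", "on"})


def pdf_to_text(
    pdf_path: str,
    rasterize: Callable[[str], Sequence[object]],
    ocr: Callable[[object], str],
) -> str:
    """Convert a scanned PDF to plain text: PDF -> page images -> text.

    Both stages are supplied by the caller (assumption: in practice these
    would be a PDF rasteriser and an OCR engine). The work grows with the
    number of pages, because each page is processed separately.
    """
    pages = rasterize(pdf_path)  # one image per page
    return "\n".join(ocr(page) for page in pages)


def tokenize(text: str) -> list[str]:
    """Lower-case the text and keep ASCII alphabetic tokens longer than two
    letters that are not stop words. OCR noise can still slip through."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOP_WORDS]


def top_keywords(
    labelled_docs: Iterable[tuple[str, str]], n: int = 5
) -> dict[str, list[tuple[str, int]]]:
    """Return the n most frequent tokens per category.

    labelled_docs yields (category, text) pairs, i.e. documents that have
    already been labelled by hand.
    """
    counts: dict[str, Counter] = {}
    for category, text in labelled_docs:
        counts.setdefault(category, Counter()).update(tokenize(text))
    return {category: c.most_common(n) for category, c in counts.items()}
```

A frequency count like this only produces candidate keywords. As the last point notes, a person still has to judge whether each frequent term is accurate and relevant to its category.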
