Mapping the geospatial patterns of broadcasted news

Jan Tschada
Geospatial Intelligence
Jan 17, 2022

Mapping the geospatial patterns of broadcasted news allows a geospatial analyst to gain insights into common and unusual geospatial patterns. We decided to use one of the most comprehensive news collections, the “Global Database of Events, Language, and Tone” (GDELT), as the ground truth.

Kalev Leetaru, the creator of the GDELT project, wanted to construct a highly interconnected knowledge graph of human behaviour and beliefs. This massive knowledge graph stores the daily broadcasted news data from across the entire planet and contains news data from 1979 to the present. Over the years, the data volume and frequency of news media increased exponentially.

GDELT Data Volume Chart

The geography of worldwide news coverage

Understanding the geospatial patterns of such a massive knowledge graph is often difficult for non-geospatial experts. Machine learning algorithms extract articles and features from websites in real time. The geocoding engine matches extracted locations against over eleven million well-known place names. An article mentioning a place like “New York” leads to an extracted feature location. However, the article does not have to be specifically about the named location. For instance, an article regarding “The impact of the COVID pandemic on capital markets”, which mentions that “New York is no exception”, leads to a location match. We should expect some false positives, but the sum of all extracted locations should reflect a geospatial pattern and give us a coarse-grained overview.

Conflict and Mediation Event Observations Taxonomy

Originally, a National Science Foundation (NSF)-funded project on the study of interstate conflict mediation developed the Conflict and Mediation Event Observations (CAMEO) taxonomy. The project members took care to create a coherent and complete coding scheme describing all kinds of events. They reviewed existing coding schemes and redesigned actor-specific event codes.

After three years of fine-tuning the coding schemes and coding dictionaries for events dealing with international mediation, they showed that their coding scheme worked well for studying political conflicts as well. The underlying ontology comprises roughly 15,000 phrases. More recently, they aligned the entire coding scheme with the state-of-the-art best practices of the natural language processing community.

Collecting and importing the broadcasted news

We designed a simple cloud-native serverless architecture for collecting the broadcasted news. Every day between midnight and 1 AM, a timer-based trigger executes a data collection workflow. The workflow starts by collecting the worldwide broadcasted news related to protests and demonstrations of the last 24 hours using the raw GDELT data files. The corresponding CAMEO codes match the pattern 14**. In a subsequent step, we filter this data collection and keep only the most precisely extracted locations. We ensure that every extracted location is a match at the city level, or even more precise. This means a news report mentioning “demonstrations in Nevada” would not lead to an extracted feature match.
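
The filtering step can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual workflow: it assumes the events were already parsed into dictionaries keyed by the GDELT event column names `EventRootCode` and `ActionGeo_Type`, and that the geo type codes 3 and 4 denote a US city and a world city. The function names and the sample records are made up for illustration.

```python
# Assumed GDELT geo type codes: 3 = US city, 4 = world city.
# Country and state level matches (1, 2, 5) are treated as too coarse.
CITY_LEVEL_GEO_TYPES = {"3", "4"}
PROTEST_ROOT_CODE = "14"  # CAMEO root code behind the pattern 14**


def is_city_level_protest(event: dict) -> bool:
    """Return True for protest events whose location matched at city level."""
    root_code = str(event.get("EventRootCode", "")).strip()
    geo_type = str(event.get("ActionGeo_Type", "")).strip()
    return root_code == PROTEST_ROOT_CODE and geo_type in CITY_LEVEL_GEO_TYPES


def filter_events(events):
    """Keep only the city-level protest events from an iterable of dicts."""
    return [event for event in events if is_city_level_protest(event)]


# Made-up sample records, not real GDELT rows.
sample_events = [
    {"EventRootCode": "14", "ActionGeo_Type": "4", "ActionGeo_FullName": "Paris, France"},
    {"EventRootCode": "14", "ActionGeo_Type": "2", "ActionGeo_FullName": "Nevada, United States"},
    {"EventRootCode": "04", "ActionGeo_Type": "4", "ActionGeo_FullName": "London, United Kingdom"},
]
kept = filter_events(sample_events)
```

In this sample only the Paris record survives: the Nevada record is a state-level match, and the London record is not a protest event.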

We import all extracted feature matches into a secured, cloud-hosted feature service. A feature service offers access to the underlying features using a standardized REST API. We registered an app with a unique client ID and client secret. The serverless functions use this client ID and secret to get write access to the feature service. Any other app or service must use a registered API key with read-only feature access.
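
A client ID and client secret are typically exchanged for a short-lived access token before writing features. The following is a minimal sketch of such an exchange, assuming an ArcGIS-style OAuth 2.0 token endpoint and the client credentials grant. The article does not name the platform, so the endpoint URL, the function name, and the placeholder credentials are assumptions. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: an ArcGIS-style OAuth 2.0 token endpoint.
TOKEN_URL = "https://www.arcgis.com/sharing/rest/oauth2/token"


def build_token_request(client_id: str, client_secret: str) -> Request:
    """Build a POST request that exchanges app credentials for a token."""
    form = urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
        "f": "json",  # ask for a JSON response
    }).encode("utf-8")
    return Request(TOKEN_URL, data=form, method="POST")


# Placeholder credentials; real values should come from a secret store.
request = build_token_request("my-client-id", "my-client-secret")
```

Passing the request to `urllib.request.urlopen` would be expected to return a JSON document containing the access token, which the serverless function can then attach to its write requests.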

Mapping the geospatial patterns daily

The broadcasted news related to protests and demonstrations on October 17th, 2021 shows a hotspot in France. This day marked the 60th anniversary of the October 17th, 1961 massacre in Paris, when the French police suppressed peaceful Algerian demonstrators supporting the independence movement in their country. The massacre remains a shameful stain on France, because crimes against humanity do not expire. The Algerian Information Ministry stated that the Algerian demonstrators in France were civilians who were subjected to brutality, torture, and killing. The memory of the Paris massacre, when demonstrators were killed and thrown into the Seine for supporting the Algerian War of Independence, is still alive after 60 years. On October 17th, 2001, 40 years after the massacre, the mayor of Paris erected a plaque at Pont Saint-Michel in remembrance of the lives lost.

Map of October 17th, 2021

On October 27th, Animal Rebellion supporters held a march where 7,000 people turned up to show their support. They sat on the pavement and cheered, while more than a dozen police officers stood below with two technical advisors from the fire brigade in London. Half a dozen other Animal Rebellion protestors stood behind a police cordon and demanded government support for a plant-based food system. The activists were in their 20s and 30s and hung a banner from buildings in London. The map shows the hotspot in London with 433 mentions from various news articles.

Map of October 27th, 2021

The Rotterdam police said they arrested 51 people during violent eruptions at anti-lockdown protests on November 20th. Officers were among the injured, and units from across the Netherlands were dispatched to help restore order. At least two people were treated in hospital after being seriously injured. The officers issued an emergency ordinance in Rotterdam, shutting down public transport and ordering the protesters to go home. They also fired warning shots because, according to the police, the situation was life-threatening.

On the same day, a crowd of thousands of protestors marched along Oxford Street in London. They demanded the abolition of COVID passes. The demonstrations were part of a worldwide demonstration in over 160 cities, which is easy to see on the map of aggregated geographical features.

Map of November 20th, 2021

Inspecting the map at a larger map scale shows the underlying named locations as a feature layer visualized using a graduated renderer. The largest circle symbol represents locations with a count of over 250. On the 20th of November, the two hotspot locations are London and Rotterdam.
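
The idea behind a graduated renderer can be sketched in a few lines: count the mentions per named location and map each count to a symbol size class. This is a minimal sketch, not the renderer used for the maps. Only the threshold of 250 comes from the text above; the lower class breaks, the symbol sizes, and the mention counts are assumptions chosen for illustration.

```python
from collections import Counter

# (upper bound of the class, symbol size). Only the 250 threshold comes from
# the article; the lower breaks and all sizes are assumptions.
CLASS_BREAKS = [(10, 4), (50, 8), (250, 16)]
LARGEST_SYMBOL_SIZE = 32  # used for counts of over 250


def symbol_size(count: int) -> int:
    """Return the symbol size of the first class whose upper bound fits."""
    for upper_bound, size in CLASS_BREAKS:
        if count <= upper_bound:
            return size
    return LARGEST_SYMBOL_SIZE


def count_mentions(location_names) -> Counter:
    """Count how often each named location was mentioned."""
    return Counter(location_names)


# Invented mention counts, not the real numbers behind the map.
mentions = ["London"] * 300 + ["Rotterdam"] * 260 + ["Oxford"] * 12
sizes = {name: symbol_size(count) for name, count in count_mentions(mentions).items()}
```

With these invented counts, London and Rotterdam both fall into the largest class, while Oxford gets a small symbol.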

Map of named locations on November 20th, 2021

On December 24th, the number of published news articles related to protests is very low. In the days before Christmas, many COVID-related demonstrations took place. The geospatial aggregations and the extracted hotspot locations represent a coarse-grained geographical overview of protests worldwide. Mapping the aggregated features of the related news on December 24th allows a non-geospatial expert to identify this cold spot. The appearance of cold spots and hotspots is an important criterion for analysing time-based or seasonal data.

Map of December 24th, 2021

Accessing the geospatial features using the geoprotests API

The geoprotests API offers ready-to-use geospatial features representing broadcasted news related to protests and demonstrations. You can use these geospatial features to build various mapping and geospatial applications.

Every geospatial result supports the GeoJSON and Esri FeatureSet formats out of the box. All endpoints support an optional date parameter for filtering the results. For best performance, the serverless cloud backend calculates the geospatial aggregations of the last 24 hours between midnight and 1 AM. The serverless functions keep these geospatial features for the last 90 days, so yesterday is the latest available date. If you do not specify a date, the backend has to calculate the geospatial features of the last 24 hours on the fly.
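
A client can check the date window before sending a request. The following is a minimal sketch of that check, assuming the precomputed window runs from 90 days ago up to yesterday. The query parameter names `date` and `format`, their values, and the function names are hypothetical; the actual names are defined by the geoprotests API documentation.

```python
from datetime import date, timedelta
from typing import Optional

RETENTION_DAYS = 90  # precomputed features are kept for the last 90 days


def precomputed_window(today: date):
    """Return the (oldest, latest) dates that have precomputed features."""
    latest = today - timedelta(days=1)  # yesterday is the latest date
    oldest = today - timedelta(days=RETENTION_DAYS)
    return oldest, latest


def build_query(requested: Optional[date], today: date,
                result_format: str = "geojson") -> dict:
    """Build query parameters (hypothetical names) for one request.

    Leaving out the date asks for the last 24 hours, calculated on the fly.
    """
    if result_format not in ("geojson", "esri"):
        raise ValueError(f"unsupported format: {result_format}")
    params = {"format": result_format}
    if requested is None:
        return params
    oldest, latest = precomputed_window(today)
    if not oldest <= requested <= latest:
        raise ValueError(f"date must be between {oldest} and {latest}")
    params["date"] = requested.isoformat()
    return params


query = build_query(date(2021, 11, 20), today=date(2021, 12, 1))
```

Rejecting out-of-range dates on the client side avoids a round trip for requests that cannot be served from the precomputed features.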

Summary

  • Mapping the geospatial patterns of broadcasted news by using the GDELT knowledge graph
  • Serverless cloud-based architecture collects and calculates the geospatial patterns daily
  • Getting a coarse-grained geospatial overview of protests and demonstrations of the last 90 days
  • Easy-to-use access through the geoprotests API hosted on RapidAPI
