Use of semantic annotation in news monitoring explained
Aug 10, 2017

Our service uses advanced natural language processing technology to analyze every news article entering its news feed. This involves several steps, one of which is semantic annotation: the computer-based recognition of key terms appearing in the articles. Semantic annotation is how the system understands what an article is about, which in turn enables the sorting and analysis functions that deliver value to users of our news monitoring service.

Annotation of news articles is not new

Annotation of news articles is by no means a new idea. Virtually all news publishers currently tag their articles with keywords and assign them to specific categories (politics, business, sport, etc.) to allow quick identification of the topic covered and to make it easier to locate the article in the database at a later stage. The example below shows metadata in an article published by the Slovenian Press Agency.

For decades, media publishers have relied on manual annotation of articles. The main limitations of tagging articles by hand are speed (it takes additional time to enter the tags) and consistency. Two different authors may assign different keywords to articles discussing the same topic, and a single article may fit more than one category. An article about a doping scandal, for example, belongs in the sports section but may also qualify for the health or science sections of a news site.

Semantic annotation means automation…

Semantic annotation automates the process of tagging articles. Systems using artificial intelligence are taught to recognize concepts appearing in the articles. This involves identifying people, locations, organizations and other pre-determined concepts. The example below shows a semantically annotated article (you can try it on your own text on this demo page).

In the case of Event Registry, Wikipedia is used as the underlying knowledge base for recognizing concepts that appear in articles. Any concept that appears in the largest free online encyclopedia in the world can be recognized by the system in the news articles that enter the news feed. Semantic annotations can currently be provided for the 100 most spoken languages in the world.
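To make the idea of knowledge-base lookup more concrete, here is a minimal sketch in Python. It is not how Event Registry is implemented: the tiny `KNOWLEDGE_BASE` dictionary and the `annotate` function are hypothetical stand-ins for a system backed by the whole of Wikipedia, and the matching is plain case-insensitive string search rather than a trained model.

```python
import re

# Hypothetical miniature knowledge base: surface form -> (concept label, type).
# A real system would draw on millions of encyclopedia entries instead.
KNOWLEDGE_BASE = {
    "slovenian press agency": ("Slovenian Press Agency", "organization"),
    "ljubljana": ("Ljubljana", "location"),
    "doping": ("Doping in sport", "concept"),
}


def annotate(text):
    """Return (matched text, concept label, type) for each known term, in reading order."""
    found = []  # entries are (start, end, label, kind)
    # Try longer surface forms first so multi-word names win over their parts.
    for surface in sorted(KNOWLEDGE_BASE, key=len, reverse=True):
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Skip a match that overlaps a span we have already annotated.
            overlaps = any(match.start() < end and start < match.end()
                           for start, end, _, _ in found)
            if not overlaps:
                label, kind = KNOWLEDGE_BASE[surface]
                found.append((match.start(), match.end(), label, kind))
    found.sort()
    return [(text[start:end], label, kind) for start, end, label, kind in found]


if __name__ == "__main__":
    sample = "The Slovenian Press Agency reported a doping case in Ljubljana."
    for annotation in annotate(sample):
        print(annotation)
```

Every article that mentions a term gets the same concept label, which is the consistency benefit that manual tagging struggles to deliver.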

Automated systems improve the speed and consistency of tagging. Terms appearing in news articles are understood by the system in the same way every time, eliminating human variance. On the other hand, a key challenge for semantic annotation is disambiguation: recognizing that a term can have different meanings (e.g. Chicago the city or Chicago the musical). Systems attempt to overcome this by taking the surrounding context into account.

Semantic annotation is also an increasingly important feature for the world’s news media in their quest to use advanced technologies to optimize work processes, improve content delivery and drive additional traffic to their websites. For example, semantic annotation can be used to create special thematic pages on news websites, where users can browse new and historical content on the same topic, and to promote related content.
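The disambiguation challenge mentioned above can be illustrated with a small Python sketch. It assumes a very simple strategy, counting how many words of the sentence overlap with a hand-picked context vocabulary for each meaning. The `CANDIDATES` table and the `disambiguate` function are hypothetical, and production systems rely on far richer context models than this.

```python
import re

# Hypothetical candidate meanings for the ambiguous term "Chicago", each with
# a few context words that tend to appear near that meaning.
CANDIDATES = {
    "Chicago (city)": {"city", "illinois", "mayor", "police", "lake"},
    "Chicago (musical)": {"musical", "broadway", "theatre", "stage", "cast"},
}


def disambiguate(sentence, candidates=CANDIDATES):
    """Pick the meaning whose context words overlap most with the sentence.

    Returns None when no context word is present; ties go to the meaning
    listed first in the candidates table.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & context) for sense, context in candidates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(disambiguate("The mayor of Chicago met police chiefs by the lake."))
    print(disambiguate("Chicago returns to Broadway with a new cast."))
```

When the sentence offers no supporting context at all, the sketch declines to choose, which is usually safer than tagging the article with the wrong concept.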


