AI and Natural Language Processing and Understanding for Space Applications at ESA

Part V: Assisted evaluation of the innovation potential of OSIP ideas

Andrés García-Silva
6 min read · Nov 2, 2022

By José Manuel Gómez-Pérez, Andrés García-Silva, Rosemarie Leone, Mirko Albani, Moritz Fontaine, Charles Poncet, Leopold Summerer, Alessandro Donati, Ilaria Roma, Stefano Scaglioni

Image source: esa.int

This post is a brief overview of a paper currently under review in a journal (see preprint here), in which we describe the joint work between ESA and expert.ai to bring recent advances in NLP to the space domain.

We have split the post into several parts:

Introduction

The Open Space Innovation Platform (OSIP) is the main entry point for novel ideas into ESA, both in response to specific problems and through open calls for ideas. Ideas submitted to OSIP are evaluated by a team of experts. The most novel, applicable and achievable proposals receive funding, e.g. to support research at PhD or post-doctoral level co-funded by ESA and a host university, to bootstrap early technology development activities, or to conduct system studies.

In this case study we seek to support OSIP evaluators by providing them with contextual information about previously funded ideas, studies, projects and research work that are semantically similar to the idea under evaluation. In addition, we provide evaluators with a novelty score that quantifies how innovative the idea under evaluation is likely to be. The rationale behind the score is that the less similar previous work exists, the higher the novelty of the idea.

Analysis

This case study also focuses on information extraction. However, here the main goal is to perform comprehension tasks related to understanding ideas and comparing them with other ideas and previous work. Since the novelty score needs to be justifiable, explainability is also important. We did not have access to annotated datasets for model training, which, in addition to the previous factors, argued for a knowledge-based approach to the language understanding challenges in this case study.

System Architecture

The architecture of the OSIP novelty evaluation service is depicted in Figure 1. The import module ingests data from the different sources considered: i) the OSIP platform to extract ideas and campaigns; ii) the Nebula library containing studies; and iii) documentation about the projects funded by the FP7 and H2020 EC programs.

Figure 1. System architecture

The text that is extracted from such sources, corresponding to ideas, campaigns, studies, and projects, is sent to an instance of the expert.ai text analytics services hosted at ESA’s facilities (Cogito Discover) that extracts semantic metadata from it using the extended knowledge graph. Figure 2 illustrates the metadata extracted by the analytics service for one of the ideas submitted to the OSIP platform.

Figure 2. Metadata extracted from an Idea in OSIP.

The resulting metadata is then stored in an Elasticsearch index, along with the original text of each document and the metadata extracted from its data source. Next, the novelty of each idea is evaluated based on the similarity of the idea under evaluation with other ideas that were previously selected, implemented or archived, with executed studies, and with previous FP7 and H2020 funded projects. Finally, once annotated, the ideas are pushed back into OSIP, including the novelty evaluation score, its associated metadata, and the related documents.
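To make the indexing step concrete, here is a minimal sketch of how an annotated document might be laid out before being sent to Elasticsearch. The field names, index name, and values are illustrative assumptions, not the actual OSIP schema; the real pipeline would then call something like `es.index(index="osip", id=..., document=doc)` against the cluster.

```python
def build_es_document(source, title, description, main_lemmas, main_concepts):
    """Assemble the payload indexed for each document: original text fields
    plus the semantic metadata extracted by the text analytics service."""
    return {
        "source": source,              # e.g. "osip", "nebula", "fp7", "h2020"
        "title": title,
        "description": description,    # original text of the idea/study/project
        "main_lemmas": main_lemmas,    # semantic metadata from Cogito Discover
        "main_concepts": main_concepts,
    }

doc = build_es_document(
    "osip",
    "Inflatable regolith-shielded structure",
    "Demonstration of radiation/thermal shielding ...",
    ["regolith", "shielding", "inflatable"],
    ["radiation protection", "lunar habitat"],
)
```

Keeping the textual fields and the semantic metadata side by side in one index is what later allows a single query to combine lexical and semantic similarity.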

Novelty score evaluation

We use a simple metric based on the intuition that if an idea is very similar to another idea, study or project, its novelty score should be low. Conversely, ideas that differ from previous activities are considered more novel.

Unlike text similarity metrics that rely exclusively on either lexical (keyword-based) or semantic signals, we leverage all the information previously indexed in Elasticsearch, which in our case includes both textual fields and the semantic metadata extracted by the Cogito Discover module.

We base the similarity calculation between an idea and other documents on the indexed main lemmas and main concepts. Additional key terms are obtained from titles and descriptions by selecting the words with the highest TF-IDF scores.
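The key-term selection step can be sketched with a small stdlib-only TF-IDF scorer. This is an illustrative simplification (no stemming, no stop-word removal, and it assumes the scored document belongs to the corpus so every term has a nonzero document frequency), not the production implementation.

```python
import math
from collections import Counter

def top_tfidf_terms(doc_tokens, corpus_tokens, k=5):
    """Return the k terms of doc_tokens with the highest TF-IDF scores.

    corpus_tokens is a list of token lists, one per document; doc_tokens is
    assumed to be one of them, so every term has document frequency >= 1.
    """
    n_docs = len(corpus_tokens)
    df = Counter()                       # document frequency per term
    for tokens in corpus_tokens:
        df.update(set(tokens))
    tf = Counter(doc_tokens)
    scores = {
        term: (count / len(doc_tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    return [t for t, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]]

corpus = [
    ["space", "debris", "removal"],
    ["space", "plastic", "detection", "satellite"],
    ["lunar", "regolith", "shielding"],
]
print(top_tfidf_terms(corpus[0], corpus, k=2))
```

Terms that appear in many documents ("space" here) get low IDF and are filtered out, which is exactly why TF-IDF is a reasonable proxy for "key" terms in a title or description.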

The following equation defines the novelty score of an idea i, based on its similarity (sim) with a collection of other ideas I, a collection of studies S, and a collection of projects P. We use Elasticsearch's similarity function, which in our instance is set to BM25.

noveltyScore(i, I, S, P) = 100 ∗ (1 − max{sim(i, I), sim(i, S), sim(i, P)})
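The equation translates directly into code. In the sketch below we take sim(i, C) as the best-matching similarity between the idea and any document in collection C, and assume similarities are normalized to [0, 1] (e.g. a normalized BM25 score); the function names are illustrative, not the production API.

```python
def novelty_score(idea_sims, study_sims, project_sims):
    """noveltyScore(i, I, S, P) = 100 * (1 - max{sim(i, I), sim(i, S), sim(i, P)}).

    Each argument is the list of similarity values between the idea under
    evaluation and the documents of one collection; an empty collection
    contributes 0 (i.e. it cannot lower the novelty score).
    """
    best_match = lambda sims: max(sims, default=0.0)
    best = max(best_match(idea_sims), best_match(study_sims), best_match(project_sims))
    return 100 * (1 - best)

# An idea whose closest previous work has similarity 0.249 scores ~75.1
print(novelty_score([0.10, 0.249], [0.05], []))
```

Because only the single best match matters, one very similar prior project drives the score down even if everything else in the corpus is unrelated.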

As an explanation of the novelty score, we provide evaluators with information about the similar ideas and projects we found. In addition, we show the metadata shared between the idea under evaluation and those similar ideas and projects, and highlight it in their descriptions.

Figure 3 shows a screenshot taken from the production environment of the OSIP platform for an idea with novelty score 75.1. The system has identified a somewhat similar idea, titled “Demonstration of radiation / thermal shielding with (small scale) inflatable gas tank + regolith sintered (with solar lens) structure”, previously funded by OSIP, and highlights the semantic metadata both ideas have in common, as well as the similarity score between them.

Figure 3. Novelty score and similar ideas in OSIP

OSIP evaluators also have access to a graph visualization (Figure 4) where they can easily see the projects and ideas most similar to the idea under evaluation. Such a graph is helpful to understand the research context of the idea and how it relates to previously funded research work.

Figure 4. Idea similarity graph visualization in OSIP

The idea similarity graph can be used to provide a high-level overview of the content of the ideas submitted to the platform. Nodes in the graph are ideas or projects, and an edge between two nodes exists when a similarity between them has been identified. To identify clusters of ideas we load the similarity graph into Gephi, a network analysis tool, and apply the Louvain method for community detection. Once we have the clusters of similar ideas, we aggregate the concepts representing each idea and choose the most frequent ones as the representative concepts of each cluster. A visualization of the idea similarity graph is presented in the figure below.

Figure 5. OSIP similarity graph
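The concept-aggregation step that labels each cluster can be sketched with a few lines of stdlib Python. The cluster below is hard-coded for illustration (in the real pipeline it would come out of Gephi's Louvain communities), and the concept values are made up.

```python
from collections import Counter

def label_cluster(idea_concepts, k=2):
    """Given one cluster as a list of per-idea concept lists, return the k
    most frequent concepts as the cluster's representative label."""
    counts = Counter(c for concepts in idea_concepts for c in concepts)
    return [concept for concept, _ in counts.most_common(k)]

cluster = [
    ["space debris", "deorbiting"],
    ["space debris", "capture mechanism"],
    ["space debris", "deorbiting", "laser"],
]
print(label_cluster(cluster))  # → ['space debris', 'deorbiting']
```

Concepts shared by most ideas in a community rise to the top, which is what makes the labels readable summaries of what the cluster is about.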

The idea clusters we obtained from OSIP are presented in table 1. The risk of space debris is a topic highly represented among the ideas submitted to the platform. Other topics, such as the detection of plastics using satellites and AI, or Mars and Moon exploration via rovers, are also very frequent. OSIP managers can tap into these clusters when planning future campaigns, to gather ideas around subjects that are not well studied or to avoid funding ideas on topics that are already well covered. More interestingly, some clusters might be seen as a forecast of the topics that could become trendy and important in the future.

About expert.ai

Expert.ai is a leading company in applying artificial intelligence to text, with human-like understanding of context and intent.

We have 300+ proven deployments of natural language solutions across insurance, financial services and media, leveraging our expert.ai Platform technology. Our platform and solutions are built with out-of-the-box knowledge models to make you ‘smarter from the start’ and get to production faster. Our hybrid AI and natural language understanding (NLU) approach accelerates the development of highly accurate, custom, and easily explainable natural language solutions.

https://www.expert.ai/
