AI and Natural Language Processing and Understanding for Space Applications at ESA

Part I: A methodological framework to develop NLP-based applications for space

Andrés García-Silva
10 min read · Nov 2, 2022

By José Manuel Gómez-Pérez, Andrés García-Silva, Rosemarie Leone, Mirko Albani, Moritz Fontaine, Charles Poncet, Leopold Summerer, Alessandro Donati, Ilaria Roma, Stefano Scaglioni

Image source: esa.int

This post is a brief overview of a paper currently under review at a journal (see preprint here), in which we describe the joint work between ESA and expert.ai to bring recent advances in NLP to the space domain.

We have split the post into several parts:

The first part (this one) focuses on the methodology we propose to build NLP applications for space. The other four parts illustrate the application of this framework through four real-life case studies.

The collaboration between expert.ai and ESA started in 2015, in the framework of the Horizon 2020 project EVER-EST coordinated by ESA, and has continued to date in the context of several projects funded by ESA where the impact of NLP/U and Text Mining has been evaluated in different areas of the organization. In 2019, ESA’s Corporate IT team acquired expert.ai’s technology to enable the extraction of information from documents in repositories and databases across the organization. Expert.ai text analysis has been applied in different areas within ESA, including ESA’s Long Term Data Preservation strategy and the Open Space Innovation Platform.

Introduction

Handling the wealth of knowledge produced throughout the life cycle of ESA space missions is a huge endeavor. Natural language processing and understanding play an important role in leveraging all that information across different functional areas like space mission design, quality management, long-term preservation, and innovation management.

In this post we present a methodological framework to automatically extract information and enable machine understanding of space documents. We illustrate the framework through four ESA case studies where we address tasks requiring machine reading comprehension, text generation, and information extraction capabilities.

The role of AI in science

Figure 1. The role of Artificial Intelligence in Science

We share Gil’s view of a future scenario where AI systems will not only assist but also become an effective part of the scientific ecosystem: collaborating, independently pursuing substantial aspects of research and engineering work, and contributing their own discoveries to the rest of the community. Focused on space, we adopt a pragmatic mindset and limit the scope of this work to the first two of the three steps of the timeline shown in Figure 1, which represents the evolution of the possible roles AI can adopt in the scientific ecosystem.

Challenges

The shortage of annotated datasets for the space domain is probably the highest barrier to the adoption of machine learning-based NLP pipelines, and particularly of neural language models based on the pre-train and fine-tune paradigm or, alternatively, the pre-train, prompt, and predict paradigm. In addition, space is a mission-critical business, where errors can result in large economic losses and even the loss of human lives. Therefore, it is key for space AI systems in general, and NLP applications in particular, to be not only data-efficient but also explainable. As illustrated by ESA’s commitment to the fight against climate change, sustainable, energy-efficient NLP models are also of increasing importance.

Methodological framework

Figure 2. Methodological Framework for NLP-based projects in the Space Domain

While the proposed framework (see Figure 2) consists of the traditional phases of a software project, some activities in these phases are specific to NLP projects. In addition, the challenges we face in space (dataset scarcity, the need for explainability, sustainability, etc.) need to be taken into account and inform the framework itself. One of the main decisions driven by these challenges is whether a machine learning approach, a symbolic approach, or a combination of both is more appropriate for the project. Our framework places special emphasis on supporting this decision-making process.

Analysis phase. The goal of this phase is to define the use cases starting from the user requirements, map them to NLP tasks, make an inventory of the existing and needed resources to model those tasks, and finally assess the feasibility of each use case. Table 1 illustrates the outcome of this phase for several sample use cases relevant to ESA.

  • Use case definition. Describe the problem to be addressed through the application of language technologies.
  • Identify the NLP tasks required for the use case. These include information extraction tasks (e.g., entity recognition) that automatically extract pre-specified types of information from unstructured documents, producing structured metadata; comprehension skills (e.g., question answering) so that machines are able to read text, process it, and understand its meaning; and generation capabilities (e.g., question generation) to write text as humans do. Some NLP tasks span more than one category, such as abstractive summarization, which requires both comprehension and generation capabilities (see the entity recognition sketch after this list).
  • Assessing resource availability. Take inventory of the resources (data, models, software, hardware) necessary to address the NLP tasks. Particularly valuable resources include annotated datasets, like SQuAD for question answering, and pre-trained language models like BERT. GPUs are nowadays a common hardware requirement to run deep learning models. Other useful resources include general-purpose knowledge graphs, e.g., DBpedia and Wikidata, lexico-semantic databases like WordNet, and domain-specific document corpora, databases, taxonomies, and thesauri. Examples of resources for the space domain include the Nebula library, the ESA technology tree, the NASA Scope and Subject taxonomy, and the NASA Technical Reports Server.
  • Determining the need for additional resources. Resources for NLP in space are scarce, so it is often necessary to create them. Among such resources, we highlight: i) labeled datasets used to fine-tune a machine learning model, usually a pre-trained language model, for a particular language task, and ii) domain-specific terminologies extracted from a document corpus, which can extend the domain coverage of a general-purpose knowledge graph if we opt for a knowledge-based approach.
  • Use case feasibility. Once we know the NLP tasks involved, the resources we have, the resources we need, and the project budget and schedule, we can assess the feasibility of the use case. Unfeasible use cases are discarded or reformulated; feasible ones move on to the next phase of our framework.
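
To make the mapping from use case to NLP task concrete, below is a minimal sketch of off-the-shelf entity recognition using the Hugging Face transformers library. The checkpoint dslim/bert-base-NER is a public model we picked for illustration, not one of the models used in the ESA case studies.

```python
# A minimal sketch of off-the-shelf named entity recognition.
# The checkpoint below is a public example; any comparable NER
# model could be substituted.
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")

text = ("The Rosetta spacecraft, launched on an Ariane 5 from Kourou, "
        "reached comet 67P/Churyumov-Gerasimenko in 2014.")

# Each prediction carries the surface form, the entity type,
# and a confidence score.
for entity in ner(text):
    print(f"{entity['word']:35} {entity['entity_group']:6} "
          f"score={entity['score']:.2f}")
```

Note that a generic model like this one recognizes people, organizations, and locations, but not space-specific types such as instruments or launchers, which is exactly the gap the "additional resources" activity above is meant to close.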

Design and Development phase. The goal is to specify the software components and their arrangement in a general architecture, as well as the development of such components.

  • Design and Architecture. Following a layered architecture pattern, we define the most common components of each layer.
    - Data storage layer. Stores text, metadata, and dense representations for efficient retrieval and similarity comparison, typically in an inverted index, an embedding index, or another document-oriented database.
    - Data access layer. Includes components to extract data and text from external sources, and components to manage and query the data in the storage layer. Examples of the latter are search APIs such as Elasticsearch, or document retrievers like ColBERT. This layer can also support data preprocessing, including cleaning and formatting text and generating dense representations of documents using language models (see the retrieval sketch after this list).
    - Domain logic layer. Software components, including NLP components, are orchestrated according to the logic of the functionalities required for the use case.
    - Presentation layer. User interface components required by the use case. In integration projects, this layer might instead contain components exposing the functionalities as web services or APIs. We strongly suggest using the user interface to gather feedback, both to improve the NLP tools and models and as an additional source of supervision for subsequent model training following an active learning approach.
  • Provisioning of resources. Data, corpora, annotated datasets, software, and hardware resources need to be made available for the development activity. We suggest investing at least in a test set to evaluate the performance of the resulting models on the target NLP tasks. In addition, whenever a machine learning-based approach is adopted, we advocate for the generation of domain-specific annotated datasets for training.
  • Resource evaluation. Training NLP models on poorly annotated datasets leads to underperforming models. Fleiss’ kappa and Cohen’s kappa are useful metrics to evaluate inter-annotator agreement in annotated datasets (see the agreement sketch after this list). The metrics used to evaluate models and tools vary depending on the NLP task; examples include precision, recall, F-score, ROUGE, and BLEU.
  • Development and testing. With the specification produced in the design activity, the architecture, and the selected resources, software development can start. Testing is crucial to guarantee the quality of the end product.
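
As a concrete illustration of the data storage and access layers, here is a minimal retrieval sketch over an embedding index. The sentence encoder all-MiniLM-L6-v2 and the toy documents are assumptions made for the example; a production deployment would use Elasticsearch or a dedicated vector store rather than an in-memory NumPy array.

```python
# A minimal sketch of dense retrieval: documents are embedded with a
# pre-trained sentence encoder and queried by cosine similarity.
# Model name and documents are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Mission operations report for the Sentinel-1 radar satellite.",
    "Thermal vacuum test procedure for spacecraft instruments.",
    "Long-term preservation plan for Earth observation data.",
]

# Data storage layer: dense representations kept in an embedding index
# (here, a plain NumPy matrix with unit-normalized rows).
doc_embeddings = encoder.encode(documents, normalize_embeddings=True)

# Data access layer: retrieve the document most similar to a query.
query = encoder.encode(["How are EO archives preserved?"],
                       normalize_embeddings=True)
scores = doc_embeddings @ query.T  # cosine similarity, since normalized
print(documents[int(np.argmax(scores))])
```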
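And for resource evaluation, a toy example of measuring inter-annotator agreement with Cohen’s kappa using scikit-learn; the two label sequences are invented for the illustration.

```python
# A toy illustration of inter-annotator agreement with Cohen's kappa.
# The two annotations of the same six tokens are made up for the example.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["INSTRUMENT", "O", "LAUNCHER", "O", "INSTRUMENT", "O"]
annotator_b = ["INSTRUMENT", "O", "LAUNCHER", "INSTRUMENT", "INSTRUMENT", "O"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, 0 = chance
```

A low kappa signals that the annotation guidelines are ambiguous and should be revised before any model is trained on the data.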

Operations. The final phase is to deploy the software and monitor its performance, actively collecting feedback while providing service to the users. DevOps, and its adaptation to machine learning, MLOps, is suggested to enable continuous integration and delivery of the NLP models.

  • Deployment. Making the software available to end users. Depending on the requirements, the software can be deployed on-premises, in the cloud, or on the infrastructure of the technology provider.
  • Monitoring. It is important to understand whether the software is meeting user expectations and use case requirements. Moreover, if feedback mechanisms were included in the user interface, the collected data can be automatically leveraged to re-train or fine-tune the NLP models, following an active learning approach.

Pick your poison: Data hunger vs. the knowledge bottleneck

Many NLP tasks can be solved equally well by a machine learning-based approach or a symbolic approach. According to our framework, the decision needs to be informed by different criteria. First, we look for existing tools or models in the state of the art for each NLP task. If more than one tool or model exists, some evaluation needs to be carried out to choose between the options. This is often the case for the extraction of common key information like keywords, phrases, and entities.

Nevertheless, when information extraction targets ad-hoc information, e.g., specific entity types corresponding to spacecraft instruments or satellite launchers, it is unlikely that tools or datasets are available for reuse. In this case, it is advisable either to extend a pre-existing knowledge base, e.g., a knowledge graph, with domain terminology and write rules following the symbolic approach, or to annotate domain-specific text with labels corresponding to the types of information we need to extract and then train a machine learning model to do the extraction.
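
As an illustration of the symbolic route, the sketch below extends a blank spaCy pipeline with domain terminology and simple token patterns via the EntityRuler. The instrument and launcher terms are illustrative assumptions, not the actual terminologies developed in these projects.

```python
# A minimal sketch of rule-based extraction of ad-hoc entity types
# (instruments, launchers) with spaCy's EntityRuler. The terms and
# patterns below are made up for illustration.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    # Exact phrase from a domain terminology / gazetteer.
    {"label": "INSTRUMENT", "pattern": "OSIRIS"},
    # Token-level patterns generalize beyond a fixed list.
    {"label": "INSTRUMENT", "pattern": [{"LOWER": "radar"}, {"LOWER": "altimeter"}]},
    {"label": "LAUNCHER", "pattern": [{"LOWER": "ariane"}, {"LIKE_NUM": True}]},
])

doc = nlp("The probe carried the OSIRIS camera and a radar altimeter, "
          "and lifted off on an Ariane 5.")
for ent in doc.ents:
    print(ent.text, ent.label_)
```

The appeal of this approach is that every extraction can be traced back to a specific rule, which directly supports the explainability requirement discussed above.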

When the project requires other capabilities, e.g., comprehension tasks such as closed-book and open-domain question answering, or text generation, the knowledge-based approach might be less appealing due to the potentially large size and complexity of the rule base and knowledge representation required to address the task. In this case, machine learning-based or hybrid approaches tend to be more promising. Machine learning models, and more specifically neural networks pre-trained for language modeling, have shown good results on such tasks.
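
For example, extractive question answering can be prototyped in a few lines with a language model fine-tuned on SQuAD. The checkpoint distilbert-base-cased-distilled-squad is a public example, not the model used in our case studies:

```python
# A minimal sketch of machine reading comprehension with a public
# model fine-tuned on SQuAD.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("Rosetta was launched in March 2004 and arrived at comet "
           "67P/Churyumov-Gerasimenko in August 2014.")
answer = qa(question="When was Rosetta launched?", context=context)
print(answer["answer"], f"(score={answer['score']:.2f})")
```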

Annotated Datasets. Even though we can nowadays find generalist resources for comprehension tasks like machine reading comprehension (SQuAD, Natural Questions, TriviaQA) or summarization (CNN/Daily Mail, Gigaword, XSum), such datasets might not be optimal for space. The mismatch between the datasets and the use case may lie not only in the domain terminology, but also in the types of questions and expected answers.
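
A quick way to spot such a mismatch is to sample a generalist dataset and compare its questions with those users actually ask in the target domain. A minimal sketch with the Hugging Face datasets library:

```python
# Sample a few SQuAD questions to compare against the questions
# that domain users actually ask.
from datasets import load_dataset

squad = load_dataset("squad", split="validation")
for example in squad.select(range(3)):
    print(example["question"])
```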

Fixing errors. When machine learning models produce erroneous predictions, the options for fixing them are usually limited to re-training or fine-tuning the model in a new setup, i.e., with new hyperparameters, a different loss function, or a slight modification of the architecture, or to providing more annotated data in the hope that the error disappears once the model is re-trained. However, neither training in a new setup nor using more annotated data guarantees that erroneous predictions will be fixed. In such cases, a possible solution is to apply post-processing rules to the model output to fix recurrent errors.
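
As a sketch, suppose an NER model systematically tags launcher names as organizations; a deterministic post-processing rule backed by a small domain gazetteer can relabel them. The prediction format and the gazetteer below are illustrative assumptions:

```python
# A sketch of post-processing rules that fix a recurrent model error:
# launcher names mistakenly tagged as ORG are relabeled using a small
# domain gazetteer. Prediction format and gazetteer are made up.
KNOWN_LAUNCHERS = {"ariane 5", "vega", "soyuz"}

def postprocess(predictions):
    """Relabel entities that the domain gazetteer identifies as launchers."""
    for pred in predictions:
        if pred["label"] == "ORG" and pred["text"].lower() in KNOWN_LAUNCHERS:
            pred["label"] = "LAUNCHER"
    return predictions

preds = [{"text": "Ariane 5", "label": "ORG"}, {"text": "ESA", "label": "ORG"}]
print(postprocess(preds))  # Ariane 5 is now a LAUNCHER
```

Because such rules are deterministic and auditable, they also help meet the explainability requirement of mission-critical systems.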

Use cases

In the following parts of this post we illustrate the application of our methodological framework to four case studies in space. We show how we addressed the language processing needs that such case studies entail and the decisions we made to successfully accomplish them. Table 2 characterizes the different case studies in terms of a selection of the key aspects discussed in this post that are particularly relevant for them.

About expert.ai

Expert.ai is a leading company in applying artificial intelligence to text with human-like understanding of context and intent.

We have 300+ proven deployments of natural language solutions across insurance, financial services, and media, leveraging our expert.ai Platform technology. Our platform and solutions are built with out-of-the-box knowledge models to make you ‘smarter from the start’ and get to production faster. Our hybrid AI and natural language understanding (NLU) approach accelerates the development of highly accurate, custom, and easily explainable natural language solutions.

https://www.expert.ai/
