AI and Natural Language Processing and Understanding for Space Applications at ESA

Part III: Generating quizzes to support training on quality management and assurance in space science and engineering

Nov 2, 2022

By José Manuel Gómez-Pérez, Andrés García-Silva, Rosemarie Leone, Mirko Albani, Moritz Fontaine, Charles Poncet, Leopold Summerer, Alessandro Donati, Ilaria Roma, Stefano Scaglioni

This post gives a brief overview of a paper currently under review at a journal (see preprint here), in which we describe the joint work between ESA and expert.ai to bring recent advances in NLP to the space domain.

We have split the post into several parts:

Part I: A methodological framework to develop NLP-based applications for space documents
Part II: Answering questions about the design of space missions and spacecraft concepts
Part III: Generating quizzes to support training on quality management and assurance in space science and engineering
Part IV: Information extraction for Long-Term Data Preservation in space
Part V: Assisted evaluation of the innovation potential of OSIP ideas

Introduction

ESA makes a continuous effort to train its staff in quality procedures and standards. Trainees are evaluated to determine the effectiveness of the training sessions, and quizzes are one of the main tools used in such evaluations. However, quality management procedures evolve over time, and keeping the evaluation material up to date is hard work.

In this case study, we focus on SpaceQQuiz (from Space Quality Quiz), a natural language generation system designed to help trainers generate quizzes from documents describing quality procedures. Quality procedure documents cover topics like Anomaly and Problem Identification, Reporting and Resolution or Configuration Management, and include stakeholder responsibilities, activities, performance indicators and outputs, among others.

Analysis

Focused on text generation, this case study falls mainly in the category of comprehension-intensive NLP projects and complements the open-domain question answering case study. It also leverages the same annotated dataset (SQuAD). In this case, however, the system relies on pre-trained generative language models for question generation, in addition to BERT-style transformer language models, which are used here to validate the quality of the generated questions.

The SpaceQQuiz system architecture

Figure 1 shows the high-level architecture of SpaceQQuiz. A question generation model is run on each passage extracted from the document. The generated questions and the corresponding passages are fed to a question answering model that extracts the answer from the passage. Only questions with answers are included in the candidate list that is then refined by the trainer to generate the quiz.

The process starts when the trainer uploads a quality procedure document. SpaceQQuiz extracts the text from the PDF document using Apache PDFBox and uses regular expressions to identify sections, subsections and paragraphs while removing irrelevant text such as headers, footers and the table of contents. The trainer is presented with a list of candidate sections so that she can choose the most interesting ones for the quiz.
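
The post does not give the actual regular expressions, so the following is only a minimal sketch of this kind of segmentation, written in Python for brevity. The patterns HEADING and NOISE, the function split_sections and the sample text are all hypothetical; they assume numbered headings such as "3.1 Responsibilities", footers of the form "Page 2 of 9" and table-of-contents lines that end in dot leaders followed by a page number.

```python
import re

# Hypothetical patterns: the real expressions used by SpaceQQuiz are not published here.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")  # e.g. "3.1 Responsibilities"
NOISE = re.compile(r"^(?:Page \d+(?: of \d+)?|.*\.{3,}\s*\d+)$", re.IGNORECASE)


def split_sections(text):
    """Return a list of (number, title, paragraphs) tuples found in the text."""
    sections = []
    current = None
    buffer = []

    def flush():
        # Close the paragraph being accumulated, if any.
        if current is not None and buffer:
            current[2].append(" ".join(buffer))
        buffer.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if NOISE.match(line):
            continue  # footer or table-of-contents entry
        match = HEADING.match(line)
        if match:
            flush()
            current = (match.group(1), match.group(2), [])
            sections.append(current)
        elif not line:
            flush()  # a blank line ends the paragraph
        elif current is not None:
            buffer.append(line)
    flush()
    return sections


sample = """Table of Contents
1 Scope ........ 3
2 Responsibilities ........ 4
Page 1 of 9
1 Scope
This procedure applies to
all operations teams.
Page 2 of 9

2 Responsibilities
The configuration manager approves changes.
"""

sections = split_sections(sample)
for number, title, paragraphs in sections:
    print(number, title, paragraphs)
```

A real document would need patterns tuned to its own layout; for instance, this sketch would mistake a paragraph line that starts with a number for a heading.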

Figure 1. SpaceQQuiz system architecture
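
The flow in Figure 1 can be sketched as a small loop over passages. This is an illustration under stated assumptions, not the SpaceQQuiz code: Candidate, build_candidates, toy_generator and toy_answerer are hypothetical names, and the two toy functions stand in for the question generation and question answering models so that the sketch runs without downloading anything. It also assumes the question answering step returns an empty string when it finds no answer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Candidate:
    passage: str
    question: str
    answer: str


def build_candidates(
    passages: Iterable[str],
    generate_questions: Callable[[str], List[str]],
    answer_question: Callable[[str, str], str],
) -> List[Candidate]:
    """Generate questions per passage and keep only those that get an answer."""
    candidates = []
    for passage in passages:
        for question in generate_questions(passage):
            answer = (answer_question(question, passage) or "").strip()
            if answer:  # unanswered questions never reach the trainer
                candidates.append(Candidate(passage, question, answer))
    return candidates


# Toy stand-ins (assumptions) replacing the real transformer models.
def toy_generator(passage):
    return ["Who approves changes?", "What is the launch date?"]


def toy_answerer(question, passage):
    return "The configuration manager" if question.startswith("Who") else ""


passages = ["The configuration manager approves changes."]
candidates = build_candidates(passages, toy_generator, toy_answerer)
print([c.question for c in candidates])
```

Passing the models in as plain callables keeps the filtering logic independent of any particular model library.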

Models for Question Generation and Question Answering

We use state-of-the-art models based on transformers for question generation and question answering. Since we could not find specialized models for the space or quality management domains, we reused models already pre-trained on general-purpose document corpora and fine-tuned on SQuAD.

To generate the questions we use a T5 model and a BART model fine-tuned for question generation using SQuAD1.1. The models reused in this work were fine-tuned by their authors following the documentation at: https://github.com/patil-suraj/question_generation. We use two models in order to increase the number and variety of questions for each text passage.

During generation, we use beam search as the decoding method, with 5 beams. Rather than committing to the single most likely word at each time step, beam search keeps the 5 most likely partial sequences and finally chooses the sequence with the highest overall probability.

Once the questions have been generated, we use a RoBERTa language model fine-tuned for question answering on SQuAD2.0 to extract answers from the passages. If RoBERTa fails to extract an answer for a generated question, we remove that question from the candidate list presented to the trainer.
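
To make the decoding step concrete, here is a toy beam search in Python. It is a simplified sketch, not the decoder used by the T5 and BART models: it has no length normalization, and toy_model is a hypothetical hand-written next-token distribution (which must return at least one token with non-zero probability) rather than a language model.

```python
import math


def beam_search(next_token_probs, num_beams=5, max_len=10, eos="</s>"):
    """Keep the num_beams best partial sequences at every step."""
    beams = [((), 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        expansions = []
        for tokens, score in beams:
            for token, prob in next_token_probs(tokens).items():
                expansions.append((tokens + (token,), score + math.log(prob)))
        expansions.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for tokens, score in expansions[:num_beams]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    best = max(finished or beams, key=lambda item: item[1])
    return list(best[0])


def toy_model(prefix):
    # Hypothetical distribution, chosen so that the greedy choice is not the best one.
    table = {
        (): {"what": 0.6, "who": 0.4},
        ("what",): {"is": 0.5, "was": 0.5},
        ("who",): {"approves": 0.9, "is": 0.1},
    }
    return table.get(prefix, {"</s>": 1.0})


wide = beam_search(toy_model, num_beams=5)
greedy = beam_search(toy_model, num_beams=1)
print(wide, greedy)
```

With a single beam the search commits to "what" (probability 0.6) and ends with a sequence of probability 0.3, whereas with 5 beams it keeps "who" alive and finds "who approves", whose overall probability is 0.36.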

Quiz Generation

The trainer chooses which questions to include in the quiz from the list of generated questions, answers, and passages displayed by the SpaceQQuiz user interface. Finally, the system generates the quiz with two sections: one containing only the questions, to be handed to the trainee, and another reserved for the trainer, with questions, answers and passages.
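
The two-section output can be sketched as follows. The function render_quiz, the section titles and the sample data are hypothetical, and the real system's output format may differ.

```python
def render_quiz(candidates, selected):
    """Build the trainee sheet and the trainer key.

    candidates: list of (question, answer, passage) tuples.
    selected: indices of the candidates chosen by the trainer.
    """
    trainee = ["Quiz"]
    trainer = ["Answer key"]
    for number, index in enumerate(selected, start=1):
        question, answer, passage = candidates[index]
        trainee.append(f"{number}. {question}")
        trainer.append(f"{number}. {question}")
        trainer.append(f"   Answer: {answer}")
        trainer.append(f"   Passage: {passage}")
    return "\n".join(trainee), "\n".join(trainer)


candidates = [
    ("Who approves changes?", "The configuration manager",
     "The configuration manager approves changes."),
    ("What is baselined?", "The configuration items",
     "The configuration items are baselined at each review."),
]
trainee_sheet, trainer_key = render_quiz(candidates, selected=[1])
print(trainee_sheet)
print(trainer_key)
```

Only the trainer key carries answers and source passages, so the trainee sheet can be handed out as is.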

Evaluation

To evaluate SpaceQQuiz, we generate a quiz with 50 question-answer pairs from a quality procedure document titled "OPS Procedure for Configuration Management". Then, a quality management expert evaluates the generated questions using relevance and correctness as evaluation criteria.

In total, 66% of the generated questions are considered relevant and grammatically correct, and 60% of the answers are also regarded as correct by the evaluator. If we focus only on the answers to relevant and correct questions, the percentage of accurate answers improves to 81.8%.
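
These percentages can be checked against the 50 question-answer pairs with a few lines of Python. The absolute counts below (33, 30 and 27) are not stated in the post; they are inferred here as the whole numbers consistent with the reported percentages, so treat them as an assumption.

```python
total_pairs = 50

# Inferred counts (assumption): the post reports only percentages.
good_questions = 33        # relevant and grammatically correct
correct_answers = 30       # answers judged correct overall
correct_given_good = 27    # correct answers among the good questions


def share(part, whole):
    """Format part/whole as a percentage with one decimal."""
    return f"{part / whole:.1%}"


print(share(good_questions, total_pairs))
print(share(correct_answers, total_pairs))
print(share(correct_given_good, good_questions))
```

Under this reading, the three remaining correct answers belong to questions that the evaluator did not consider relevant or well formed.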

The level of accuracy for the question generation still requires keeping a human in the loop in order to guarantee the quality of the questions in the quiz. Ultimately, it is the responsibility of the domain expert to decide upon the selection of questions to be included in the quiz.

About expert.ai

Expert.ai is a leading company in applying artificial intelligence to text, with human-like understanding of context and intent.

We have 300+ proven deployments of natural language solutions across insurance, financial services and media, leveraging our expert.ai Platform technology. Our platform and solutions are built with out-of-the-box knowledge models to make you ‘smarter from the start’ and get to production faster. Our Hybrid AI and natural language understanding (NLU) approach accelerates the development of highly accurate, custom, and easily explainable natural language solutions.

https://www.expert.ai/