Ontology-based Meeting Intents Detection for Automatic Email Answering
The work presented here aims to improve Linagora’s OpenPaaS, an open source collaborative platform for businesses that includes, among other things, a mailbox, a corporate social network and a shared calendar. This study focuses on automatically answering appointment emails written in French.
Companies have many productivity tools that offer a variety of options for remote communication. These tools make it easy to manage several projects involving different partners at the same time. Among them, email remains a tool of choice, used by more than half of the world’s population according to the Radicati Group 2017 report. The same report estimates that an average employee receives 88 emails a day and sends 34. Email reportedly accounts for between 5 and 10 hours of an employee’s time over a month.
However, excessive use of email can seriously harm productivity by causing information overload: employees receive too many emails to manage them effectively. Moreover, the Radicati Group studies reveal that 43% of French employees say they are interrupted at least once every ten minutes and 31% admit to being distracted in their work.
An email processing assistant
Given this email management problem, our goal is to propose an email assistant that helps users by automatically prioritizing emails, sending notifications when urgent emails arrive and generating answers. This article focuses on the last of these objectives.
Some tools for automatic email answering already exist, such as Google’s Gmail Smart Reply or Outlook’s Message Management Assistant. However, these tools only provide short answers that are not always suitable for professional use. Moreover, they are not open source. Our goal is to propose an approach and a tool that provide answers as complete as possible, so that the recipient does not have to request further information in a new email.
Our tool analyzes incoming emails, decides which ones require a response and then retrieves the relevant items needed to generate an adequate answer. This response is then submitted to the user for validation or modification.
This task has two aspects: email classification, to find the emails that require an answer, and answer generation according to the detected user intent (i.e. the action the writer wants to perform or the purpose they want to achieve by writing a sentence).
For instance, an email that only contains advertising does not require an answer. However, an email that contains a request for information about an appointment requires one. More specifically, it requires an answer that contains information about an appointment.
Ontology-based intent detection
Our approach is based on an ontology. An ontology describes and represents knowledge about a particular domain. It is a semantic graph in which nodes represent concepts (each concept can have one or more properties) and edges represent the relationships between them. There are two kinds of relationships: hierarchical relations, which model subsumption links between concepts (e.g. the cat concept is subsumed by the animal concept), and non-hierarchical links (e.g. zebra can eat leaf). To formalize the ontology, we used OWL (Web Ontology Language), proposed by the World Wide Web Consortium (W3C) and widely used in the Semantic Web domain. We built the ontology with the Protégé tool, which supports OWL, to describe meeting emails from a large corpus of business emails.
This ontology uses the FrameNet formalization, which decomposes semantic frames into frame elements (FEs) that are themselves evoked by the presence of words called lexical units (LUs). A frame is validated only if its frame elements are present in a sentence. For example, when an appointment is proposed by email, the semantic frame “appointment proposal” can be detected in the sentence “Je vous propose de faire une réunion jeudi prochain” (I suggest we hold a meeting next Thursday) thanks to the lexical units “propose” and “réunion” and the frame elements SPEAKER (i.e. “Je”), PROPOSITION (i.e. “propose”) and TIME (i.e. “jeudi prochain”).
Each concept of the ontology represents a different meeting intent, together with the frame elements and lexical units that make it possible to detect it. The ontology contains 18 intent concepts grouped into 3 main concepts representing types of intents: request, proposal and notification.
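The frame check described above can be sketched in plain Java. This is only an illustration under simplifying assumptions, not the OpenPaaS implementation: the `SemanticFrame` class, its fields and the hard-coded cue lists are hypothetical, every listed lexical unit is required, and frame elements are found by naive substring lookup rather than by an NLP pipeline.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical, simplified model of a FrameNet-style frame. */
class SemanticFrame {
    final String name;
    final List<String> lexicalUnits;                // words that evoke the frame
    final Map<String, List<String>> frameElements;  // FE name -> surface cues

    SemanticFrame(String name, List<String> lexicalUnits,
                  Map<String, List<String>> frameElements) {
        this.name = name;
        this.lexicalUnits = lexicalUnits;
        this.frameElements = frameElements;
    }

    /** True if at least one cue occurs in the (lower-cased) sentence. */
    private static boolean containsAny(String sentence, List<String> cues) {
        for (String cue : cues) {
            if (sentence.contains(cue)) return true;
        }
        return false;
    }

    /** Assumption: the frame is validated only if every LU and every FE is found. */
    boolean matches(String sentence) {
        String s = sentence.toLowerCase(Locale.FRENCH);
        for (String lu : lexicalUnits) {
            if (!s.contains(lu)) return false;
        }
        for (List<String> cues : frameElements.values()) {
            if (!containsAny(s, cues)) return false;
        }
        return true;
    }

    /** Example frame built from the sentence discussed in the article. */
    static SemanticFrame appointmentProposal() {
        Map<String, List<String>> fes = new LinkedHashMap<>();
        fes.put("SPEAKER", List.of("je ", "nous "));
        fes.put("PROPOSITION", List.of("propose"));
        fes.put("TIME", List.of("jeudi prochain", "demain"));
        return new SemanticFrame("appointment proposal",
                List.of("propose", "réunion"), fes);
    }
}
```

With this sketch, `SemanticFrame.appointmentProposal().matches("Je vous propose de faire une réunion jeudi prochain")` returns `true`, while a sentence without the lexical units, such as “Merci pour votre message”, is rejected.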
The tool: how to detect intents in emails?
The smart reply feature is developed in Java. For each email, an annotation module annotates the email text with the LUs and FEs of the ontology concepts: it retrieves the ontology annotations and projects them onto each sentence of the email text to decide whether the sentence carries an appointment intent. The module outputs the three most likely intents for each sentence. It relies only on open source libraries: CoreNLP (for tokenization, POS tagging and named entity detection), Facebook’s Duckling (to detect named entities that contain numbers), the Snowball stemmer (to make matching between LUs and words of the text easier) and Apache Jena (to read the OWL format).
The figure below illustrates the order in which these libraries are used to project the ontology’s elements onto the text and to detect intents.
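To make the “three most likely intents per sentence” step more concrete, here is a minimal sketch. It rests on assumptions that are not taken from the actual system: the score is simply the share of an intent’s lexical units found in the sentence, and the crude suffix-stripping `stem` method stands in for the Snowball stemmer. The `IntentRanker` class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class IntentRanker {
    /** Crude stand-in for a real stemmer: lower-case and drop a few endings. */
    static String stem(String word) {
        String w = word.toLowerCase(Locale.FRENCH);
        for (String suffix : new String[] {"ons", "ez", "er", "es", "e", "s"}) {
            if (w.length() > suffix.length() + 3 && w.endsWith(suffix)) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    /** Assumed score: share of the intent's lexical units present in the sentence. */
    static double score(List<String> lexicalUnits, String sentence) {
        List<String> stems = new ArrayList<>();
        for (String token : sentence.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) stems.add(stem(token));
        }
        int hits = 0;
        for (String lu : lexicalUnits) {
            if (stems.contains(stem(lu))) hits++;
        }
        return lexicalUnits.isEmpty() ? 0.0 : (double) hits / lexicalUnits.size();
    }

    /** Returns up to three intent names with a non-zero score, best first. */
    static List<String> topThree(Map<String, List<String>> intents, String sentence) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : intents.entrySet()) {
            double s = score(e.getValue(), sentence);
            if (s > 0) scores.put(e.getKey(), s);
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort(Comparator.comparingDouble((String k) -> scores.get(k)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(3, ranked.size())));
    }
}
```

Stemming both the lexical units and the tokens lets “proposer” in the lexicon match “propose” in the text, which is the role the article assigns to the stemmer. A real French stemmer handles many more cases than this toy version.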
How to generate answers for emails?
Each intent concept is associated with several answer patterns, which are specified in a separate concept of the ontology. For example, the appointment reminder concept is associated with two answer canvases: the appointment change request and the appointment confirmation. Thus, when the user receives an email containing the sentence “Je vous rappelle que l’entretien est prévu mardi à 15h” (I remind you that the interview is scheduled for Tuesday at 3 pm), they can confirm that they will be present or ask to postpone the appointment if the date does not suit them.
The generated answers adapt to the content of the email thanks to the canvas syntax, which makes it possible to retrieve relevant elements from the text. For instance, if the incoming email contains the term “réunion” instead of “rendez-vous”, the answer proposed to the user must use the same term. The figure below shows the answering canvas syntax for an appointment change request.
When the user selects “Demander déplacement d’un rendez-vous” (Request to reschedule an appointment) to answer the preceding appointment reminder, a pre-filled email is opened containing the sentences “Je ne serai malheureusement pas disponible pour ce créneau. Pouvons-nous trouver un autre créneau pour l’entretien ?” (Unfortunately, I will not be available for this slot. Can we find another slot for the interview?). The tag between braces retrieves the lexical unit that revealed the appointment intent in the sentence (here “entretien”) and adapts its determiner to agree with its gender.
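The substitution performed by the tag between braces can be illustrated as follows. The real canvas syntax is not reproduced in this text, so the `{DET_LU}` placeholder, the `AnswerCanvas` class and the tiny gender lexicon are all hypothetical. The sketch only shows the idea of reusing the sender’s term and choosing a matching French definite article.

```java
import java.util.Map;

class AnswerCanvas {
    /** Hypothetical placeholder syntax; the actual canvas syntax may differ. */
    static final String TEMPLATE =
        "Je ne serai malheureusement pas disponible pour ce créneau. "
        + "Pouvons-nous trouver un autre créneau pour {DET_LU} ?";

    /** Tiny illustrative lexicon: noun -> grammatical gender ('m' or 'f'). */
    static final Map<String, Character> GENDER = Map.of(
        "entretien", 'm', "rendez-vous", 'm', "réunion", 'f');

    /** Definite article agreeing with the noun, elided before a vowel. */
    static String withDeterminer(String noun) {
        char first = Character.toLowerCase(noun.charAt(0));
        if ("aeiouyéèê".indexOf(first) >= 0) return "l'" + noun;
        char gender = GENDER.getOrDefault(noun, 'm'); // assumption: default to masculine
        return (gender == 'f' ? "la " : "le ") + noun;
    }

    /** Fills the template with the lexical unit found in the incoming email. */
    static String fill(String lexicalUnit) {
        return TEMPLATE.replace("{DET_LU}", withDeterminer(lexicalUnit));
    }
}
```

Calling `AnswerCanvas.fill("entretien")` yields a sentence ending in “pour l'entretien ?”, while `fill("réunion")` ends in “pour la réunion ?”, so the answer keeps the term used by the sender.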
A greeting is automatically added at the beginning of each response. It is adapted to the time at which the new email is written and to the recipient. For example, if the user uses one of our answers between 4 pm and midnight, the answer will begin with “Bonsoir” (Good evening) instead of “Bonjour” (Good morning / Good afternoon).
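The time rule can be written in a few lines. This is a sketch rather than the actual code: the `Greeting` class name is hypothetical, and treating exactly 4 pm as evening is an assumption, since the article does not say which side the boundary falls on.

```java
import java.time.LocalTime;

class Greeting {
    /** "Bonsoir" from 16:00 until midnight, "Bonjour" the rest of the day. */
    static String forTime(LocalTime now) {
        // LocalTime never exceeds 23:59:59.999..., so "not before 16:00"
        // covers the whole 4 pm to midnight window.
        return now.isBefore(LocalTime.of(16, 0)) ? "Bonjour" : "Bonsoir";
    }
}
```

For instance, `Greeting.forTime(LocalTime.of(19, 0))` gives “Bonsoir” and `Greeting.forTime(LocalTime.of(10, 0))` gives “Bonjour”.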
To adapt the answer to the recipient, the system looks at the recipient’s email address, which often contains the name of the company the person works for. If the recipient works in the user’s company, the system simply inserts the greeting and the person’s first name (1).
If not, it checks a list of first names to determine whether the recipient is a man or a woman. Depending on the result, it inserts “Monsieur + NAME” or “Madame + NAME” (2). If the first name is ambiguous or absent from the list, the system keeps “FIRST NAME + NAME” (3).
Here is an illustration of each case, assuming that the user works at Linagora:
(1) Writing at 7 pm for email@example.com
→ “Bonsoir Benjamin”
(2) Writing at 10 am for firstname.lastname@example.org
→ “Bonjour Madame Dupont”
(3) Writing at 11 pm for email@example.com
→ “Bonsoir France Dupont”
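The three cases above can be sketched as a single method. Everything here is illustrative: the `Salutation` class, the three-entry first-name list and the addresses used in the usage note are hypothetical. The sketch also assumes addresses shaped like firstname.lastname@domain, which real mailboxes do not always follow.

```java
import java.util.Map;

class Salutation {
    /** Hypothetical first-name list: 'm', 'f', or 'x' for ambiguous names. */
    static final Map<String, Character> FIRST_NAMES =
        Map.of("benjamin", 'm', "marie", 'f', "france", 'x');

    static String capitalize(String s) {
        if (s.isEmpty()) return s;
        return s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase();
    }

    /**
     * Builds the opening line. Assumes the address looks like
     * firstname.lastname@domain and that the domain identifies the company.
     */
    static String build(String greeting, String recipient, String userDomain) {
        String[] parts = recipient.split("@", 2);
        String[] names = parts[0].split("\\.", 2);
        String first = capitalize(names[0]);
        String last = names.length > 1 ? capitalize(names[1]) : "";
        if (parts.length > 1 && parts[1].equalsIgnoreCase(userDomain)) {
            return greeting + " " + first;                         // case (1)
        }
        // Names missing from the list are treated like ambiguous ones.
        char gender = FIRST_NAMES.getOrDefault(names[0].toLowerCase(), 'x');
        if (gender == 'm') return greeting + " Monsieur " + last;  // case (2)
        if (gender == 'f') return greeting + " Madame " + last;    // case (2)
        return (greeting + " " + first + " " + last).trim();       // case (3)
    }
}
```

With made-up addresses, `Salutation.build("Bonsoir", "benjamin.martin@linagora.com", "linagora.com")` returns “Bonsoir Benjamin”, `Salutation.build("Bonjour", "marie.dupont@example.org", "linagora.com")` returns “Bonjour Madame Dupont”, and `Salutation.build("Bonsoir", "france.dupont@example.org", "linagora.com")` returns “Bonsoir France Dupont”.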
Integrating the smart reply feature into OpenPaaS
The two modules described above are integrated into the OpenPaaS UnifiedInbox module. The overall system has four main modules:
- An ontology manager that parses the ontology with the Apache Jena API to exploit the annotations it contains.
- The intent detector that extracts the intents from the text by computing a relevance score for each intent concept of the ontology.
- The answering generator that generates answers for each detected intent.
- A Web Service that returns the first three answers in JSON format.
The OpenPaaS UnifiedInbox module communicates with our answer suggestion service through a Web Service called via a REST API. For each email, the Inbox module sends a JSON request to the Web Service with the text of the email and the associated metadata. The system analyzes the request, detects intents in the text and returns a JSON response with the suggested answers.
A request is generally processed in less than 150 ms. By selecting the appropriate button, the user gets a pre-written response that can be modified, completed or simply sent. Sending must always be validated by the user, who can change their mind and choose another answer if the first one is not suitable.
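As a rough idea of what such an exchange might look like on the client side, the sketch below builds a JSON body and a POST request with the standard `java.net.http` API (Java 11+). The JSON field names (`from`, `subject`, `text`), the `SuggestionRequest` class and any endpoint URL passed to it are purely hypothetical, since the article does not document the actual request format.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class SuggestionRequest {
    /** Escapes the characters that must not appear raw in a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) out.append(String.format("\\u%04x", (int) c));
                    else out.append(c);
            }
        }
        return out.toString();
    }

    /** Hypothetical payload: the field names are illustrative only. */
    static String body(String sender, String subject, String text) {
        return "{\"from\":\"" + jsonEscape(sender) + "\","
             + "\"subject\":\"" + jsonEscape(subject) + "\","
             + "\"text\":\"" + jsonEscape(text) + "\"}";
    }

    /** Builds (but does not send) a POST request to a placeholder endpoint. */
    static HttpRequest build(String endpoint, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

In practice a JSON library would replace the hand-written escaping. It is written out here only to keep the example free of external dependencies.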
Evaluating the system
We evaluated our system according to two criteria: its performance in intent extraction and the relevance of the answers it proposes.
We first compared the intent detection performance of our system against machine learning models trained on the corpus we used to construct the ontology. The different systems were tested on another corpus of corporate emails containing (or not) various types of meeting intents. Our system achieves far better results than the machine learning based models, with an F-score of 69% against 24%.
We then compared the relevance of the answers proposed by our system and by Google’s Smart Reply for 15 emails, through a questionnaire distributed to 26 Linagora employees. Our answers were preferred for 8 emails out of 15, with an agreement of more than 75% for 5 of them, against 7 emails out of 15 for Google’s system, with an agreement of more than 75% for only 2 of them. The answers proposed by our system were selected by at least one annotator in all cases.
These experiments show that our approach provides relevant and acceptable answers that can fully compete with those proposed by Google’s system. Moreover, our answers seem to be preferred in cases where the answer needs to mention specific elements such as availability or details about an appointment.
And what’s next?
We are planning to link the answering mechanism to the OpenPaaS calendar to automatically check availability and generate answers that depend on the user’s constraints. We also plan to detect and respond to other intents, such as administrative workflows (for instance, vacation requests or mission orders).
Stay in touch with the Linagora Labs team: we will keep you posted on our progress…