Making the most out of our learners’ feedback with the help of NLP

Emma van Schothorst · Published in Lepaya Tech
Nov 29, 2022 · 8 min read


Lepaya helps professionals to continuously upskill and create better-performing organizations. We combine high-impact training with technology, empowering professionals to learn and grow when, where, and how it suits them best: on demand when possible, face-to-face to maximize impact.
The Lepaya app allows learners to provide feedback after each training session. As Lepaya grows rapidly, ever more learners will fill out the feedback form, and the volume of feedback a Lepaya employee has to read will eventually become unmanageable. Because trainee feedback is extremely valuable to us, we went looking for a solution.

I am Emma, and I am currently studying for my master’s degree in Robotics at TU Delft. For the past three months I have been a Data Science Intern at Lepaya. The goal of my internship was to use Machine Learning (ML) and analytics to extract meaningful insights from the post-session feedback text. Let’s explore together how ML and analytics can turn that text into insights and a better feedback form! 🚀

🧐 Problem formulation

Currently, the post-session feedback form consists of two mandatory ratings and three optional open questions. The first rating is for the training, the second for the trainer. Each rating is followed by an open question asking for the main reason behind it. The final question asks for suggestions on what should change for the next session.

Natural Language Processing (NLP) can be an effective way to process this large volume of written feedback. As a branch of Artificial Intelligence (AI), NLP is about giving computers the ability to understand and generate text and spoken words in much the same way humans do. NLP faces a number of challenges, of which I will outline a few.

Language ambiguity 🤯

Words can have different meanings depending on the context they are used in, at both the lexical and the syntactic level. Lexical ambiguity refers to ambiguity within a single word: “bank” can mean a financial institution or the edge of a river. Syntactic ambiguity arises when a sentence can be interpreted in more than one way because its structure is ambiguous: “I saw the trainer with the laptop” can mean that the trainer carried the laptop or that I used it to spot them.

Misspellings✍️

Typos are common in writing. Conveying meaning through text is a very high-level task, and while the brain focuses on it, it partly neglects the spelling task, so misspellings slip in. A model can miss important words because of such spelling errors and consequently fail to understand the text.

Different Languages 🤨

Each language has its own vocabulary, grammar rules, and cultural conventions. For a model to understand a language, it needs to master all three of these building blocks. Often, a model must be retrained before it can comprehend a new language.

Available training data 📂

NLP models require task-specific training data in order to be customized for a particular task. High-quality public data that fits a given domain, language, or task is hard to obtain, and annotating text yourself is an extremely time-consuming process.

⚙️ Approach

The first step in cleaning up the text data is to remove answers such as “none” or “nothing”, which indicate that a person did not want to answer the question. Some people refer back to their first answer by typing words such as “idem” in the second open question, about the trainer. When this occurs, the answer to the first question is copied to the second question.
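As an illustration, a minimal sketch of this cleaning step could look as follows in pandas. The column names and the word lists are assumptions for the example, not the actual pipeline:

```python
import pandas as pd

# Answers that signal "no real feedback" (illustrative list).
PLACEHOLDERS = {"none", "nothing", "no", "n/a", "-"}
# Words used to refer back to the previous answer (illustrative list).
BACKREFS = {"idem", "same", "same as above"}

def clean_feedback(df: pd.DataFrame) -> pd.DataFrame:
    """Drop placeholder answers and resolve back-references.

    Assumes columns 'training_answer' and 'trainer_answer' (hypothetical names).
    """
    df = df.copy()
    for col in ["training_answer", "trainer_answer"]:
        normalized = df[col].str.strip().str.lower()
        df.loc[normalized.isin(PLACEHOLDERS), col] = None
    # If the trainer answer just says "idem", copy the training answer over.
    backref = df["trainer_answer"].str.strip().str.lower().isin(BACKREFS)
    df.loc[backref, "trainer_answer"] = df.loc[backref, "training_answer"]
    return df
```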

The second step of the process applies a language-detection model with spaCy to keep only the English sentences.
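The exact detector configuration is not part of this post; as a minimal stand-in sketch (using the langdetect package rather than the spaCy component itself), the keep-English-only filter could look like this:

```python
from langdetect import detect, LangDetectException

def is_english(text: str) -> bool:
    """Return True if the text is detected as English."""
    try:
        return detect(text) == "en"
    except LangDetectException:
        # Empty or very short strings cannot be detected reliably.
        return False

answers = ["The trainer was great!", "De training was erg leerzaam."]
english_answers = [a for a in answers if is_english(a)]
print(english_answers)  # ['The trainer was great!']
```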

In the next step, the Named Entity Recognition (NER) task is performed using a combination of a pretrained English transformer model from spaCy (based on RoBERTa-base) and simple queries. The objective of NER is to identify and categorize entities mentioned in unstructured text, such as individuals, organizations, locations, medical codes, time expressions, quantities, and percentages. PERSON and TIME were the only useful entities we could get from the pretrained model. For the custom entities, a blank spaCy NER model was initially trained on our own annotated data, since no high-quality annotated data was publicly available; however, annotating proved very time-consuming and internship time was limited. Therefore, simple queries were written to detect TECH, CONTENT, and COMFORT, along with additional queries for PERSON and TIME.
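Combining the two sources of entities might look like the sketch below. The en_core_web_trf pipeline is spaCy’s RoBERTa-based English model; the keyword lists for the custom entities are illustrative assumptions, not the actual queries:

```python
import spacy

# spaCy's English transformer pipeline (RoBERTa-base under the hood).
nlp = spacy.load("en_core_web_trf")

def pretrained_entities(text: str) -> list[tuple[str, str]]:
    """Return (text, label) pairs for PERSON and TIME entities."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents
            if ent.label_ in {"PERSON", "TIME"}]

# Simple keyword "queries" for the custom entities (illustrative lists).
QUERIES = {
    "TECH": {"app", "zoom", "teams", "online", "camera", "connection"},
    "CONTENT": {"slides", "roleplay", "framework", "assignment", "topic"},
    "COMFORT": {"coffee", "snacks", "room", "lunch", "air"},
}

def query_entities(text: str) -> set[str]:
    """Return the custom entity labels whose keywords occur in the text."""
    tokens = {token.lower().strip(".,!?") for token in text.split()}
    return {label for label, words in QUERIES.items() if tokens & words}

print(pretrained_entities("Sarah was great, but we ran 30 minutes over."))
print(query_entities("The Zoom connection kept dropping during roleplay."))
```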

As a final step, a sentiment analysis of the feedback is performed using a pretrained RoBERTa model from Hugging Face. Sentiment analysis is an NLP technique used to determine how positive, negative, or neutral a piece of text is. Each sentence is scored with three percentages: negative, neutral, and positive.
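The post does not name the exact checkpoint, so here is a minimal sketch with a commonly used pretrained RoBERTa sentiment model from the Hugging Face Hub:

```python
from transformers import pipeline

# cardiffnlp/twitter-roberta-base-sentiment is an assumed checkpoint; the
# post only says "a pretrained RoBERTa model from Hugging Face".
classifier = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment",
    top_k=None,  # return all three scores, not just the top label
)

results = classifier(["The trainer was fantastic and the pace was perfect!"])
# For this checkpoint: LABEL_0 = negative, LABEL_1 = neutral, LABEL_2 = positive.
for entry in results[0]:
    print(entry["label"], round(entry["score"], 3))
```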

📊 Results

General 👀

Approximately 28% of the learners answered the first open question of the feedback form, about the training, and 66% of those answers were detected as English. 22% of the learners answered the second question, of which 64% were English, and 16% answered the last question, of which 53% were in English. Note that more comments may actually have been English but were detected as another language because of misspellings.

The mean length of the answers to every open question can be seen in the figure below. The error bars show that responses vary greatly in how many words learners use.

Named Entity Recognition [🙋, ⏳, 📱, 📚, 🍦]

The entity PERSON 🙋 captures names, pronouns such as [“he”, “she”, …], and titles such as [“trainer”, “coach”, “facilitator”, …]. This entity is mostly found in the question about the trainer. Nevertheless, around 12% of the answers to the first question (about the training) contain it as well! All feedback about trainers can now be accessed without having to read all the answers to every question.

Approximately 30% of the sentences in the last question (about what you would like to change in a training) contain the entity TIME ⏳. Learners suggest that certain parts of the training be extended or shortened. They also mention when they would have preferred a different date for the training (because of their own work deadlines), and they ask for travel time to be reduced when their homes are far from the training venue. Many Lepaya teams can use this information without having to read the other 70% of the feedback: trainings can be scheduled at convenient times and shortened or extended as needed.

Around 20% of the answers to the last question also mention the entity TECH 📱. This occurs when learners run into problems with the Lepaya app, or when they encounter problems with Teams, Zoom, and the like during online sessions. Learners have also used this entity to express a preference for in-person training over online training. With this information, technical issues can be detected earlier.

In the first question, approximately 77% of the feedback also contains the entity CONTENT 📚. Here you will find everything related to the content of a training: comments on its impact and level, and on frameworks, methods, tools, strategies, slides, roleplay, video, language, assignments, topics, and so on. Eventually it would be very useful to divide the entity CONTENT into subclasses corresponding to different Lepaya teams.

In the last entity, COMFORT 🍦, people express their longing for healthy snacks, warm rooms, fresh air, and warm coffee. It is mentioned only around 300 times. Nevertheless, to give learners an outstanding training experience, these details deserve the same level of attention.

Sentiment Analysis [😄,😐,😠]

The figure below shows the mean sentiment scores [negative 😠, neutral 😐, positive 😄] for every trainer and training rating [1–10].

In the first two open questions, about the training and the trainer, the sentiment scores scale exactly as expected: when learners give a 10, the mean positive score is high and the mean negative score is low, and the reverse holds as well. Neutral scores are also telling in these two questions; ratings between 4 and 6 have higher neutral scores than ratings outside those bounds. A very interesting aspect of the last question is that it does not ask for strong opinions but for suggestions on what to change, and learners’ suggestions are indeed detected as more neutral!

Plots showing the impact of entities on sentiment scores and ratings are shown below.

The sentiment score actually increases when the entity PERSON is mentioned. When learners mention trainer names in their feedback, they usually thank them or express their satisfaction. Since not everyone writes this way, the entity does not cover all comments from the second question; how to extract those comments without clear entities is a very interesting topic in itself!

When the entities TIME or TECH are mentioned, the comments are clearly more negative and less positive.

CONTENT produces slightly more positive answers, but not significantly so. This may be because CONTENT is currently a very broad entity: most of the selected answers are very positive, but there are also many negatively scored answers, so the scores average out. Positive and negative scores could be separated more effectively if CONTENT were divided into more specific classes.

The last figure shows the effect of the entity COMFORT. The model agrees with the learners that having too few snacks, sitting in an overly warm room, or drinking cold coffee should definitely be scored as more negative - we couldn’t agree more! 🍦

What’s next? 🚀

The results indicate that there are ways to adapt the feedback form. With the entities now identified, learners can be asked more specific questions about them. This will give us a better understanding of the specific changes learners would like to see in our trainings, and it could help Lepaya gain more specific insights from learner feedback: better data, better analysis, and quicker detection of outliers.

Overall, Named Entity Recognition and Sentiment Analysis are powerful tools for narrowing down qualitative feedback. If the entity CONTENT can be divided into subclasses corresponding to different Lepaya teams, it would be well worth the time to train a blank NER model on those classes and send the relevant feedback directly to the relevant teams.

At Lepaya we are continuously exploring the timing and methods of asking for feedback during programs, as well as what information should be collected for different teams. I am very grateful for the opportunity to contribute to this research as an intern, and I can’t wait to see the results in action! 🙂
