Can Machine Learning Really Estimate How Well You Know a Foreign Language?

Alexei Falco
Celadonsoft
Sep 11, 2019

We all know that Machine Learning is great at processing big data sets. Based on this data, ML can build accurate forecasts and predict certain events or behavior. But what about assessing one’s skills in real time?

One of the biggest motivational factors in any learning process is the ability to track progress. For a swimmer, that would be a decrease in the time needed to cover a certain distance. In programming, you can track progress by the skills you acquire.

Observation of the progress is an incredibly powerful stimulus to continue learning, set new goals, and move forward. Things are a bit different though with foreign languages.

For new learners, it’s hard to accurately estimate their level of knowledge and their progress. Things get even more complicated if a person learns a new language only for a specific goal, e.g. to communicate with people at conferences.

Because the language learning process is so individual and does not have clear milestones, students may start skipping lessons or even give up altogether. It’s relatively easy to track the number of new words that you learn via a mobile app, but how does one measure knowledge of grammar or oral skills?

Skills fragmentation

All of us have learned subjects at school that we don’t use at all in our everyday life. It’s safe to assume that you don’t remember much from these lessons.

The situation with language learning is somewhat similar. As we mentioned above, many students learn a new language only to solve a particular problem. But different problems require different skills, and each of these skills has its own priority.

Also, language is not as precise as math. Learning a few tenses will not make you an expert in grammar, but you cannot say that you know nothing either. In short, a linear method of skill evaluation does not apply to language learning.

Pearson conducted a massive study as part of its Global Scale of English research. The company asked more than 6000 teachers to help identify the basic English language knowledge and skills and rank them by how difficult they are to learn. This study served as the basis for the ML model developed by Skyeng.

In general, we can identify the following components of language learning: grammar, listening comprehension, oral skills, reading, and writing.

Each of these components incorporates a variety of skills that are needed to master the component in full. And in order to fully comprehend the language, a student should not only know the “abstract” things (e.g. “have a good knowledge of tenses”) but also be able to apply the skills in real life, like asking a doctor for a prescription or understanding spoken language.

The Skyeng ML model contains over 1,200 such “useful” skills. The model selects a unique set of skills for every student depending on their goal and adjusts the learning process accordingly.

This sounds awesome, but imagine how much time and how many resources one would need to assess a set of over 100 skills in real time. This is where Machine Learning steps in.
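To make the idea of a goal-specific skill set concrete, here is a minimal sketch in Python. It is not Skyeng’s implementation: the `Skill` record, the goal tags, and the four catalogue entries are all invented for illustration, and the selection rule (a plain tag match) is only an assumption about how such a filter might work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    component: str    # e.g. "grammar", "listening", "oral", "reading", "writing"
    goals: frozenset  # hypothetical goal tags that this skill serves


# A tiny invented catalogue; the real model is said to hold over 1,200 skills.
CATALOGUE = [
    Skill("introduce_yourself", "oral", frozenset({"conference", "travel"})),
    Skill("follow_a_talk", "listening", frozenset({"conference"})),
    Skill("book_hotel_room", "oral", frozenset({"travel"})),
    Skill("write_formal_email", "writing", frozenset({"conference", "work"})),
]


def skills_for_goal(catalogue, goal):
    """Return the skills tagged with the given goal, in catalogue order."""
    return [skill for skill in catalogue if goal in skill.goals]


if __name__ == "__main__":
    for skill in skills_for_goal(CATALOGUE, "conference"):
        print(skill.component, skill.name)
```

A student who only needs the language for conferences would get three of the four skills here; the hotel-booking skill is left out because it carries no "conference" tag.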

The first thing that ML does is establish the dependencies between the skills. The dependencies may be:

  • Grammar: if a student knows how to ask questions in the past tense, they will ask them correctly in the present;
  • Cognitive: if a student can say something, they can most likely understand it when hearing someone else say it;
  • Specific: when one skill is related to another (e.g. if you can call for a taxi, you can book a hotel room).

By going through a set of skills iteratively and assessing them with the help of related skills, the model offers a comprehensive view of one’s knowledge. So assessing only a fragment of a student’s skills is enough to get an approximate picture of the whole skill set.
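The dependency idea can be sketched as score propagation over a small graph. The example below is an illustration only and not the actual Skyeng algorithm: the skill names, the edge strengths, the `propagate` function, and the rule "inferred score = source score × edge strength" are all assumptions made for the sake of the sketch.

```python
from collections import deque

# Hypothetical dependency edges: (source skill, implied skill, strength in [0, 1]).
# Skill names and strengths are invented for illustration.
DEPENDENCIES = [
    ("ask_questions_past", "ask_questions_present", 0.9),  # grammar
    ("say_greeting", "understand_greeting", 0.8),          # cognitive
    ("call_taxi", "book_hotel_room", 0.6),                 # specific
    ("book_hotel_room", "ask_for_directions", 0.5),        # specific
]


def propagate(observed, dependencies, min_confidence=0.2):
    """Spread directly observed skill scores along dependency edges.

    `observed` maps a skill name to a score in [0, 1]. An inferred score is
    the source score multiplied by the edge strength. Each skill keeps the
    highest estimate seen so far, and observed scores are never overwritten.
    """
    graph = {}
    for source, target, strength in dependencies:
        graph.setdefault(source, []).append((target, strength))

    estimates = dict(observed)
    queue = deque(observed)
    while queue:
        skill = queue.popleft()
        for target, strength in graph.get(skill, []):
            if target in observed:
                continue  # direct evidence beats inference
            inferred = estimates[skill] * strength
            if inferred >= min_confidence and inferred > estimates.get(target, 0.0):
                estimates[target] = inferred
                queue.append(target)
    return estimates


if __name__ == "__main__":
    scores = propagate({"ask_questions_past": 1.0, "call_taxi": 1.0}, DEPENDENCIES)
    for name, score in sorted(scores.items()):
        print(f"{name}: {score:.2f}")
```

With only two skills assessed directly, the sketch produces estimates for three more, which mirrors the claim that a fragment of the skill set gives an approximate picture of the rest. Confidence fades with every hop, so distant inferences count for less than direct observations.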

The model also has many points of interaction with the students. This means that when a student performs an action on the platform, they immediately get feedback on their progress. In this way, the model collects all the information, connects it with the skills from the database, and thus evaluates the student’s progress.

The only flaw of the model is the difficulty of assessing grammar and writing skills in a fully automatic mode. For example, if a non-native speaker mispronounces a word, the model may fail to flag the mistake and instead confuse the word with a similar-sounding one. It may take months or even years for the model to learn, so for now it works in a semi-automatic mode, meaning that teachers still give grades. The model’s algorithm then connects the grades with similar skills.

More insights into the educational process

The developers behind the model understand that the primary goal of every new student is to learn, so they paid special attention to making sure the model chooses the most suitable course and teacher. With the help of the skills assessment model, teachers can also suggest a matching set of skills to learn.

Even the first lesson can already provide useful information on the student’s knowledge by assessing the gap between the existing and the desired skills.
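The gap between existing and desired skills can be pictured as a simple comparison of two score tables. The following sketch is an assumption about how such a gap might be computed, not a description of the real model; the `skill_gap` function, the skill names, and the 0 to 1 score scale are hypothetical.

```python
def skill_gap(current, target):
    """Return (skill, gap) pairs for skills below their target, largest gap first.

    `current` and `target` map skill names to scores in [0, 1]. A skill
    missing from `current` is treated as not yet demonstrated (score 0).
    """
    gaps = {
        skill: round(required - current.get(skill, 0.0), 2)
        for skill, required in target.items()
        if current.get(skill, 0.0) < required
    }
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented scores, e.g. as they might look after a first lesson.
    current = {"follow_a_talk": 0.4, "introduce_yourself": 0.9}
    target = {"follow_a_talk": 0.8, "introduce_yourself": 0.7, "write_formal_email": 0.5}
    for skill, gap in skill_gap(current, target):
        print(skill, gap)
```

Skills that already meet the target drop out of the list, so what remains is an ordered to-do list that a teacher or a course selector could work from.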

The model is constantly being upgraded in order to eliminate errors and serve students better. Right now it is in beta, but the final goal is fully automated operation and dynamic adaptation to the student’s needs during the learning process.

__________________________

Originally published on celadonsoft.com
Read more in our blog!
