AI is helping users resolve code problems

Meri Torosyan
Published in Sololearn
5 min read · Mar 3, 2023

Together with our partner Prosus AI, we explored AI solutions that could automatically translate compile errors into human-readable explanations.

This article was co-authored by Doğu Tan Aracı and Zulkuf Genc from Prosus AI, and Marco Facchini and Meri Torosyan from Sololearn.

Learning to code, like any other skill, requires practice. At Sololearn, we integrate coding exercises into our lessons, where users put their knowledge to the test by solving real-world problems. These exercises are designed to be challenging, and when students hit a snag in their code, it can block them from completing the lesson.

Users face two main difficulties when attempting to solve these practice problems: either the code won’t run at all, or it runs but fails to produce the expected output. In this article, we’ll focus on the first issue: the code looks correct but results in an error when run. Errors that prevent code from running are typically referred to as “compile errors.” A compiler (or interpreter, depending on the programming language) translates human-readable source code into instructions the computer can carry out. When the source code is syntactically correct, the compiler turns it into machine code that the computer executes. When it is not, compilation fails and the compiler reports an error instead. Figure 1 illustrates this compilation process:

Figure 1: code compilation process
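To make this concrete, here is a small illustrative Python snippet (not taken from a Sololearn lesson) that fails before it can produce any output:

```python
# A missing closing parenthesis stops the interpreter before any code runs.
print("Hello, world!"
```

Instead of output, the user sees something like SyntaxError: '(' was never closed (the exact wording varies by Python version), a message that is obvious to an experienced developer but cryptic to a beginner.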

While experienced programmers can decipher compile errors and find their mistakes, beginners often struggle to understand what went wrong. This can be discouraging, and many beginners may feel that programming is not for them. At Sololearn, we decided to take action to help our users overcome this initial hurdle in their coding journey. Our goal was to make compile errors more understandable. We knew that it would be impossible to manually explain each error, as there are millions of possible errors that users could encounter. Therefore, we turned to AI. Together with our partner Prosus AI, we explored AI solutions that could automatically translate compile errors into human-readable explanations.

We used GPT-3 (the text-davinci-002 model), a powerful auto-regressive language model developed by OpenAI, to generate explanations for us. This model can generate human-like text for various applications and is the technology behind popular products like ChatGPT and Bing Search. We created a prompt with a short description of the task and a few (code, error, explanation) examples of what we wanted to achieve, then sent this prompt to GPT-3 via the OpenAI API. The results were very promising for a production deployment!
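Here is a minimal sketch of this few-shot setup, using the legacy openai Python package that was current at the time. The example pairs, prompt wording, and parameters are illustrative assumptions, not our production prompt:

```python
# Sketch of a few-shot prompt for explaining compile errors via the
# OpenAI completions API (legacy openai<1.0 interface).
import openai

openai.api_key = "YOUR_API_KEY"

PROMPT_TEMPLATE = """Explain the following compile errors to a beginner in plain English.

Code:
print("Hello world"
Error:
SyntaxError: unexpected EOF while parsing
Explanation:
You opened a parenthesis but never closed it. Add a closing ")" at the end of the print call.

Code:
{code}
Error:
{error}
Explanation:"""

response = openai.Completion.create(
    model="text-davinci-002",
    prompt=PROMPT_TEMPLATE.format(
        code='prnt("Hi")',
        error="NameError: name 'prnt' is not defined",
    ),
    max_tokens=120,
    temperature=0.2,  # low temperature keeps explanations consistent
)
print(response["choices"][0]["text"].strip())
```

Keeping the temperature low favors consistent, factual explanations over creative ones, which matters when the output goes to beginners who cannot yet spot a wrong hint.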

Our experienced Curriculum Team manually evaluated the explanations generated by the AI and provided feedback that guided improvements to the generations. Once we reached an accuracy of 80%, we launched an A/B test: half of the users received the old compile errors, while the other half received the human-friendly explanations along with the original compile errors. This way, we could help users understand the errors while also teaching them how to read compile errors on their own in the future.
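The article doesn’t describe our assignment mechanics, but a common way to run such a split is deterministic hash-based bucketing, sketched here with hypothetical names:

```python
# Hypothetical sketch of deterministic A/B assignment (not Sololearn's
# actual logic). Hashing the user ID keeps each user in the same group
# across sessions.
import hashlib

def ab_group(user_id: str, experiment: str = "error-explanations") -> str:
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "test" if int(digest, 16) % 2 == 0 else "control"

print(ab_group("user-12345"))  # "test" -> explanation + original error
```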

To help further improve the model, we implemented a feedback mechanism that allowed users to give direct feedback on a given explanation. This helped us improve problematic explanations and enabled users to actively participate in improving our product. This is possible because of Sololearn’s large and active community eager to share feedback, which is a key ingredient for making such applications successful.

However, we found that GPT-3 was too expensive to use long-term, especially with 18k new requests every day. So we created a training set from the good GPT-3 generations and user feedback, and used it to train a smaller model: T5. T5 (Text-to-Text Transfer Transformer) has a Transformer-based architecture and uses a text-to-text approach for tasks like translation, question answering, and classification. We used the large code-specialized T5 model from Salesforce’s AI team (CodeT5). The advantage of using T5 is that we could deploy it ourselves and scale it according to our needs.
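A minimal sketch of this distillation step, assuming the Hugging Face transformers library and the publicly available Salesforce/codet5-large checkpoint; the input format, hyperparameters, and example pair below are illustrative assumptions rather than our pipeline:

```python
# Sketch: fine-tuning a code-specialized T5 on (code, error) -> explanation
# pairs distilled from GPT-3 generations and user feedback.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("Salesforce/codet5-large")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# One training pair from the distilled dataset (illustrative).
source = ("explain error:\n"
          "code:\nprint(Hello)\n"
          "error:\nNameError: name 'Hello' is not defined")
target = ("Hello is not wrapped in quotes, so Python treats it as an "
          "undefined variable. Use print(\"Hello\") to print the text.")

inputs = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

# Standard seq2seq fine-tuning step: the model learns to generate the
# explanation given the code and compiler error as input.
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```

At inference time the same model produces an explanation with model.generate, which is what made self-hosted, scalable deployment feasible.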

The results of our A/B test were very positive. Users who received the error explanation (test group) were statistically significantly more likely to solve the practice problems than users who only saw the compile error (control group). We also saw improvements in lesson completion and next-day retention. Furthermore, user satisfaction with the explanations was around 80%, and when asked what they thought of the feature, users gave answers like these:

“I just tested it with Python for Beginners practice tasks and I think it already works quite well: Out of about 15 test errors, only 1 hint was way off, 3 messages hinted at least at the correct line/ part but not directly concerning the actual error. The other ones really fitted!” (Lisa)

“Yeah it’s really helpful. It tells the exact error with some hint so everyone can understand the problem!” (Somya Rajawat)

Since going live, our AI system has explained 1.5 million Python errors for our users. Moving forward, we plan to roll out the feature to all Python users and gradually train the model to work with our other programming playgrounds. We also aim to translate the explanations into Spanish and Russian to increase the number of users who can benefit from explained compile errors in our Sololearn practice tasks.

In conclusion, our goal is to ensure that every time you encounter a compile error, you don’t feel discouraged. Errors in code are inevitable, and all you can do is understand them, step by step.

Fun fact: even this article was brought to you with the help of an AI editor :)
