Introducing Ask Roboflow
The AI that answers programming questions.
A month ago I released a little project called Stack Roboflow, a neural network trained to ask programming questions by mimicking human questions from Q&A site Stack Overflow. A great writeup of Stack Roboflow is available here.
Today, I’m happy to announce the next generation of that neural network, Ask Roboflow, a machine learning model trained to answer programming questions.
What is Ask Roboflow?
Ask Roboflow is an attempt to teach an AI bot to learn from Stack Overflow’s vast trove of information so it can answer programming questions all on its own.
Why would anyone want this if they already have Stack Overflow?
Stack Overflow is an invaluable resource for experts. You can find answers to some of the most obscure questions imaginable… and if your question hasn’t already been asked, there’s a good chance someone on the site can help you solve your issue.
Unfortunately, it can be a very intimidating place for new coders. There’s a zero-tolerance policy for “duplicate” questions and moderators are quick to downvote and close questions that don’t meet community standards. Many beginners haven’t yet developed the skills necessary to find the answers to their questions, which sets them up for a bad first experience (even driving some away from coding completely).
Luckily, it’s exactly these types of newbie questions that an AI should be able to answer. And an AI never gets frustrated with answering the same questions over and over again. In fact, seeing the same question in many different ways can even help the model to identify it better in the future.
If we could teach a bot to answer common questions, it would be incredibly useful to new programmers and let the experts spend less time moderating duplicates and more time answering novel questions.
How does it work?
At its core, Ask Roboflow is a language model. This means it is a neural network trained to predict the next token in a sequence. It is similar in concept to OpenAI’s GPT-2 or Google’s BERT but, behind the scenes, it is using Stephen Merity’s AWD-LSTM, which is built into fastai.
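To make "predict the next token" concrete, here is a toy sketch in Python. It is not the AWD-LSTM and has nothing to do with how fastai works internally; it just counts which word follows which in a tiny made-up corpus and guesses the most frequent follower. The function names and the sample sentence are my own illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count, for every token, which tokens follow it and how often."""
    next_counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        next_counts[current][following] += 1
    return next_counts


def predict_next(model, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    if token not in model:
        return None
    return model[token].most_common(1)[0][0]


# Hypothetical miniature corpus standing in for real Q&A text.
tokens = "how do i sort a list in python ? how do i reverse a list in python ?".split()
model = train_bigram_model(tokens)

print(predict_next(model, "in"))   # python
print(predict_next(model, "how"))  # do
```

A real language model replaces the lookup table with a neural network that conditions on far more than one previous token, but the training objective is the same idea: guess what comes next.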
The hope is that, in order to learn how to effectively mimic human responses, the neural network will have to learn generalized information about the underlying programming concepts.
Language models trained on other corpora have shown that they can beat humans on reading comprehension tests, so I wanted to see how well one could answer programming questions when trained on the world’s #1 source of programming information.
Bonus: Want a peek behind the curtain? I live-chronicled the creation of Ask Roboflow in this Twitter thread.
Does it actually work?
The neural network has learned all sorts of interesting things that help it convincingly mimic human technical writing. In addition to learning basic English grammar, it has learned many domain-specific things, including:
- The unique syntax of several different programming languages
- How to insert HTML tags to format text (like adding code snippets, bolded text, images, and more)
- That many answers contain links to documentation (and which websites go with which programming languages)
But, while the model does produce some convincing answers, one drawback of optimizing the network only to predict the next word of a sentence is that it has no way to optimize for the correctness of the answer as a whole.
For example, take the question “What color is the apple?” If the canonical answer is “The apple is red,” the following two answers would get the same “accuracy” score: “The apple is green” and “An apple is red” (each one gets 3/4 words correct). But, clearly, the model should lose more “points” for missing “red” than for missing “the”.
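The arithmetic in the apple example is easy to check. The sketch below scores an answer by the fraction of word positions that match the canonical answer. This is a simplified stand-in for per-token accuracy, not the metric code used to train Ask Roboflow, and the `token_accuracy` helper is a name I made up for illustration.

```python
def token_accuracy(reference: str, candidate: str) -> float:
    """Fraction of word positions where the candidate matches the reference.

    Case and a trailing period are ignored so only the words are compared.
    """
    ref_words = reference.lower().rstrip(".").split()
    cand_words = candidate.lower().rstrip(".").split()
    matches = sum(r == c for r, c in zip(ref_words, cand_words))
    return matches / len(ref_words)


reference = "The apple is red"
print(token_accuracy(reference, "The apple is green"))  # 0.75
print(token_accuracy(reference, "An apple is red"))     # 0.75
```

Both answers score 0.75 even though only the second one actually answers the question, which is exactly the blind spot described above.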
This drawback means that Ask Roboflow isn’t yet useful for answering real people’s programming questions. But it is certainly a fun diversion!
One of my main goals with the project was to learn about machine learning and I’ve learned a lot from creating Ask Roboflow. Now that I have something up and running, I plan on experimenting more with improving the model.
One piece of low-hanging fruit is throwing more compute at the problem. The network running on Ask Roboflow was only trained for about 48 hours on a subset of about 1/8 of the data. OpenAI has shown that scaling up language models can provide very impressive results.
Trying other model architectures is another area of exploration; I’d like to see how OpenAI’s GPT-2 and Google’s BERT compare to the AWD-LSTM on this dataset.
Another idea is to augment the results with human input. If the model gets the “correct” answer 10% of the time, letting humans downvote the incorrect answers will let the good ones rise to the top.
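As a rough sketch of what that could look like, the snippet below keeps a vote tally per generated answer and sorts by net score. Nothing like this exists in Ask Roboflow yet; the `CandidateAnswer` class, the `rank_answers` helper, and the sample answers and vote counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CandidateAnswer:
    """One model-generated answer plus the human votes it has received."""
    text: str
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        # Net score: upvotes minus downvotes.
        return self.upvotes - self.downvotes


def rank_answers(candidates):
    """Return the candidates ordered from highest to lowest net score."""
    return sorted(candidates, key=lambda answer: answer.score, reverse=True)


# Made-up answers and votes, for illustration only.
candidates = [
    CandidateAnswer("Use a for loop over the DOM.", upvotes=0, downvotes=4),
    CandidateAnswer("Call list.sort() to sort in place.", upvotes=5, downvotes=1),
    CandidateAnswer("Try reinstalling your compiler.", upvotes=1, downvotes=2),
]

for answer in rank_answers(candidates):
    print(answer.score, answer.text)
```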
I’d also like to play with the “loss” function that guides the model in the right direction. One idea I want to try is rewarding it more for getting uncommon words correct.
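Here is one way that idea might be sketched, using only the Python standard library. It is an assumption about how such a loss could work, not something I have trained with: each word gets a weight of 1 + log(total words / word count), so rare words weigh more, and the loss is a weighted average of the negative log-probability the model assigned to each correct word. The corpus, the probabilities, and the function names are all invented for the example.

```python
import math
from collections import Counter


def rarity_weights(corpus_tokens):
    """Give each word a weight of 1 + log(total / count); rarer means heavier."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {word: 1.0 + math.log(total / count) for word, count in counts.items()}


def weighted_nll(target_tokens, predicted_probs, weights):
    """Weighted average of -log p(correct word) across the answer."""
    total_weight = sum(weights[token] for token in target_tokens)
    loss = sum(
        -weights[token] * math.log(prob)
        for token, prob in zip(target_tokens, predicted_probs)
    )
    return loss / total_weight


# Hypothetical corpus: "the" and "is" are common, color words are rare.
corpus = "the apple is red the sky is blue the grass is green".split()
weights = rarity_weights(corpus)

target = ["the", "apple", "is", "red"]
# Probability the model assigned to each correct word in two scenarios.
missed_red = [0.9, 0.9, 0.9, 0.1]
missed_the = [0.1, 0.9, 0.9, 0.9]

print(weighted_nll(target, missed_red, weights))
print(weighted_nll(target, missed_the, weights))
```

With these weights, the answer that fumbles “red” ends up with a higher loss than the one that fumbles “the”, which is the behavior the apple example asks for.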
And, finally, I hope to explore some adjacent problems. I was very impressed at how well the model learned the proper syntax of multiple programming languages. I’d like to see what I can do with a model trained on open source code. One potential idea is to apply neural machine translation to computer code to translate it automatically from one programming language to another.