[Everyone’s AI] Explore AI Model #11: Is there a chatbot that goes beyond GPT-3? BlenderBot 2.0

AI Network
6 min read · Aug 23, 2021

[Editorial Note] This article was planned and published as part of a series on open-source AI models and insight sharing by Sung Chang-yeop, a Developer Relations Engineer at Common Computer. This is the 11th article in the series, and it is mainly about BlenderBot 2.0, a chatbot released by Facebook AI.

Facebook AI, which released BlenderBot 1.0 in April 2020, has now released BlenderBot 2.0, an upgraded version. Let’s take a look at what 2.0 improves and adds over 1.0, and then try it out for ourselves.

If you want to check out BlenderBot 2.0 right away, please click on the following link.

Demo : https://link.ainize.ai/3z6sgq2

Github : https://link.ainize.ai/3gpLPm5

In April 2020, Facebook AI released BlenderBot 1.0, its largest open-domain chatbot, as open source. BlenderBot 1.0 was overshadowed by the Generative Pre-trained Transformer 3 (known as “GPT-3”), unveiled a few months later in June 2020, and did not get as much attention, but its capabilities were still outstanding. BlenderBot 1.0 is the culmination of years of research into conversational AI, and it is said to be the first chatbot to combine various conversational skills, such as empathy, knowledge, and personality, in one system. In fact, it is named BlenderBot because it was trained by blending several data sets. (Click here for more information on BlenderBot 1.0.)

BlenderBot 2.0: builds long-term memory and searches the internet

I will refer to BlenderBot 2.0 as “2.0” and BlenderBot 1.0 as “1.0.”

If you asked me to name the features added in 2.0 compared with its predecessor, 1.0, I would mention the following two:

  1. Long-term memory
  2. Searches the internet

Long-term memory

When 1.0 was released, reviewers said that talking to it was like having a conversation with a real person, as it seemed to hold a much more engaging conversation than any other chatbot, including Google’s Meena. But the seemingly perfect 1.0 had a memory problem. This problem, called Goldfish Memory, appears when a chatbot talks with humans over an extended period of time: the longer the chatbot talks, the more it loses track of previous conversations.

When we talk to people, the prior knowledge we have about each other is important. For this reason, long-term memory was implemented and optimized in 2.0 to solve this problem. However, if conversation content is written to memory without any pre-processing, important information can still be missed, so 2.0 uses an encoder and memory decoder model to condense the information before storing it.

Writing conversation information to memory (source: Facebook AI blog)
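The idea of condensing each turn before writing it to memory can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real 2.0 uses a learned encoder and memory decoder, while the `summarize_turn` function, the `PERSONAL_CUES` list, and the word-overlap lookup below are hypothetical stand-ins invented for this example.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical cue phrases: a turn containing one of them is treated as a
# personal fact worth remembering. The real model learns this decision.
PERSONAL_CUES = ("i am", "i'm", "i like", "i love", "i have", "my ")


def summarize_turn(speaker: str, text: str) -> Optional[str]:
    """Toy stand-in for the learned summarizer: return a memory or None."""
    lowered = text.lower()
    if any(cue in lowered for cue in PERSONAL_CUES):
        return f"{speaker}: {text.strip()}"
    return None  # nothing worth storing from this turn


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


@dataclass
class LongTermMemory:
    entries: List[str] = field(default_factory=list)

    def write(self, speaker: str, text: str) -> None:
        """Store a condensed memory only if the turn contains one."""
        summary = summarize_turn(speaker, text)
        if summary is not None:
            self.entries.append(summary)

    def read(self, query: str, top_k: int = 1) -> List[str]:
        """Return the stored memories sharing the most words with the query."""
        query_words = tokenize(query)
        scored = [(len(query_words & tokenize(e)), e) for e in self.entries]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for _, entry in scored[:top_k]]


memory = LongTermMemory()
memory.write("user", "I love hiking in the Alps.")
memory.write("user", "What time is it?")  # no personal fact, so not stored
memory.write("user", "My dog is called Biscuit.")
print(memory.entries)
print(memory.read("Tell me about your dog"))
```

The point of the sketch is the filtering step: only turns that carry something worth remembering are written, which keeps the memory small enough to search in later sessions.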

With long-term memory, the chatbot can recall topics discussed in previous conversations, and because it uses these shared memories as it talks, you feel like you are talking to a person rather than a machine. In fact, Facebook AI reported that 2.0 showed a 17% improvement in engagingness score and a 55% improvement in the use of previous conversation sessions compared with 1.0. Engagingness here refers to the ability to continue the conversation from where the previous session ended.

Searches the internet

Models such as 1.0 only know what was in their data at training time, so they are at a disadvantage whenever new information appears afterwards. For example, if you ask 1.0 about the recently aired drama WandaVision, you won’t get an answer, because no information about WandaVision existed when 1.0 was trained.

But 2.0 is different. It is not limited to its training data and can also access related information through Internet searches. When you mention something it has never seen before, it generates an Internet search query for that content and uses the search results to keep the conversation going.

Searching the internet for WandaVision (Source: Facebook AI Blog)
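The search step described above can be sketched as follows. This is a minimal illustration, not the model’s actual method: in 2.0 the query is produced by a learned query generator, whereas here `generate_query` simply searches for words missing from a hypothetical `KNOWN_TERMS` vocabulary, and `fake_search` is a stub with canned data standing in for a real search server.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical vocabulary the toy "model" saw during training.
KNOWN_TERMS = {"have", "you", "seen", "the", "show", "a", "is", "what"}


def generate_query(message: str, known_terms: set) -> Optional[str]:
    """Toy query generator: search for the words the model has never seen."""
    words = re.findall(r"[A-Za-z0-9]+", message)
    unknown = [w for w in words if w.lower() not in known_terms]
    return " ".join(unknown) if unknown else None


def fake_search(query: str) -> List[Dict[str, str]]:
    """Stub standing in for a real search server; returns canned documents."""
    corpus = {
        "wandavision": "WandaVision is a 2021 Marvel miniseries on Disney+.",
    }
    snippet = corpus.get(query.lower())
    return [{"title": query, "content": snippet}] if snippet else []


def respond(
    message: str,
    known_terms: set,
    search: Callable[[str], List[Dict[str, str]]],
) -> str:
    """Answer from 'weights' when possible, otherwise ground in search results."""
    query = generate_query(message, known_terms)
    if query is None:
        return "(answer from trained weights only)"
    results = search(query)
    if not results:
        return f"I could not find anything about {query}."
    return f"I looked it up: {results[0]['content']}"


print(generate_query("Have you seen WandaVision?", KNOWN_TERMS))
print(respond("Have you seen WandaVision?", KNOWN_TERMS, fake_search))
print(respond("Have you seen the show?", KNOWN_TERMS, fake_search))
```

Because the search function is passed in as an argument, the stub can be swapped for a call to a real search server without changing the rest of the logic.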

The advantage of doing this is that the data 2.0 can leverage does not go out of date. I think the current trend in AI is to train large models and store what they have learned in the model’s weights. However, it is almost impossible to store information that is constantly changing and growing. 2.0 can keep up with new information because it retrieves it from the Internet on the spot, rather than storing everything in its weights.

In fact, when the ability of 1.0 and 2.0 to use knowledge was compared, the rate of “hallucinations” dropped from 9.1% to 3.0%, and the rate of telling the truth in a conversation increased by 12%. Here, “hallucination” means that the chatbot confidently states inaccurate information.

BlenderBot 2.0 Architecture

Let’s take a quick look at how 2.0 works. First, the conversation with the user is stored in long-term memory through the encoder and memory decoder. Next, the query produced by the Query Generator is sent both to the Internet and to long-term memory to find related information. Finally, the two sets of results are combined and passed to the decoder, which outputs the final response.

BlenderBot 2.0 Structure (Source: Facebook AI Blog)
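The flow in the diagram can be sketched as a single function that wires the components together. Everything here is an assumption made for illustration: each learned component (summarizer, query generator, retrievers, decoder) is replaced by a hypothetical plain-Python stand-in, and the sketch writes to memory at the end of the turn so that the current message is not retrieved as its own memory.

```python
from typing import Callable, List


def run_turn(
    message: str,
    memory: List[str],
    summarize: Callable[[str], str],
    generate_query: Callable[[str], str],
    search_memory: Callable[[str, List[str]], List[str]],
    search_internet: Callable[[str], List[str]],
    decode: Callable[[str, List[str]], str],
) -> str:
    """One conversational turn: retrieve, decode, then update memory."""
    query = generate_query(message)
    # Gather knowledge from both sources and hand it to the decoder together.
    knowledge = search_memory(query, memory) + search_internet(query)
    reply = decode(message, knowledge)
    # Store a condensed version of the turn for future sessions.
    memory.append(summarize(message))
    return reply


# Hypothetical stand-ins for the learned components.
def last_word_query(message: str) -> str:
    return message.rstrip("?!. ").split()[-1]


def keyword_memory_search(query: str, memory: List[str]) -> List[str]:
    return [m for m in memory if query.lower() in m.lower()]


def canned_internet_search(query: str) -> List[str]:
    pages = {"WandaVision": "WandaVision premiered on Disney+ in January 2021."}
    return [pages[query]] if query in pages else []


def template_decoder(message: str, knowledge: List[str]) -> str:
    return " | ".join(knowledge) if knowledge else "Tell me more."


memory = ["user watched WandaVision last week"]
reply = run_turn(
    "What do you think of WandaVision?",
    memory,
    summarize=lambda m: f"user asked: {m}",
    generate_query=last_word_query,
    search_memory=keyword_memory_search,
    search_internet=canned_internet_search,
    decode=template_decoder,
)
print(reply)
print(memory)
```

The reply in this toy run is built from one memory hit and one search hit, which mirrors the way the decoder in the diagram receives information from both sources at once.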

Usage

(The server for Internet information search is JulesGM’s ParlAI_SearchEngine, and the search platform is Google.)

  • Using the demo

Let’s try BlenderBot 2.0 using the demo provided by Ainize.

After typing your message, click the Submit button to talk to BlenderBot 2.0 (it may take some time if it needs to search the Internet for information). The demo is available at the Demo link near the top of this article.

Facebook AI said, “We think that these improvements in chatbots can advance the state of the art in applications such as virtual assistants and digital friends.” In the past, chatbots were difficult to commercialize because they were not personalized, but 2.0 adds long-term memory, giving chatbots the kind of memory humans have and making personalization possible. As time goes on, I also think that chatbots can become AI virtual assistants or digital friends, as Facebook AI has said.


AI Network is a blockchain-based platform that aims to innovate the AI development environment. It represents a global back-end infrastructure with millions of open source projects deployed live.
