Making Dream Happen
Authors: Dilyara Baymurzina, Denis Kuznetsov, Fedor Ignatov, Maxim Talimanchuk, Mikhail Burtsev, Daniel Kornev
“So many of our dreams at first seem impossible, then they seem improbable, and then… they soon become inevitable.”
– Christopher Reeve
Table of Contents
- Dream: A Powerful Word
- So, What Is Dream All About?
- Positioning: Dream vs. RASA vs. Mycroft vs. Amazon Alexa
- But What Makes Dream Special?
- Talk, Run, Build!
- Conclusion and What’s Next
Dream: A Powerful Word
Hello and welcome! Last year, on Sep 3, 2020, we wrote these very lines in our blog:
At DeepPavlov, we have a Dream. Our Dream is to build AI assistants that improve the lives of every human. Whatever we have today in the form of AI assistants is merely the dawn of what is yet to come.
Imagine AI assistants being capable of understanding us, and talking to us. Imagine them learning from us and teaching us. Imagine them being our trusted assistants. Imagine them doing everything we want. Imagine them empowering us to develop personally.
But to build these assistants, you need a platform. Back in September 2020, we only showed a glimpse of what was yet to come, and published a demo website that enabled you to talk to our DeepPavlov Dream Socialbot originally made to participate in Amazon Alexa Prize Socialbot Grand Challenge 3.
It was a special moment for all of us. For the very first time, we’ve shared our vision for, and a demo of an open-source multi-skill AI assistant platform, with the world!
However, less than a week after our original announcement, Amazon announced the Amazon Alexa Prize Socialbot Grand Challenge 4, and our entire team plunged headlong into this exciting competition!
And yet, on December 8, 2020, after giving a few talks about DeepPavlov Dream at NVIDIA GTC Fall 2020, ODSC West 2020, and elsewhere, we made a very small variation of Dream publicly available: Deepy 3000. Unlike the original Dream, it had only two skills and a few annotators.
Deepy, of course, was quite primitive compared to the Dream distribution. Still, it was our very first multi-skill AI assistant available both as a web demo and as source code. Now everyone could build their own multi-skill AI assistants.
However, Deepy lacked most of Dream’s annotators for extracting invaluable features from user utterances (entities, emotions, etc.), it lacked a proper framework for building skills, and it lacked all of the advanced technology we had built for managing multi-skill AI assistants.
Earlier this Summer, our participation in Alexa Prize SGC4 came to an end, and we returned to our original plan of open-sourcing the DREAM Socialbot as a solid foundation of our DeepPavlov Dream. Still, we needed to do a lot of work to make our Alexa Prize socialbot ready for everyday use beyond Amazon Alexa and the AWS infrastructure. Fortunately, thanks to our new partner, SAMSUNG Russia, we got an awesome opportunity to tap into the local open-source community. Through our participation in the SAMSUNG COMMoN School (see announcement), we organized a track fully devoted to refactoring the DREAM Socialbot into DeepPavlov Dream — an open-source Multi-Skill AI Assistant Platform.
At the same time, we expanded our work towards supporting the developer story around DeepPavlov Dream by investing into Dialog Flow Framework, in particular DF Engine, our finite-state machine engine for scenario-driven skills, as well as into the Dev Tools direction.
And so, today, on December 30, 2021, more than a year after the original announcement, it is a great pleasure for all of us at DeepPavlov to announce that the newest version of the DeepPavlov Dream open-source Multi-Skill AI Assistant Platform is finally available to developers!
So, What Is Dream All About?
There are quite a few resources explaining what Dream is: original announcement (Sep 3, 2020), two Alexa Prize technical reports (2020, 2021), Deepy 3000 announcement, two recent blog posts (Modular Dream Socialbot, Difficulties of asynchronous pipeline, or why we use dp-agent).
However, the shortest way to explain Dream is this:
DeepPavlov Dream is an open-source Multi-skill AI Assistant Platform.
It is built on top of the DeepPavlov Conversational AI Stack, and it is provided with a set of dev tools (one, DF Designer, is already available as an alpha version in the VS Code Marketplace, others will follow later next year):
DeepPavlov Dream uses ML & DL models packaged as “Annotators” to extract various features like entities, emotions, sentiment from user utterances. Some of these models come from our DeepPavlov Library while others were obtained from the research community or built from scratch. Next year, we’ll migrate these annotators into their own DockerHub images as well as build tools for easy construction of custom Dream distributions from scratch.
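To make the annotator idea concrete, here is a toy sketch of the pattern (utterance in, feature dictionary out). It is purely illustrative and not Dream’s actual annotator interface; real annotators are standalone ML & DL services:

```python
def toy_annotator(utterance: str) -> dict:
    """Toy 'annotator': extracts shallow features from one utterance.

    Real Dream annotators are separate ML/DL services; this only
    illustrates the input/output shape (utterance in, feature dict out).
    """
    tokens = utterance.split()
    return {
        "tokens": tokens,
        "is_question": utterance.rstrip().endswith("?"),
        # Naive 'entity' heuristic: capitalized words not opening the sentence.
        "entities": [t.strip(".,!?") for i, t in enumerate(tokens)
                     if i > 0 and t[0].isupper()],
    }

features = toy_annotator("Have you seen Paris in the spring?")
```

In Dream, the Agent fans each utterance out to many such annotators in parallel and attaches their outputs to the dialog state for skills to consume.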
DeepPavlov Dream uses two engines from our Conversational AI Stack:
- Agent to orchestrate skills; Agent is the heart of Dream as it is a foundation of its multi-skill architecture;
- Dialog Flow Framework to run individual skills; it was born during Amazon Alexa Prize Socialbot Grand Challenge 4 and it makes it super easy to build your own scenario-driven skills for Dream.
DeepPavlov Dream has a robust architecture that scales from a single user to myriads via Docker/Kubernetes, as each component in the pipeline is stateless and can easily be scaled to your needs:
You can learn more about Dream from Dilyara Baymurzina’s seminal talk at AI Journey 2021: https://youtu.be/Ux7UJ1gVKTA?t=206
With DeepPavlov Dream, you can build your very own AI Assistant, like Amazon™ Alexa™, Google Assistant™, or Apple’s Siri™.
Positioning: Dream vs. RASA vs. Mycroft vs. Amazon Alexa
Ok, so you might be puzzled. If you can build your own multi-skill AI assistant with DeepPavlov Dream, how does that differ from existing approaches?
To start, Dream has quite a bit in common with a very popular open-source conversational AI framework, RASA. Like RASA, Dream has lots of ML & DL models (we call them Annotators) that can extract myriads of features from the user’s utterance, just like the models in RASA’s pipeline. Like RASA, Dream has its own engine for building scenario-driven skills (Dialog Flow Framework, more on that below). Unlike RASA, though, Dream provides a native multi-skill architecture, powered by DeepPavlov Agent, that enables a single brand identity built around multiple individual chatbots.
With its multi-skill architecture, Dream is similar to Mycroft AI. Like Mycroft AI, Dream provides a mechanism for building multi-skill AI assistants. Unlike Mycroft AI, which focuses on enabling multi-skill AI assistants per se, Dream is built as an ever-evolving open architecture for creating next-gen AI assistants. We 😍 Conversational AI! More on that below.
So, with Dream you can build your very own Amazon Alexa-like multi-skill AI Assistant for yourself or for your organization. And you can deploy it on AWS, in your own data center, or, if it can handle the load, right on your own PC! You control what is shared with which components, and you pick any components you want within your own distribution. In short, you’re in control.
In this way, having your Dream is a lot like having your very own AI assistant (though it’d take some time to implement some base scenarios, like reminders; you can build them on your own, pick those made by the community, or you can ask us to build them).
So, to summarize, if you want to build your own Conversational AI Assistant for your ecosystem like Google or Amazon did, Dream is your choice!
But What Makes Dream Special?
Now that you’ve learned how Dream compares to RASA, Amazon Alexa, and Mycroft AI, you might wonder: what makes our Dream special?
Did I mention that DeepPavlov 💖Conversational AI?
This love is more than just our passion for building socialbots, chatbots and the like. After all, there are myriads of ways to make them. For us, the key goal is to figure out how these AI assistants should actually be done.
9 years ago Mikhail Burtsev, founder of DeepPavlov, gave an interview (in Russian) to PostNauka (PostScience) about his vision for AI assistants.
This interview, alongside Mikhail’s efforts towards making his dream of open-source AI assistants a reality, was a precursor of the inevitable. 5 years later, an ambitious project called iPavlov was born. Its first product was the DeepPavlov Library, a TF 1.x Python-based library that made it super easy to build pipelines of multiple ML & DL models for solving different NLP tasks like NER, Entity Linking, etc. Today, the DeepPavlov Library has more than 300K downloads, more than 1M DockerHub image downloads, more than 5K stars on GitHub, and is under active development.
Yet Mikhail wanted more. In 2019, the iPavlov project was completed; the commercial side turned into a spin-off, while the lab got a new brand, DeepPavlov.ai, and under Mikhail’s leadership we continued our foray towards the future of Conversational AI.
With Mikhail’s relentless focus on building more complex dialog systems, DeepPavlov.ai started working on an open-source Multi-Skill Conversational AI Orchestrator. Thanks to Alexa Prize SGC 3, this Orchestrator, now called DeepPavlov Agent (dp-agent for short), became the heart of the original DREAM Socialbot. It now powers DeepPavlov Dream.
Mikhail and I gave a talk about the importance of the multi-skill AI assistant architecture at NVIDIA GTC Fall 2020:
But this wasn’t the end of the story. Au contraire, we were just getting started!
Last September, during our planning of the DREAM 2 Socialbot for Amazon Alexa Prize SGC 4, Mikhail gave us a new direction. He believed that we needed to integrate the concept of Goals into the AI Assistant. Abstract goals, like “to be heard”, “to get better through conversation”, etc. These ideas strongly resonated with me.
After all, 7 years ago I gave my first ever TEDx talk; while it started with the problem of Functional Illiteracy, it was also where I described the future of AI assistants (in Russian).
Mikhail’s ideas on what we should do with the AI Assistants, combined with the passion to move towards AI assistants that can become our friends, advisors, mentors, teachers, inspired me and my colleagues to dig deeper into this direction.
Together with Dilyara Baymurzina, a team leader of our DREAM 2 Socialbot for Alexa Prize SGC 4, we gave a first talk about our vision for making a goal-aware multi-skill AI Assistant at NVIDIA GTC Spring 2021:
Later this year, we presented our vision for 3-Level Dialog Planning at the Conversations AI 2021 Conference (video to be published on YouTube next year). Just earlier this week, Dilyara completed her PhD, in which she laid the foundation for Goal-Aware/Goal-Oriented Dialog Management. Yet it was clear that even more building blocks were needed.
Today, after all we’ve learned from our participation in Amazon Alexa Prize Socialbot Grand Challenges 3 & 4, we all at DeepPavlov want even more:
We want to understand how to make AI assistant persona believable. We want to understand how to profile our users and how to make AI assistants adapt to them. We want our AI assistants to be empathetic. We want them to learn about user goals and have their own. We want them to strategically plan their dialogs with us. We want them to care about us.
How to achieve all of that? There are no ready answers. It’s a process of discovery.
Earlier this year, as we were celebrating 3 years of the DeepPavlov Library, I gave a talk about the Conversational AI Iceberg:
This is our understanding of what we should focus on at DeepPavlov. Core technologies.
And we’re not alone. Earlier this year, Gartner published their regular update on the Competitive Landscape: Conversational AI Platform Providers (CAIP). You can grab a complimentary copy of the report from our colleagues at Kore.ai here. In it, they specifically outlined the evolutionary path of CAIPs from simple chatbots to multi-skill AI assistants:
Within the framework of the Federal AI Research Centre that was founded earlier this year at MIPT in collaboration with Sber, we at DeepPavlov.ai charted our path towards the future of Conversational AI. To enable our ambitious scenarios for the AI assistants, we’ve started and currently drive research across multiple directions, both applied and theoretical. Here’s a list of some of them:
- User & Bot Persona Modeling
- Affective AI (Empathy)
- Strategic Dialog Planning
- Controllable Response Generation
- Automated and Semi-Automated Dialog Graph Generation
- Knowledge Graph Extraction from Language Models
and many more directions!
In the coming years, we will continue working on these directions to drive the revolution in the Conversational AI field. The good thing for you is that we will use DeepPavlov Dream as our ever evolving open architecture for designing the next wave of AI assistants.
This also means that we will make these technologies available to you as part of our distributions, and you’ll be able to use them in your products!
Talk, Run, Build!
Talk to Dream!
To try out Dream, you can do one of three things:
- Talk to our installation of the Dream AI Assistant through our website,
- Clone the repository, run the entire thing on your PC or in your datacenter, and start a conversation in the command line,
- Clone the repository, use our proxy to offload heavy computations to the DeepPavlov datacenter, run the rest of Dream on your PC, and start a conversation in the command line.
If you want to try out the entire Dream on your PC, beware: it’s quite resource-hungry. A single replica consumes:
- 4 NVIDIA GTX 1080 Ti (11GB) GPUs to run ML & DL models (Annotators),
- ~40GB RAM (mostly used by running the light version of the up-to-date Wikidata in memory),
- ~100GB storage (mostly used to store an up-to-date light replica of Wikidata)
To run Dream on your machine on CPU, use this command:
docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml up --build
If you want it to run on GPU, use this command instead:
AGENT_PORT=4242 docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/test.yml up
Finally, to run Dream via our proxy, use this command:
docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/proxy.yml up --build
Beware, though, that by running Dream via our proxy, you’ll be using the limited resources of the DeepPavlov Dream Cloud APIs. These APIs are practically the same Docker containers as the ones available in the repository, but they run in the DeepPavlov datacenter. When you use our proxies, Dream uses special lightweight containers that redirect local calls to the heavy containers running in our cloud. However, your Agent and its accompanying MongoDB run locally on your machine.
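Conceptually, each lightweight proxy container is just a thin HTTP forwarder that exposes the same local endpoint the heavy container would. A minimal sketch with a made-up cloud URL (this is not the actual proxy implementation):

```python
import urllib.request

# Made-up base URL for illustration; the real cloud endpoints differ.
CLOUD_BASE = "https://dream-cloud.example.org"

def target_url(path: str) -> str:
    """Map a local service path to the same path on the cloud instance."""
    return CLOUD_BASE + path

def forward(path: str, body: bytes) -> bytes:
    """Relay a local call to the cloud and return the raw response body."""
    req = urllib.request.Request(target_url(path), data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Because the forwarder keeps the local endpoint unchanged, the Agent’s pipeline configuration doesn’t care whether a component runs locally or in the cloud.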
In the coming months we’ll add quotas to make usage of the DeepPavlov Dream Cloud APIs more accessible to as many developers as possible.
Once you’ve started Dream, chat with it by running the following in a separate terminal tab:
docker-compose exec agent python -m deeppavlov_agent.run
Enter your username and enjoy a conversation!
Try Other Distributions
Our goal for DeepPavlov Dream as a platform is to facilitate the creation of new multi-skill AI assistants. We designed the entire platform around this powerful idea: AI assistants are, simply speaking, distributions of shared components like annotators and skills.
There are multiple distributions already available. Most of them were created in the preceding year, while the last one, DREAM Socialbot, was created using an adapted version of our DREAM 2 Socialbot originally developed during the Amazon Alexa Prize Socialbot Grand Challenge 4.
Here’s an example of how you can run a very basic distribution of Deepy:
docker-compose -f docker-compose.yml -f assistant_dists/deepy_base/docker-compose.override.yml up --build
Learn more about running other distributions from our Readme!
Add ASR & TTS
If you want to talk to your Dream AI Assistant on your machine, use the ASR & TTS services (provided by NeMo from our partner, NVIDIA).
To do that, add the custom docker-compose file asr_tts.yml, located in the /assistant_dists subdirectory, to your docker-compose command like this:
docker-compose -f docker-compose.yml -f assistant_dists/[DIST_NAME]/docker-compose.override.yml -f assistant_dists/asr_tts.yml up --build
After that, you’ll be able to interact with your distribution through the ASR service by providing speech input via its http://_service_name_:4343/asr?user_id= endpoint. Attach the recorded voice as a 16 kHz .wav file.
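For example, a recording could be sent to that endpoint with Python’s standard library. How the audio is attached here (raw bytes with an audio/wav content type) is an assumption; check the service’s documentation for the exact payload format:

```python
import urllib.request

def build_asr_url(service_name: str, user_id: str) -> str:
    # Follows the endpoint pattern described above.
    return f"http://{service_name}:4343/asr?user_id={user_id}"

def send_wav(url: str, wav_bytes: bytes) -> bytes:
    """POST a 16 kHz .wav recording to the ASR service (payload format assumed)."""
    req = urllib.request.Request(url, data=wav_bytes,
                                 headers={"Content-Type": "audio/wav"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

url = build_asr_url("asr", "test_user")
```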
You can use either the NeMo or the Clone TTS service by sending batches of text phrases to its endpoint.
Build Your Own Custom Distribution With Your Custom Skills
Now that you’ve tried our Dream, make the next step: build your own skills!
The entire workflow is relatively simple:
- Create a new branch for your own Distribution
- Copy the folder of an existing Distribution (e.g., assistant_dists/dream/docker-compose.override.yml for Dream)
- Create a folder for your own skill
- Develop your own skill by following the instructions above
- Register it within the system (pipeline_conf.json in your Distribution folder)
- Run it!
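As a rough illustration of what a custom skill service might look like, here is a minimal HTTP skill using only the standard library. The request/response schema used here (a JSON body with "dialogs", a reply of [text, confidence] pairs) is a simplification of the idea that skills return candidate responses with confidences; consult the repository’s skill templates for the actual contract:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def respond(last_utterance: str):
    """Return a (reply, confidence) pair for a response selector to rank."""
    if "hello" in last_utterance.lower():
        return "Hi there! I'm a custom Dream skill.", 0.9
    return "Tell me more!", 0.3

class SkillHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        # One candidate response per dialog in the batch.
        replies = [respond(d["utterances"][-1]["text"]) for d in body["dialogs"]]
        payload = json.dumps(replies).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To serve: HTTPServer(("0.0.0.0", 8030), SkillHandler).serve_forever()
```

The confidence score is what lets the Agent’s Response Selector arbitrate between many skills answering the same utterance.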
Build Your Own Skills Using Dialog Flow Framework
Ok, custom skills are quite fun. But if you want to make something production-worthy, it might make sense to use a more solid foundation.
Fortunately, Dream ships with our latest and greatest Dialog Flow Framework (DF Engine, DF Addons, etc.). DF Engine is a finite-state-machine dialog engine that enables development of scenario-driven skills in DeepPavlov Dream (and also in standalone fashion, more on that later). Initially inspired by E-STDM, a finite-state-machine engine built by Emora, the Alexa Prize 3 winning team, the Dialog Flow Framework was completely rewritten earlier this Summer.
Dialog Flow Framework provides you with a native Python-based DSL that you can use to define the logic of your skill:
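To give a flavor of the scenario-driven style, here is a plain-Python finite-state-machine sketch. This deliberately avoids the real DF Engine DSL (take its exact syntax from the DFF documentation) and only mimics its shape: named nodes, each with a response and condition-based transitions:

```python
# Plain-Python sketch of a scenario script: nodes with a response and
# ordered (condition, target) transitions. NOT the DF Engine DSL.
SCRIPT = {
    "start": {
        "response": "Hi! Do you like movies?",
        "transitions": [
            (lambda u: "yes" in u.lower(), "movies"),
            (lambda u: True, "fallback"),
        ],
    },
    "movies": {
        "response": "Great! What did you watch recently?",
        "transitions": [(lambda u: True, "fallback")],
    },
    "fallback": {
        "response": "Got it. What would you like to talk about?",
        "transitions": [(lambda u: True, "fallback")],
    },
}

def step(state, utterance):
    """Follow the first matching transition; return (new_state, reply)."""
    for condition, target in SCRIPT[state]["transitions"]:
        if condition(utterance):
            return target, SCRIPT[target]["response"]
    return state, SCRIPT[state]["response"]

state, reply = step("start", "Yes, I love them!")
```

The real DSL adds much more on top of this skeleton (pre/post-processing, global transitions, response callbacks), but the core mental model of nodes plus guarded transitions is the same.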
You can learn more about the Dialog Flow Framework and its accompanying tech in this video, in which Denis Kuznetsov, creator of the Dialog Flow Framework, presented it at AI Journey 2021 earlier this year.
There are plenty of DFF-based skills in DeepPavlov Dream repository already. Check for skills marked with [New DFF version] in the Skills section of the readme.
To build your own skill, you’ll need to use the feat/dff/template_v3 branch of the Dream repository. In it, we’ve prepared instructions for you, covering both editing your custom DFF skill and registering it within your Dream distribution.
To run it, you can follow these instructions.
Design Your Own Skills Using DD-IDDE aka DF Designer!
In addition to manually coding DFF skills, you can also use the alpha version of one of the Dev Tools that accompany DeepPavlov Dream: DD-IDDE, aka DF Designer. DF Designer makes it much easier to visually construct your DFF skills from scratch, within the comfort of trusty VS Code.
DD-IDDE, the original name of DF Designer, stands for Discourse-Driven Integrated Dialogue Development Environment. Initially developed as part of our mid-term applied research on Strategic Dialog Planning, DD-IDDE was proudly presented at the CODI 2021 workshop at EMNLP 2021, a top AI conference, earlier this year.
One of the coolest experimental features of DD-IDDE (hence its debut at EMNLP 2021) is its unique recommendation system, which helps developers predict the next user steps in the current dialog state. You can learn more about this feature in the paper published in the ACL Anthology here.
DD-IDDE aka DF Designer is a VS Code Extension based on Draw.io (though in later versions we’ll migrate it to a custom React-based flow editing library).
You can download it from the VS Code Marketplace here.
To use DD-IDDE aka DF Designer to build your own DFF Skills, open the scenario/main.py file of your DFF Skill with it.
Here’s a short video introduction of DD-IDDE aka DF Designer made at AI Journey 2021.
To learn more about Dialog Flow Framework, follow these links:
Conclusion & What’s Next
Whoa, you’ve made it through! Congratulations!
This is just the very beginning of our journey of making DeepPavlov Dream, our open-source Multi-Skill AI Assistant Platform.
There are lots of things to come!
Tutorials & Online Courses:
- Better tutorials for designing Skills and Annotators in DeepPavlov Dream
- Tutorials on developing your own Skills and Response Selectors
- Advanced Introduction into Dialog Management in Multi-Skill AI Assistants
- D3PO — a small replica of the legendary C-3PO robot built on top of the DeepPavlov Dream platform; it was made by our brave students from MIPT earlier this Spring
- Updated Deepy with DFF Skills in it
- Our Wiki-based documentation will slowly but steadily migrate to the professional-looking Read-The-Docs website
And many more things!
In the coming years, we’ll relentlessly experiment in the chosen and new directions, and bring the best of our innovations into the DeepPavlov Dream Open-Source Multi-Skill AI Assistant Platform.
We are deeply thankful to NTI (National Technology Initiative); MIPT, our university; our colleagues from the original iPavlov project (2017–2019); Amazon, for an amazing opportunity to participate in Alexa Prize Socialbot Grand Challenges 3 & 4; NVIDIA, for our special partnership (and the opportunity to talk at GTC!); Facebook Research, for a long-term research partnership; SAMSUNG Russia (and especially Svetlana Yun!), for our partnership within the COMMoN School and SOSCON Russia 2021 earlier this year; and our entire DeepPavlov.ai team that made DeepPavlov Dream a reality.
Special thanks to Yury Kuratov, Idris Yusupov, Dilyara Baymurzina, Denis Kuznetsov, Pavel Pugin, and Fedor Ignatov for their technical and research leadership that helped to envision and develop DeepPavlov Dream.
We are all very humbled to bring our child, Dream, into this world. We are eager to see what you’ll make of it, and we can’t wait to continue improving it towards the next generation of AI Assistants.
Start your discovery towards the future of Conversational AI right here:
https://github.com/deepmipt/dream