Authors: Daniel Kornev, Mikhail Burtsev, Fedor Ignatov
Update (November 13, 2020): added two more features to our list of updates for DeepPavlov Library plans for Q4 2020.
We’re fast approaching the end of 2020, and we’re excited to share our perspective on this year as well as what lies ahead for DeepPavlov for the rest of this year and beyond!
At DeepPavlov, our focus is on building a Conversational AI technology stack that works toward the Holy Grail of NLP: computers achieving human-like comprehension of texts and languages. When this is achieved, computer systems will be able to understand, draw inferences from, summarize, translate and generate accurate and natural human text and language.
Achieving this is a truly hard task. While recent advancements with GPT-3 showed some exciting developments, our experience from participating in the Alexa Prize Socialbot Grand Challenge 3, as well as the experiences of our fellow university teams, taught us that managing a meaningful dialog with the user can’t be solved with generative models alone. With all the progress across the Conversational AI space, it might seem that the first 99% of our problem is solved, and that we now only need better dev tools, processes, and other things. In practice, while having dev tools and processes is essential, investments in the core technology remain unavoidable.
At DeepPavlov, our investments are spread across 3 parts of the technology stack, including Library, Dream, and Agent. All of these projects are focused on enhancing and strengthening our Conversational AI stack, with each contributing corresponding components:
DeepPavlov Library is a foundation for our framework. It contains basic NLP components like NER, Entity Linking, KBQA, Go-Bot, and others. In our Conversational AI stack, these components:
- serve as standalone services called by skills (e.g., KBQA, Wikidata Parser),
- provide a framework to build goal-oriented skills (Go-Bot),
- are used as generic annotators (e.g., NER, Entity Linking, Emotion Analysis, etc.)
DeepPavlov Dream is a set of our default goal-oriented and chit-chat skills, as well as a number of demo AI Assistants built using components from Library and managed by DeepPavlov Agent. We announced Dream on September 3, 2020.
At the same time, DREAM is also the name of our Alexa Prize 2019 Socialbot. You can learn more about the original DREAM’s architecture at our DeepPavlov Dream site.
You can play with DeepPavlov Dream’s AI Assistant Beta here. You can play with the first bare-bones demo here. You can see how a simpler multiskill AI Assistant was built using DeepPavlov Dream in its showcase at our first Community Call earlier this month:
DeepPavlov Agent is our multiskill Conversational AI orchestrator that coordinates the entire Conversational AI pipeline of the AI Assistants. It incorporates annotators, skills, Skill & Response Selectors to provide a coherent experience to its users.
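The flow described above can be pictured with a small sketch. This is not DeepPavlov Agent’s actual API: every name here (`run_turn`, `select_skills`, the toy annotator and skills, and the "highest confidence wins" response selector) is hypothetical and only illustrates the order in which the stages run.

```python
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; this is not the DeepPavlov Agent API.
Annotator = Callable[[str], object]
Skill = Callable[[str, Dict[str, object]], Tuple[str, float]]  # -> (reply, confidence)
SkillSelector = Callable[[str, Dict[str, object]], List[str]]


def run_turn(
    utterance: str,
    annotators: Dict[str, Annotator],
    skills: Dict[str, Skill],
    skill_selector: SkillSelector,
) -> Tuple[str, str]:
    """Run one dialog turn and return (skill name, chosen reply)."""
    # 1. Annotators enrich the raw utterance.
    annotations = {name: annotate(utterance) for name, annotate in annotators.items()}
    # 2. The skill selector decides which skills are worth calling.
    selected = skill_selector(utterance, annotations)
    # 3. Each selected skill proposes a candidate reply with a confidence.
    candidates = {name: skills[name](utterance, annotations) for name in selected}
    # 4. The response selector (here: highest confidence wins) picks one reply.
    best = max(candidates, key=lambda name: candidates[name][1])
    return best, candidates[best][0]


annotators = {"is_question": lambda text: text.strip().endswith("?")}
skills = {
    "chitchat": lambda text, ann: ("Tell me more!", 0.5),
    "factoid_qa": lambda text, ann: ("Let me look that up.", 0.9),
}


def select_skills(text: str, annotations: Dict[str, object]) -> List[str]:
    # Only call the QA skill when the utterance looks like a question.
    if annotations["is_question"]:
        return ["chitchat", "factoid_qa"]
    return ["chitchat"]


print(run_turn("Who wrote Dune?", annotators, skills, select_skills))
print(run_turn("I like books.", annotators, skills, select_skills))
```

In a real deployment each annotator and skill would typically be a separate service rather than an in-process function; the sketch keeps them local so the control flow stays visible.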
In the first three quarters of 2020, we built and open-sourced our components that work with Knowledge Graphs, including KBQA and Entity Linking. PyTorch support was added for the first time to several DeepPavlov models in the Library. Go-Bot, our framework for building goal-oriented skills, got a boost with initial, though partial, support for RASA config files. Support for NVIDIA NeMo ASR/TTS components was added to DeepPavlov Library, too. Other additions include, but are not limited to, a syntax parser model, a pre-trained NER-based model for the Sentence Boundary Detection task, and a new hybrid NER model. We’ve also continued shipping fixes and updates to existing components.
In 2020 Q1-Q2, our focus was to stay in the Alexa Prize 2019 competition with our DREAM socialbot for as long as possible. We successfully reached the Semi-Finals of the Challenge, but didn’t make it into the Finals.
After that, we focused on adapting our DREAM Socialbot to run publicly; we replaced a number of Amazon-specific services with our own open implementations, including but not limited to Dialog Act classifiers, Q&A, and so on. We shipped a public demo of the DREAM socialbot alongside the first public demo bits in a dp-dream-demos repo on September 3rd, 2020. You can learn more about this announcement here.
In 2020 Q1-Q3, our focus was to bring the necessary changes into the Agent to support DeepPavlov Dream’s public A.I. Assistant demo. These include support for saving ratings for dialogs, small improvements under the hood, and so on.
In Q4, we will continue work on all three components of our Conversational AI stack. For the first time, we will go through a refactoring process of the DeepPavlov Library; as part of it, we will outline the top ML models that we plan to continue supporting, and mark the rest as unsupported by us. They will still be a part of our Library and can be supported by the community if there is interest.
In addition to Library, Dream, and Agent, we will also put an emphasis on Dev Tools. During Alexa Prize 2019, we’ve built a number of supporting dev tools, and our goal is to re-work and share them with you to aid you in developing AI Assistants using DeepPavlov products.
We will also put some effort into updating our documentation to reflect the addition of new components, as well as changes made to the existing ones.
1.1. New Components
Intent Catcher — originally developed to aid in quick intent detection in DREAM Socialbot built for Alexa Prize 2019, this component will be shipped as part of DeepPavlov Library.
[UPDATE] Hugging Face Datasets Support — the Hugging Face Datasets library includes over 160 free and open NLP datasets. We plan to add support for training DeepPavlov models with datasets in the Hugging Face format.
[UPDATE] Support for metrics tracking with Prometheus Middleware — Prometheus is an open-source toolkit for monitoring and alerting based on an embedded time-series database, a query DSL, and various mechanisms for scraping metrics data off API endpoints. As we use a growing number of DeepPavlov Library components in the DeepPavlov Dream AI Assistant Demo, it is imperative for us to support industry-standard methods for setting up and collecting metrics of our services. With this update you will be able to augment DeepPavlov Library REST APIs with Prometheus-based metrics to see:
- total number of requests;
- latency percentiles in milliseconds;
- memory utilization.
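To make the three metrics above concrete, here is a minimal standard-library sketch of a collector that tracks them and renders them in the Prometheus text exposition format. It is not the planned DeepPavlov middleware, and in practice one would more likely rely on an existing Prometheus client library; the class `RequestMetrics` and the `dp_*` metric names are assumptions made for illustration only.

```python
import math
import tracemalloc
from typing import List


class RequestMetrics:
    """Hypothetical in-process collector for the three metrics listed above."""

    def __init__(self) -> None:
        self.total_requests = 0
        self.latencies_ms: List[float] = []

    def observe(self, latency_ms: float) -> None:
        """Record one handled request and its latency in milliseconds."""
        self.total_requests += 1
        self.latencies_ms.append(latency_ms)

    def percentile(self, p: float) -> float:
        """Nearest-rank percentile of the recorded latencies (0 < p <= 100)."""
        if not self.latencies_ms:
            return 0.0
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    def render(self) -> str:
        """Render the metrics as Prometheus text exposition lines."""
        # tracemalloc only sees Python-level allocations made after start();
        # a real service would more likely report the process memory instead.
        current_bytes, _peak = tracemalloc.get_traced_memory()
        lines = [
            "# TYPE dp_requests_total counter",
            f"dp_requests_total {self.total_requests}",
            "# TYPE dp_request_latency_ms gauge",
        ]
        for p in (50, 95, 99):
            lines.append(
                f'dp_request_latency_ms{{percentile="{p}"}} {self.percentile(p)}'
            )
        lines += [
            "# TYPE dp_python_memory_bytes gauge",
            f"dp_python_memory_bytes {current_bytes}",
        ]
        return "\n".join(lines) + "\n"


tracemalloc.start()
metrics = RequestMetrics()
for latency in (12.0, 15.0, 20.0, 30.0, 250.0):
    metrics.observe(latency)
print(metrics.render())
```

A REST service would call `observe` once per handled request and serve the output of `render` on a metrics endpoint that a Prometheus server scrapes periodically.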
KBQA, Entity Linking, Wikidata Parser — these components were originally developed as a monolithic component, but will be decoupled from each other, to make them available as semi-independent components. This is extremely handy when you want Wikidata Parser or Entity Linking to be available as standalone parts of the larger AI Assistant deployment. This update will also facilitate development of the Knowledge Graph-driven skills for DeepPavlov Dream.
- Wikidata Parser — it enables working with either online Wikidata APIs, or downloading and working with an offline Wikidata Knowledge Graph. In a nutshell, it provides a SPARQL endpoint, as well as a mechanism for downloading Wikidata as an HDT archive, and then working with it either from the disk drive or by loading it into RAM.
- Entity Linking — while it was already available as a standalone model as of our 0.12 release, its new version will be available for a standalone KBQA component, and can run inside the same larger AI Assistant deployment.
- KBQA — this component will now work with both Wikidata Parser and Entity Linking as part of the same larger AI Assistant deployment.
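For readers unfamiliar with SPARQL, the sketch below shows what a query against Wikidata looks like and how it can be encoded into a request URL. It targets the public Wikidata Query Service as a stand-in; the interface of the DeepPavlov Wikidata Parser itself is not described in this post, and the helper names `capital_query` and `request_url` are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public Wikidata Query Service endpoint, used here as a stand-in for
# whichever SPARQL endpoint a deployment actually exposes.
WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"


def capital_query(country_qid: str) -> str:
    """Build a SPARQL query asking for the capital (property P36) of an item."""
    return (
        "SELECT ?capital ?capitalLabel WHERE {\n"
        f"  wd:{country_qid} wdt:P36 ?capital .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


def request_url(query: str) -> str:
    """Encode the query into a GET URL that asks for JSON results."""
    return WIKIDATA_SPARQL_URL + "?" + urlencode({"query": query, "format": "json"})


url = request_url(capital_query("Q142"))  # Q142 is the Wikidata item for France
print(url)

# The query survives URL encoding unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```

The sketch deliberately stops short of sending the request, so it runs without network access; fetching `url` with any HTTP client would return the bindings for `?capital`.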
As mentioned above, DeepPavlov Library will go through a refactoring process. In the end, a number of rarely used ML models will be moved to the Deprecated list, and moving forward won’t be supported by DeepPavlov product team. However, we welcome our community to pick them up if there is an interest to support them. A separate blog post will be published with the list, and we’ll gather feedback before announcing our final decision on the refactoring.
2. Dev Tools
Dev Tools will initially ship as part of the larger DeepPavlov Library. They will include all of the tools we built during Alexa Prize 2019, and will aid you in building multiskill AI Assistants using the DeepPavlov Conversational AI stack. These tools will initially ship as Beta, and you’ll be able to provide your feedback to us through our Forum, Telegram Group, as well as our new regular Community Calls.
Oh, and we’re happy to say that we’re hiring to help us in building DeepPavlov Dev Tools!
2.1. New Dev Tools
Intent Editor — developed to further simplify adding new intents and managing the existing ones for our upcoming Intent Catcher component in the DeepPavlov Library, it will be our first GUI dev tool. We will write a separate blog post explaining the ideas behind it, and we’ll be happy to hear your feedback on it.
Dialog & Session Analytics Dashboard — originally built as a quick tool to aid our Alexa Prize 2019 DREAM socialbot team in analyzing dialogues and overall trends, this tool was significantly reworked and enhanced over the summer to enable monitoring of our production Dream Demo. We are looking for an opportunity to share this tool with you, hopefully in Q4 of 2020.
Go-Bot — our framework for building goal-oriented skills. In Q4 it will get a proper integration with the public version of our DeepPavlov Dream-based AI Assistant currently available at our demo website and via Telegram. This integration will allow you to create new goal-oriented skills and easily integrate them with the rest of the platform. We will also continue implementing support for RASA DSLs (v2 format), allowing you to seamlessly transition your RASA bots to the Go-Bot framework. This support will be extended from dataset generation based on stories.md, nlu.md, and domain.yml, originally shipped earlier this year, to the use of nlu.md for intent and slot filler training, as well as basic form-filling. We will also add preliminary support for custom actions.
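To illustrate the kind of data an nlu.md file carries, here is a minimal sketch that reads Rasa-style Markdown NLU data, where `## intent:<name>` headers introduce intents and `[value](entity)` marks a slot value inside an example. This is not Go-Bot’s parser; the function `parse_nlu_md` and the sample data are hypothetical, and the sketch ignores everything except intents and inline entity annotations.

```python
import re
from typing import Dict, List

# Sample data in the Rasa Markdown NLU layout (made up for this example).
NLU_MD = """\
## intent:greet
- hello
- hi there

## intent:book_table
- book a table for [two](people) in [Berlin](city)
"""

# Matches inline annotations of the form [value](entity_name).
ENTITY = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_nlu_md(text: str) -> Dict[str, List[dict]]:
    """Parse Markdown NLU data into {intent: [{"text": ..., "slots": ...}]}."""
    intents: Dict[str, List[dict]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## intent:"):
            current = line[len("## intent:"):].strip()
            intents[current] = []
        elif line.startswith("## "):
            current = None  # other sections (synonyms, regexes) are skipped
        elif line.startswith("- ") and current is not None:
            raw = line[2:]
            # Collect slot values, then strip the markup from the example text.
            slots = {name: value for value, name in ENTITY.findall(raw)}
            plain = ENTITY.sub(r"\1", raw)
            intents[current].append({"text": plain, "slots": slots})
    return intents


parsed = parse_nlu_md(NLU_MD)
for intent, examples in parsed.items():
    print(intent, examples)
```

The plain texts grouped by intent are what an intent classifier would train on, while the extracted slot values are the raw material for slot filler training.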
While we don’t specifically focus on building new skills in Q4, our work on Go-Bot and KBQA will lay down a foundation for rebuilding our existing goal-oriented skills powered by Go-Bot framework and KBQA. In Q4 we will begin a process of transitioning existing goal-oriented skills to this framework. Exact skills are TBD.
3.1. New Annotators
Entity Linking — a component that was originally shipped inside KBQA in the DeepPavlov Library will now be properly integrated into our DeepPavlov Dream AI Assistant Demo. It will annotate every user utterance with links to the detected (and disambiguated) entities from Wikidata. This will lay the ground for building the next generation of Knowledge Graph-driven goal-oriented skills.
3.2. Skill Improvements
Factoid QA — our skill that enables users to ask factoid questions in natural language and get responses based on Wikidata. It will get minor updates to catch up with our latest changes to KBQA, Entity Linking, and Wikidata Parser.
Book Skill — our scenario-driven skill focused on supporting lightweight conversations about books. It will be updated to re-use the Wikidata Parser and Entity Linking components, and will be a poster child for our Knowledge Graph-driven goal-oriented skills.
3.3. Demo Website
Share Dialog — this is a new experience that will allow users to share conversations they’ve had with DeepPavlov Dream AI Assistant Demo with others. In addition to that it will provide visualization of different components within the system.
Widget for DeepPavlov.ai — it will enable our website visitors to chat with the DeepPavlov Dream AI Assistant demo without needing to visit our demo website.
While our long-term goal is to open source our DeepPavlov Dream AI Assistant, in the meantime we will continue building and open sourcing smaller demos of the multiskill AI assistants. These examples will initially focus on three areas:
- Multiskill AI Assistant featuring an AIML-based chit-chat and handcrafted goal-oriented skills
- Demo of Multiskill AI Assistant with goal-oriented skill rewritten using our Go-Bot framework
- Demo of Multiskill AI Assistant extended with the Factoid QA skill
Deepy 3000 — a demo of a multiskill AI Assistant originally shown at our NVIDIA GTC Fall 2020 talk will get an accompanying blogpost with detailed explanation. Its bits are already available.
Deepy 3000 Advanced Demo — an updated version of Deepy 3000, with the Harvesters Maintenance Skill re-written using our Go-Bot framework, will also become available in our repository, and a separate blogpost will be published alongside it.
Deepy 3000 Factoid Demo — another updated version of Deepy 3000 will be extended with a Factoid QA Skill from the main demo. By popular demand, we will ship a simple version of a factoid skill to show how one can use our KBQA component from the DeepPavlov Library within a multiskill AI Assistant.
Configuration Simplification — In Q4, we plan to simplify the process of defining configuration of our Conversational AI Orchestrator, DeepPavlov Agent. Our goal is to provide a number of default configurations, and to make a separate, smaller and simpler configuration file, to aid in defining and controlling the pipeline. For instance, you won’t have to define intricate details of how default components should be connected to the DeepPavlov Agent. Instead, you’ll have a smaller list of the default component names, and by adding or removing their names you’ll tell the system which of the default components it should actually run. This is a work in progress area, and we hope to show first results of these experiments by the end of Q4.
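The idea of selecting default components by name can be sketched as follows. Since this area is still a work in progress, nothing here reflects the eventual DeepPavlov Agent configuration format: the registry `DEFAULT_COMPONENTS`, its fields, the URLs, and the function `expand_pipeline` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical registry of default components; names, fields and URLs are
# illustrative only and do not mirror the real DeepPavlov Agent config.
DEFAULT_COMPONENTS: Dict[str, dict] = {
    "ner": {"kind": "annotator", "url": "http://ner:8000/model"},
    "entity_linking": {"kind": "annotator", "url": "http://entity-linking:8000/model"},
    "chitchat": {"kind": "skill", "url": "http://chitchat:8000/respond"},
    "factoid_qa": {"kind": "skill", "url": "http://factoid-qa:8000/respond"},
}


def expand_pipeline(enabled: List[str]) -> Dict[str, dict]:
    """Turn a short list of component names into a full pipeline config."""
    unknown = [name for name in enabled if name not in DEFAULT_COMPONENTS]
    if unknown:
        raise KeyError(f"unknown default components: {unknown}")
    # Copy each entry so callers can tweak the result without touching defaults.
    return {name: dict(DEFAULT_COMPONENTS[name]) for name in enabled}


# The "smaller and simpler" configuration is then just a list of names:
# adding or removing a name switches the corresponding component on or off.
pipeline = expand_pipeline(["ner", "chitchat"])
for name, spec in pipeline.items():
    print(name, spec)
```

The design choice this illustrates is that the wiring details live once in the defaults, while the per-assistant file only states which components should run.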
AWS Deployment — while our initial DeepPavlov DREAM Socialbot worked in AWS, the newer one currently available at demo.deeppavlov.ai is deployed on-premises. We will write examples and How-Do-I docs that explain how to deploy DeepPavlov Dream-based Multiskill AI Assistants in the AWS cloud.
2021 Roadmap Priorities
While we are still finalizing our 2021 Roadmap, we would like to use this post as a chance to share our thoughts about the overall direction.
We will continue migrating custom components built during the Alexa Prize 2019 Challenge into the Library. We will also continue our work on supporting development of Knowledge Graph-driven skills, as well as adding support for custom knowledge graphs.
Our focus for 2021 will be to add new Debugging tools to aid in developing and debugging multiskill AI assistants built using the DeepPavlov conversational technology stack.
We will also continue investments into our Go-Bot framework for building goal-oriented skills:
- Support for our Intent Catcher to minimize efforts required to define your intents
- Basic set of types for a slot filler using DeepPavlov Library’s NER component
- Extended set of Wikidata-based types for a slot filler using our Entity Linking component
- Multi-intent understanding
- Oh, and with our work on Knowledge Graphs, this is just the beginning, so stay tuned!
Another area for investment is designing and developing a mechanism to share, publish, and distribute skills built for DeepPavlov Dream.
Our primary goal is to open-source our DeepPavlov Dream AI Assistant demo. As the process of replacing legacy goal-oriented skills with the Go-Bot-based ones is currently underway, the final list of the components will be decided upon in 2021.
In addition to open-sourcing Dream, we will focus on supporting proper multi-intent understanding within Dream. This is also a work-in-progress area, and we’ll be happy to share more details in the coming months.
We will focus on enabling basic support for integration with third-party systems, like IVRs, IM channels, and so on. Our intention is to facilitate support for the existing systems (e.g., UIB); we will also release an example of the routing gateway we’ve used to add support for the Telegram channel in the DeepPavlov Dream AI Assistant demo.
We will also provide detailed information about building a CI/CD process around DeepPavlov Agent and Dream.
Our Dream is to make AI assistants that improve the life of every human.
Imagine AI assistants being capable of understanding us, and talking to us. Imagine them learning from us and teaching us. Imagine them being our trusted assistants. Imagine them doing everything we want. Imagine them empowering us to develop personally.
Our position as an R&D organization allows us to balance between doing fundamental research and transferring research technology to our growing product group.
2020 is a turning point for DeepPavlov.ai. After going through a massive re-organization announced back in February, we’ve devoted a lot of time to building our open-source Conversational AI technology stack. Participation in Alexa Prize 2019 allowed us to build our first working DeepPavlov-driven socialbot, and then deliver a demo of it to the general public. Our work on Go-Bot and support for Knowledge Graphs is coming to fruition, allowing us to build an even stronger foundation for 2021.
We are super excited to share all we’ve done in 2020 Q1-Q3, and we can’t wait to deliver on our product roadmap in 2020 Q4 and beyond!
Join us on our journey to make our Dream happen: AI assistants that improve the life of every human!