The Conversational IDE

An Integrated Development Environment for Building Human-Computer Conversations

PullString
Feb 23, 2017 · 8 min read

By Martin Reddy

Software developers are accustomed to integrated development environments (IDEs) — such as Xcode, Eclipse, or Visual Studio — to develop applications. These tools let you navigate entire software projects visually, support convenient editing of code, manage associated media assets, integrate with build systems, and offer built-in debugging capabilities.

There are also specialized IDEs for certain fields, such as Unity, Unreal, or Blender for game and VR development. At PullString, we think a similar capability is equally important for the evolution of the field of conversational AI: to make it easier to create, debug, and maintain the combination of code and content that’s needed to make conversational agents, or chatbots. In other words, we believe the field needs a Conversational IDE.

The Need For a Conversational IDE

There are several reasons why we feel that a conversational IDE is important for managing the content used to build human-computer conversations:

  1. Easily manage all assets for a project. Content management at scale is hard. We’ve even heard from other developers that maintaining a nontrivial chatbot can be more difficult than managing a large enterprise code base. Take the Hello Barbie experience we developed with Mattel as an example: this project contains 20,000 intent rules, 8,000 lines of dialog, 3,500 synonyms, 50 recording sessions with the voice talent, and dozens of interrelated topics. All of this content evolved over time, based on internal reviews and testing, with several team members working together at the same time. Maintaining all of this content by hand would have been a very complicated and error-prone task.
  2. Enable teams of technical and creative users. IDEs like Unity allow not only multiple developers to work on a project, but also artists, animators, and other creative users. In a similar fashion, crafting the personality, tone, and voice for a great chatbot requires the hand of a talented creative writer. A good conversational IDE will let developers write the underlying logic for a chatbot, but also make it possible for creative users to contribute their skills to the project as well. This will be critical for the development of engaging experiences as the field of conversational AI grows.
  3. Visualize the conversation flow. You should be able to visualize your conversational content to better understand the branching flows, local contexts, and interconnection of topics. This can be done using visual tools like graph editors or hierarchy browsers. Even for projects that use machine learning to evolve the experience over time, there’s still an overarching structure to maintain. For example, most machine learning efforts today focus on intent recognition, but the intents themselves must still be created, the contexts in which they are valid defined, and the actions that result when they match specified. All these attributes can be presented usefully and visually in a good hierarchy browser.
  4. Test and debug your conversations. It’s important to be able to debug your content before releasing it to users, just as you would debug your code before shipping it. You want to be able to trigger any part of a conversation, understand why any given response was output, inspect the values of state variables such as entities, and even change those variables to simulate different paths through the experience. You also want to be able to create unit tests for your content that can be executed automatically to ensure that recent updates have not changed the behavior of the content unexpectedly.
  5. Decouple content authoring from deployment. An IDE that can represent and edit generic conversational elements will let you focus on the content creation task without having to worry about which platform you are deploying to. The analogy to programming language IDEs is that you can use the same tool to edit and debug C++, Objective-C, or Java code. It’s just the deployment phase that will target a different compiler. Separating authoring from deployment for a conversational IDE means that you could direct the same content toward different platforms with minimal changes. It also encourages the surfacing of reusable design patterns for representing common conversational structures.
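To make points 3 and 4 concrete, here is a minimal sketch of how intents, the contexts in which they are valid, and the resulting actions might be modeled, together with a transcript-style unit test for that content. It does not show how PullString Author represents content; `Intent`, `Bot`, `build_bot`, and `run_transcript` are hypothetical names, and simple keyword matching stands in for real intent recognition.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    """One thing the bot can understand (hypothetical model, not Author's)."""
    name: str
    keywords: set[str]    # words that trigger the intent (stand-in for real matching)
    contexts: set[str]    # contexts in which this intent is allowed to match
    response: str         # line of dialog to output when it matches
    next_context: str     # context the conversation moves to afterwards


@dataclass
class Bot:
    intents: list[Intent]
    context: str = "start"
    state: dict = field(default_factory=dict)

    def reply(self, text: str) -> str:
        words = set(re.findall(r"[a-z']+", text.lower()))
        for intent in self.intents:
            # An intent only matches if it is valid in the current context.
            if self.context in intent.contexts and words & intent.keywords:
                self.context = intent.next_context
                self.state["last_intent"] = intent.name
                return intent.response
        return "Sorry, I didn't catch that."


def build_bot() -> Bot:
    """Return a fresh bot so every test starts from the same state."""
    return Bot(intents=[
        Intent("greet", {"hi", "hello"}, {"start"},
               "Hi! Want to hear a joke?", "offer_joke"),
        Intent("accept", {"yes", "sure"}, {"offer_joke"},
               "Why did the chatbot cross the road?", "start"),
        Intent("decline", {"no", "nope"}, {"offer_joke"},
               "No problem.", "start"),
    ])


def run_transcript(bot: Bot, turns):
    """Replay (user input, expected reply) pairs and collect any mismatches."""
    failures = []
    for user_text, expected in turns:
        actual = bot.reply(user_text)
        if actual != expected:
            failures.append((user_text, expected, actual))
    return failures


TRANSCRIPT = [
    ("Hello!", "Hi! Want to hear a joke?"),
    ("Yes please", "Why did the chatbot cross the road?"),
]
failures = run_transcript(build_bot(), TRANSCRIPT)
print("failures:", failures)
```

Recorded transcripts like `TRANSCRIPT` can be rerun after every content change, so an edit that silently breaks an existing path shows up as a non-empty failure list.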

PullString Author as a Conversational IDE

The PullString Author IDE showing a Web Service implemented in Node.js

At PullString, we’ve invested more than 50 engineer years of effort to build just such a conversational IDE. We call this product PullString Author, and we’ve architected it to provide a general conversational editing tool for everyone in the field, one that can be useful to content creators and software developers alike. Some important aspects of Author that we’ve focused on to make it a general development environment are as follows:

  • Text and Voice. Author was designed to handle any kind of conversational interaction, be it text-based like Facebook Messenger, audio-based like Amazon Alexa, or graphics-based like a Unity VR game. With the Professional Edition, we’ve built out a robust audio pipeline to let you record audio clips or generate synthetic speech for your bot’s dialog, and you can also associate your lines of dialog with character animations in an external game or VR system.
  • Conversation Visualization. You can view all your dialog content in an optimized hierarchical browser. Author lets you see your entire conversation, in context, with support for dialog management features such as interjections, segues, and time-based responses. It’s worth noting that there are several popular intent engines out there, like API.ai, LUIS, Wit.ai, or Lex, and these show you your content as a list of intents and entities. But if you’re trying to create a truly conversational experience you want to see your content from the point of view of a hierarchical conversation flow, not a flat list of all the things your bot can understand.
  • General Inputs and Outputs. Author was designed on top of general principles that should allow it to map to many different ways of representing a synthetic conversation. It supports general input events, beyond just user inputs; arbitrary behaviors that can be triggered, beyond just lines of dialog; an extensive concept of conditions to tailor the output to the user; the ability to plug arbitrary code into the experience to implement custom logic; and a way to represent user inputs that supports both rule-based AI and machine-learned intents.
  • Optimized for your Operating System. Like most other IDEs out there, PullString Author is a desktop app (available for Mac OS X and Windows). This lets us provide an optimized experience for your native platform that can handle very large project sizes. It also means that you can use Author while offline and you have access to all your data right there on your computer, not locked into some website’s cloud service.
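As an illustration of the "General Inputs and Outputs" idea above, the following sketch puts a rule-based recognizer and a stand-in for a machine-learned one behind the same interface, and then uses conditions on conversation state to tailor the output. Everything here is an assumption made for illustration: `IntentRecognizer`, `RuleRecognizer`, `ScoredRecognizer`, `Response`, and `respond` are hypothetical names, and `fake_model` is a toy function, not a trained model or any real intent engine.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class IntentRecognizer(Protocol):
    """Anything that maps user text to an intent name (or None)."""
    def recognize(self, text: str) -> Optional[str]: ...


class RuleRecognizer:
    """Rule-based matching: the first regex pattern that matches wins."""
    def __init__(self, rules: dict[str, str]):
        self._rules = [(name, re.compile(pattern, re.IGNORECASE))
                       for name, pattern in rules.items()]

    def recognize(self, text: str) -> Optional[str]:
        for name, pattern in self._rules:
            if pattern.search(text):
                return name
        return None


class ScoredRecognizer:
    """Wraps any classifier returning (intent, confidence), e.g. a learned model."""
    def __init__(self, classify: Callable[[str], tuple[str, float]],
                 threshold: float = 0.5):
        self._classify = classify
        self._threshold = threshold

    def recognize(self, text: str) -> Optional[str]:
        intent, confidence = self._classify(text)
        return intent if confidence >= self._threshold else None


@dataclass
class Response:
    line: str
    # Condition on conversation state; by default the response always applies.
    condition: Callable[[dict], bool] = lambda state: True


def respond(recognizer: IntentRecognizer, responses: dict[str, list[Response]],
            text: str, state: dict) -> Optional[str]:
    """Pick the first response for the recognized intent whose condition holds."""
    intent = recognizer.recognize(text)
    for response in responses.get(intent, []):
        if response.condition(state):
            return response.line
    return None


RESPONSES = {
    "greeting": [
        Response("Welcome back!", lambda state: state.get("visits", 0) > 0),
        Response("Nice to meet you!"),
    ],
}


def fake_model(text: str) -> tuple[str, float]:
    """Toy stand-in for a learned classifier (assumption, not a real model)."""
    return ("greeting", 0.9) if "hey" in text.lower() else ("greeting", 0.1)


rules = RuleRecognizer({"greeting": r"\b(hi|hello)\b"})
learned = ScoredRecognizer(fake_model)
print(respond(rules, RESPONSES, "Hello there", {}))
print(respond(learned, RESPONSES, "hey", {"visits": 3}))
```

Because `respond` depends only on the `recognize` method, the dialog content and its conditions stay the same whichever recognizer is plugged in.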

Supporting Creative Users

We originally conceived of PullString Author as a tool to let non-technical users craft complex computer conversations. In fact, it’s possible to design, write, and deploy a large chatbot without writing a single line of code. Building a tool like this effectively requires a keen understanding of the workflows and needs of creative users. As such, Author offers the following functionality to empower these users:

  • Production Pipeline. Writing dialog for a chatbot can involve multiple iterations of feedback in order to create brand-appropriate in-character responses. PullString allows lines of dialog to be reviewed and approved, to have arbitrary notes applied to capture feedback, to have summary reports created, and to put in-progress content on hold until it’s ready for prime time.
  • Audio Support. Creating professional-grade audio-based experiences requires being able to integrate with voice talent recording setups, track recording sessions and know which lines have been recorded, batch import audio from the editing process, and conform dialog to the final audio. PullString provides all of these capabilities, whilst also making it easy to quickly record scratch audio from your computer’s microphone.
  • Team Working. The Professional Edition of PullString stores all content in the cloud with content edit locking and one-click syncing of all changes to the server. This lets teams of creative users work together on different parts of a single project with the confidence that they won’t overwrite or lose each other’s changes.
  • Access to Developer-Only Services. Author integrates with several commercial intent engines, such as API.ai, LUIS, and Wit.ai. These services provide simple web-based interfaces to train intents, but in order to use the resulting models you have to be a programmer. However, by integrating these services with the PullString IDE, it’s now possible for non-technical users to leverage the power of these intent recognition systems without having to know how to code.

PullString Author showing some of the features for building audio experiences

Supporting Software Developers

As PullString Author has evolved, we have kept a continued focus on providing a powerful tool for non-technical users. However, we’ve also added a lot of support for software developers to contribute to these conversational projects. Some of the specific features we’ve added to support developers include:

  • Embed Code in Content. You can add Node.js code at any point in a conversational flow to perform custom logic that can read and write the underlying entity state. For example, this lets you interface with a remote database or API so that a bot’s responses can be generated on the fly. And of course you can specify your preferred source code editor, whether it be vim or emacs. (Sublime works too.)
  • Scripting API. You can write Python scripts to automate the querying and modification of content within PullString Author. For example, you can write code to import data from your own sources, or you can write exporters to output the content to your own preferred file format, be it AIML, Botscript, or something else.
  • Integrated Debugger. Author has a built-in chat debugger so you can confirm that your content behaves as expected. You can drag and drop content into the debugger, access the current entity state and also modify that state. The debugger also lets you simulate different platforms and specify the date/time to debug date-specific content. We also show any errors in your code when it runs and surface any log output from your scripts right there in the chat debugger.
  • Web API & SDKs. You can publish your content to PullString’s cloud infrastructure and take advantage of our Web API to plug conversational capabilities into your apps, websites, or IoT devices. It’s as easy as sending the user input as text or audio and getting back a list of dialog responses or actions for you to execute. We also provide several SDKs to make it even easier to access the Web API directly from environments like Python, iOS, Android, JavaScript/Node.js, and Unity.

A fully-functional chatbot in a dozen or so lines of code using the PullString Python SDK
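The request/response shape described in the Web API bullet (send the user input, get back a list of dialog responses or actions to execute) can be sketched as follows. This is not the PullString SDK or Web API: `Output`, `FakeConversation`, and `handle_turn` are hypothetical names, and the in-memory class stands in for a hosted conversation service so that the example runs without a network connection or API key.

```python
from dataclasses import dataclass


@dataclass
class Output:
    kind: str     # "dialog" (a line to say) or "action" (something to execute)
    value: str


class FakeConversation:
    """In-memory stand-in for a hosted conversation service (assumption)."""

    def __init__(self):
        self.state = {}   # entity state that embedded logic could read and write

    def send_text(self, text: str) -> list[Output]:
        if "lights" in text.lower():
            self.state["lights"] = "on"
            return [Output("dialog", "Turning the lights on."),
                    Output("action", "lights_on")]
        return [Output("dialog", "You said: " + text)]


def handle_turn(conversation, text, speak, perform):
    """Send one user input and dispatch each returned output to the client."""
    for output in conversation.send_text(text):
        if output.kind == "dialog":
            speak(output.value)      # e.g. print it or synthesize speech
        elif output.kind == "action":
            perform(output.value)    # e.g. call into the app or device


spoken, performed = [], []
conversation = FakeConversation()
handle_turn(conversation, "Please turn on the lights", spoken.append, performed.append)
handle_turn(conversation, "Thanks", spoken.append, performed.append)
print(spoken)
print(performed)
```

Keeping `speak` and `perform` as callbacks means the same turn-handling loop could serve a text client, a voice device, or a game, with only the callbacks changing.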

Toward the Future

It’s worth noting that the maturity of many fields is reflected in the quality of the tools that have evolved to support those fields. Developing computer games 20 years ago meant writing your own OpenGL or DirectX code. Then full-featured C/C++ rendering engines, like Unreal and Quake, started to appear to make that process easier. Today we have powerful IDEs that support creative and technical teams working together on large gaming titles.

Building deep, rich conversational experiences requires multiple people writing, maintaining, and testing lots of content and code together. This content management problem is a significant one in and of itself, so we believe our community needs more powerful and flexible tools to help us all improve the fidelity of human-computer conversations. As part of this, we’re committed to continuing to work hard to make PullString Author the most powerful conversational IDE we can, because we believe this is a critical need for the field of AI.

Check out our ebook “Create Convincing Computer Conversations,” which includes a thorough walkthrough of chatbot character development, writing and scripting advice, and writing prompts to sharpen your skills.


Originally published at www.pullstring.com.

