How we designed a digital assistant for the business environment of Porsche

Next Visions
#NextLevelGermanEngineering
Aug 19, 2018

Do you remember “Clippy” from the days of Microsoft Word 2000? Although it tried to be helpful, this assistant was rather annoying, and users would disable it as soon as possible. Fast forward a few years and multiple digital assistants are available on the market and used in everyday life. The most prominent ones are Siri, Alexa, Google Assistant and Cortana. What led users to abandon “Clippy” but embrace the new assistants? And even more important: how do you design a digital assistant that is both pleasant and useful for its users?

To answer these questions, we need to start by asking ourselves: what is a digital assistant? Similar to a human assistant, a digital assistant is an agent whose goal is to help you complete your tasks, ranging from writing your emails and playing your favourite song to creating shopping lists. It does this either by supporting you in reaching these goals or by completing the tasks for you autonomously. Digital assistants support different modes of interaction, such as chat or voice, and they need to recognize the user’s intention in order to trigger the appropriate action.

What does the service desk of the future look like?

Currently, all these assistants target your personal life (e.g. dictating text, setting reminders, calling people from your contact list), but none of them has made it into the business environment yet. To address this, we have created our own digital assistant with a clearly defined goal in mind: we want an assistant that aids the internal service desk of Porsche. Many of us can recall situations in which we needed support with an IT problem and called the respective hotline. Stuck in a waiting line, we spent precious time idling, only to be forwarded to an overburdened service agent. Now imagine the very same situation, but instead of being put in a queue, you are directly connected to a digital assistant which is able to recognize you, identify your problem, and solve the issue for you autonomously. The gains for all sides are easy to spot: as a customer, you get a fast solution to your IT-related problem and can quickly get back to the task at hand. As a service desk agent, you are no longer overburdened with repetitive requests that can be automated, letting you focus on meaningful tasks instead.

How do you build a digital assistant for business?

Figure 1 gives an overview of the components involved in this project and how they interact.

Since users want to be able to interact with the digital assistant via voice, an interface must be created. This interface will be the telephone, so all calls can be forwarded to a controller program. The controller is able to connect the user to a chatbot and the conversation between caller and digital assistant can begin. In the process of the conversation, different kinds of information are collected from the user until there is enough information available to trigger an action. The chatbot forwards this information back to the controller and the controller is then able to call the Action Service, providing the necessary information to trigger respective actions. The Action Service is connected to different backend applications resulting in different measures to solve the user’s issue at hand (e.g., resetting passwords, documenting tickets, notifying the Service Desk).
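The flow described above can be illustrated with a minimal sketch. It is not the actual implementation: all class and method names (`Chatbot`, `Controller`, `ActionService`, `on_caller_input`) and the two required pieces of information (`user_id`, `issue`) are assumptions made for illustration, and speech recognition is skipped entirely, so the caller's input arrives as already-extracted key/value pairs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ActionRequest:
    """What the chatbot hands back to the controller (hypothetical shape)."""
    action: str
    params: Dict[str, str]


class Chatbot:
    """Toy chatbot: collects information until an action can be requested."""

    REQUIRED = ("user_id", "issue")  # assumed for illustration only

    def __init__(self) -> None:
        self.collected: Dict[str, str] = {}

    def handle(self, key: str, value: str) -> Optional[ActionRequest]:
        self.collected[key] = value
        if all(name in self.collected for name in self.REQUIRED):
            return ActionRequest(
                action=self.collected["issue"],
                params={"user_id": self.collected["user_id"]},
            )
        return None  # not enough information yet, the conversation continues


class ActionService:
    """Maps action names to backend handlers (e.g. a password reset)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, str]], str]] = {}

    def register(self, action: str, handler: Callable[[Dict[str, str]], str]) -> None:
        self._handlers[action] = handler

    def trigger(self, request: ActionRequest) -> str:
        handler = self._handlers.get(request.action)
        if handler is None:
            raise KeyError(f"no backend registered for action {request.action!r}")
        return handler(request.params)


class Controller:
    """Sits between the telephone interface, the chatbot and the Action Service."""

    def __init__(self, chatbot: Chatbot, actions: ActionService) -> None:
        self.chatbot = chatbot
        self.actions = actions

    def on_caller_input(self, key: str, value: str) -> Optional[str]:
        request = self.chatbot.handle(key, value)
        if request is None:
            return None  # keep talking to the caller
        return self.actions.trigger(request)


if __name__ == "__main__":
    service = ActionService()
    service.register("reset_password", lambda p: f"password reset for {p['user_id']}")
    controller = Controller(Chatbot(), service)
    controller.on_caller_input("user_id", "u123")
    print(controller.on_caller_input("issue", "reset_password"))
```

Keeping the controller as the only component that talks to both the chatbot and the Action Service means that backends can be added or replaced without touching the conversation logic.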

The system features several unique aspects which we want to highlight. The first one is the speech interface, which we chose for several reasons. Natural speech is already used when calling the Service Desk, and we do not want users to have to learn a new tool. Instead, we simply integrate the digital assistant into existing processes, allowing users to continue using their telephone. Moreover, spoken language in combination with body language is the most natural way for humans to communicate: it is in our nature to convey messages with our voices and bodies. Voice interaction not only allows pleasant contact with the digital assistant, it also yields input that can be analyzed more accurately than, for example, chat messages or forms. Having this interface also allows for data collection, since all calls and their transcriptions could be recorded to enhance the language understanding model.

What is the major key to creating a successful digital assistant?

The spoken language has always spurred a lot of interest in art, philosophy and research: from creating poems, to analyzing grammar, we humans deal with the spoken word in different ways. Its fascination lies amongst other things in the fact that there is still no definite way to create a natural conversation. Over our lifetime, we simply learn to converse with others and follow general rules, but to this day there is still no program written on what these rules exactly are or how to learn these rules. This makes the design of a natural conversation between user and digital assistant so interesting.

Now the next question is: how do we create a chatbot that supports a natural conversation structure and also solves the user’s issue reliably? In contrast to other digital assistants, we do not want the user to utter commands to the chatbot. Rather, we want to extract information from the dialog between caller and digital assistant. This is the same way humans interact: we do not utter commands to each other, but converse and extract the necessary information from the conversation. To realize this within the digital assistant, we do not script a fixed conversation. Instead, we guide the user through different conversational paths, depending on the case at hand and the user’s input, using a mixture of finite state machines and decision trees. For each utterance, we not only recognize the context in which it was said, but also extract the information needed to trigger certain actions later on. To make the conversation more natural, our digital assistant is aware of its boundaries: if an utterance was not understood, it kindly asks the caller to repeat him- or herself.
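To make the combination of a finite state machine and decision trees more concrete, here is a minimal sketch. The states, prompts and keyword rules are hypothetical and far simpler than a real language understanding model: each state has a small decision tree that either extracts information and picks the next state, or reports that the utterance was not understood, in which case the assistant stays in the same state and asks the caller to repeat.

```python
from typing import Dict, List, Optional

# Hypothetical conversation states, chosen for illustration only.
ASK_PROBLEM = "ask_problem"
ASK_USERNAME = "ask_username"
CONFIRM = "confirm"
DONE = "done"

REPEAT_PROMPT = "Sorry, I did not understand that. Could you please repeat it?"

PROMPTS: Dict[str, str] = {
    ASK_PROBLEM: "How can I help you?",
    ASK_USERNAME: "What is your username?",
    CONFIRM: "Shall I go ahead?",
    DONE: "Done. Goodbye!",
}


def tokenize(utterance: str) -> List[str]:
    """Lowercase the utterance and strip simple punctuation from each word."""
    return [word.strip(".,!?") for word in utterance.lower().split()]


def decide(state: str, utterance: str, slots: Dict[str, str]) -> Optional[str]:
    """Per-state decision tree: return the next state, or None if not understood."""
    words = tokenize(utterance)
    if state == ASK_PROBLEM:
        if "password" in words:
            slots["issue"] = "reset_password"
            return ASK_USERNAME
        if "ticket" in words:
            slots["issue"] = "document_ticket"
            return ASK_USERNAME
        return None
    if state == ASK_USERNAME:
        # Naive extraction: take the word that follows "is".
        if "is" in words and words.index("is") + 1 < len(words):
            slots["user"] = words[words.index("is") + 1]
            return CONFIRM
        return None
    if state == CONFIRM:
        if "yes" in words:
            return DONE
        if "no" in words:
            slots.clear()  # start over with a clean slate
            return ASK_PROBLEM
        return None
    return None


class Dialog:
    """Finite state machine that walks the caller through the conversation."""

    def __init__(self) -> None:
        self.state = ASK_PROBLEM
        self.slots: Dict[str, str] = {}

    def step(self, utterance: str) -> str:
        next_state = decide(self.state, utterance, self.slots)
        if next_state is None:
            return REPEAT_PROMPT  # stay in the current state
        self.state = next_state
        return PROMPTS[next_state]
```

In this sketch, a caller who says "I forgot my password", then "My username is jdoe." and finally "Yes, please" ends in the `DONE` state with both pieces of information collected, while an unrelated utterance leaves the state unchanged.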

Yet there are more reasons that led to the decision to build the chatbot ourselves: since we have a clearly defined problem domain and a specific goal in mind, we need the conversation to support Porsche-specific phrases. Moreover, the context and terms used in this company differ from the same terms used in other environments. Take the word “bank” as an example: it can refer to the financial institution, but also to the building in which this institution offers its services. It can be used as a verb, and combined with the word “river” it has a completely different meaning. Just from these few examples of a single word, it is clear that supporting a custom company language is essential. This led us to the decision to create our own context identification algorithm, which not only allowed us to gain the necessary knowledge in the process, but also to find alternative ways to solve our issues. In addition, we were able to find and fix bugs, since we could always see what was going on instead of relying on a black box solution.

Since the digital assistant is supposed to help humans in their everyday tasks, human factors in the use of this technology must not be neglected: new users will only adopt the technology if they trust our prototype, and their experiences with other assistant systems may not have been pleasant so far. Think, for example, of Clippy or your last experience with an automated chatbot service. We tackled this issue by providing different paths through the conversation, but also through transparency: the user is able to perceive the technology’s boundaries, and we allow the user to reset every action with a keyword.

The most important lessons learned from our approach

During this project, we have learned a lot, and there are quite a few lessons we want to share with you. Human conversation is a very complex topic, and even within a clearly defined scope it took many iterations to make the conversation feel natural and reach its goal. To give some simple examples: during a regular conversation we might remain silent instead of answering, change our mind mid-utterance, or even forget what we wanted to say and eventually start the conversation anew. Covering all these cases and enabling a computer to deal with them is difficult. We ought never to forget that technology is supposed to support us, so the human must never be taken out of the equation. Should the digital assistant run into an unexpected error or simply not be able to solve the problem, the caller will be forwarded to a human agent. Designing the digital assistant involves not only the architecture, but also the cosmetics of the interfaces: speech synthesis needs to be properly adjusted with regard to tonality and speaking speed.
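The fallback to a human agent can be sketched as follows. This is an illustration under assumptions, not the project's code: the limit of three attempts, the `HANDOVER` marker and the keyword-based `toy_understand` function are all hypothetical. Silence is modelled as `None` or an empty string, and any unexpected error leads to an immediate handover.

```python
from itertools import islice
from typing import Callable, Iterable, Optional

HANDOVER = "forward_to_human_agent"  # hypothetical marker for the handover


def toy_understand(utterance: str) -> Optional[str]:
    """Stand-in for language understanding: returns an intent or None."""
    return "reset_password" if "password" in utterance.lower() else None


def resolve(
    utterances: Iterable[Optional[str]],
    understand: Callable[[str], Optional[str]] = toy_understand,
    max_attempts: int = 3,  # assumed limit, not taken from the article
) -> str:
    """Return the recognized intent, or HANDOVER if the assistant cannot help."""
    for utterance in islice(utterances, max_attempts):
        if utterance is None or not utterance.strip():
            continue  # the caller stayed silent, this still counts as an attempt
        try:
            intent = understand(utterance)
        except Exception:
            return HANDOVER  # unexpected error: do not leave the caller stranded
        if intent is not None:
            return intent
    return HANDOVER  # nothing understood within the allowed attempts


if __name__ == "__main__":
    print(resolve([None, "uh, never mind", "my password expired"]))
    print(resolve(["", "hmm", "well"]))
```

The first call recognizes the intent on the third attempt, while the second one exhausts its attempts and hands the caller over to a human agent.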

Utilizing AI makes our digital assistant a data-driven project leading to the more obvious lessons from such projects: data preparation, filtering, cleaning and pre-processing make up a big part of the effort needed to drive success in this project. Based on the data available, this may lead to some not-so-obvious issues: biases in your data lead to biases in your decisions. This can be something intentional, but in our case we had to re-evaluate our model and modify our data pipeline to face this challenge.

In addition to all these implementation steps, it is crucial to integrate the assistant into the infrastructure and processes of our company. From the very beginning, our goal was to seamlessly connect the digital assistant to the environment it would later be used in. This spans from the voice interface for telephones to the identification of callers in the conversation and a component that triggers actions after the conversation. This is the reason why a digital assistant is much more than a simple chatbot: value is not generated simply by talking, but by taking action.

All in all, we successfully implemented a digital assistant for the business environment which supports natural speech. The assistant is able to recognize the user, identify his or her problem and solve it for him or her autonomously. This is all done in a way that the digital assistant can be integrated into existing processes and supports a natural conversation as well as company-specific language.

Nikolaj Waller

Nikolaj Waller is a Lab Scientist in the Porsche Digital Lab.

Thank you very much to the team: Ingo Brenckmann, Askin Askin, Alissa Wilms, Cedric Wilting, Munodiwa Bore and Sergio Wölbl.



There’s more to Porsche than sports cars // #NextVisions is a platform about smart technologies and the people that drive our digital journey.