Understanding Conversational User Interfaces - Part 1

Looking to the past, present, and future.

Schibsted Futures Lab
9 min read · Sep 5, 2023

Over the last year Schibsted Futures Lab has delved into the transformative potential of generative AI and natural language processing. From creating a synthetic weather forecaster, to building tools that assist our editorial staff, we’ve seen firsthand how these technologies are revolutionising the world of media production. Their impact extends far beyond content creation, however.

Our hands-on experimentation has shown that these technologies are poised to fundamentally reshape how we interact with our devices. Imagine user interfaces that are conversational, personalised, and context-aware — that’s the world that CUIs could usher in, making our digital interactions far more intuitive and inclusive in the process.

Our weather avatar is based on a volumetric capture of one of our team members and was created using Unreal Engine's Metahuman Creator, with facial animation generated automatically from audio.

Our approach

Over the summer we’ve been studying CUIs in different timelines — past, present, and future — to understand their strengths, weaknesses, and transformative potential. Beyond the practical, we’ve also delved into ethical implications using tools like the Tarot Cards of Tech to ask important questions like who gains power, who might be left out, and what societal changes might be triggered by broader adoption of these tools.

Evaluating technologies that are still emerging is a challenge, however. How do you evaluate and design for technology that may only be fully realised months or even years down the road? Our two-pronged approach of strategic foresight and technical prototyping helps us navigate this uncertainty. We use scenario-building to envision potential futures while prototypes show us how we can actively shape these futures. For an added layer of understanding, and to ground our experimentation, we also turn to science fiction.

CUIs have been a recurring element in sci-fi pop culture since the first android appeared on screen in the 1927 film Metropolis. While on-screen examples are designed for performative and storytelling purposes, rather than real-world utility, these fictional examples still offer valuable insights. Using a framework adapted from the book Make It So: Interaction Design Lessons from Science Fiction, we’ve analysed a range of examples, looking at their components, capabilities, and interactions. Frameworks like Jakob Nielsen’s 10 Usability Heuristics and L.M. Sacasas’ 41 Questions Concerning Technology enrich our discussions and further inform our understanding of CUI usability and ethics.

The evolution of CUIs

Past

CUIs are by no means new. ELIZA, an early chatbot developed at MIT, caused a stir on campus in 1966 using simple pattern-matching to mimic human conversations. First-generation assistants like Siri (first released in 2011), Alexa and Google Assistant have added voice recognition, improved pattern matching using AI, and the ability to take limited actions on your behalf, but they still rely on scripting and manual mapping of user intent to actions.

Before Siri and Alexa, there was ELIZA.
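To make the contrast with today's systems concrete, here is a toy sketch of the kind of pattern matching ELIZA relied on: a short list of regular-expression rules, each mapped to a response template that reuses the captured text. The specific rules and phrasings are illustrative, not ELIZA's actual script.

```python
import re

# Each rule pairs a pattern with a response template; {0} is filled with
# the text captured by the pattern. Rules are tried in order.
RULES = [
    (re.compile(r"i am (.*)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"i feel (.*)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"my (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please, go on."

def respond(utterance: str) -> str:
    """Return a scripted response by trying each rule in order."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return FALLBACK

print(respond("I am worried about my exams"))
# Why do you say you are worried about my exams?
```

Every possible behaviour is hand-written in advance, which is exactly why these systems feel predictable and why mapping user intent to actions had to be done manually.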

For some applications, this programmed predictability is an asset. Repetitive inquiries, customer service interactions, and automated reminders benefit from the stability and consistency of pre-programmed responses. Until now, however, the rigid logic of first-generation CUIs has been a technical boundary, limiting their broader applicability and adaptability. Enter ChatGPT.

Present

The emergence of Large Language Models (LLMs) and tools like ChatGPT has radically expanded the capabilities of CUIs. These next-generation CUIs offer continuous learning, context awareness, and adaptability, allowing for truly open-ended, unscripted conversations. Together with user-friendly interfaces and low-code tools, these advanced features and capabilities are increasingly accessible to the wider public.

Recent enhancements like plugins and API-access are further broadening the capabilities of modern CUIs, connecting them to external data sources and services. Examples like ChatGPT plugins and AgentGPT exemplify this emerging future, and push modern CUIs one step closer to becoming truly agentive technologies — capable of interpreting our intentions, and increasingly, taking actions on our behalf.
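The pattern underlying these agentive systems can be sketched as a simple loop: the model either answers directly or emits a structured request for an external tool, whose result is fed back into the conversation before the final answer. Everything here is a hypothetical stand-in (the `fake_model` function and `weather` tool are invented for illustration), not any vendor's actual API.

```python
# Illustrative tool registry: name -> callable over keyword arguments.
TOOLS = {
    "weather": lambda city: f"12°C and cloudy in {city}",
}

def fake_model(messages):
    """Stand-in for an LLM: requests the weather tool once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "weather", "arguments": {"city": "Oslo"}}
    result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"answer": f"The forecast says: {result}"}

def run(user_query: str) -> str:
    messages = [{"role": "user", "content": user_query}]
    while True:
        reply = fake_model(messages)
        if "tool" in reply:
            # The model delegated: execute the tool and feed the result back.
            result = TOOLS[reply["tool"]](**reply["arguments"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply["answer"]

print(run("What's the weather in Oslo?"))
# The forecast says: 12°C and cloudy in Oslo
```

The important design point is that the model never acts directly on the world; the surrounding loop decides which tools exist and executes them, which is also where safety checks and human oversight can be inserted.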

These developments are coalescing around three core use cases: productivity, companionship, and entertainment. Changes to search engines like Bing and Google, along with assistive writing and coding tools like GitHub Copilot, are driving dramatic changes in workplaces, personal productivity, customer service, and education, with the potential to further upset business models in the media industry and beyond.

Of course, it didn’t take long for CUIs to enter our personal lives with their ability to mimic human relationships. Developing emotional connections with our technology isn’t a new pattern (remember that time people were marrying their holographic home assistants?), but CUIs take this to a whole new level. Snapchat sparked controversy by putting a hallucination-prone AI companion at the top of users’ friends list, and people who had formed intimate relationships with Replika AI’s virtual avatars were shocked when the service abruptly changed the personalities and responses of their companions.

The video game industry has an active stake in these developments, as highly personable and responsive avatars have immediate applicability as game characters. Tools like Unreal’s Metahumans have already made it easy and affordable to create lifelike human avatars — coupling these synthetic humans with next-generation CUIs will open up new avenues for interactive and open-world experiences. Nvidia has been leading the development of AI-enabled tools to help voice, animate, and breathe life into virtual characters and spaces.

Nvidia is leading the exploration of integrating GenAI tools into production flows for video games and spatial experiences.

Learnings from today’s landscape

The double-edged sword of generative AI

While first-generation chatbots strictly adhered to their programmed scripts, the generative AI models behind modern CUIs like ChatGPT introduce a new level of unpredictability. These models are essentially ‘prediction machines,’ generating responses based on observed patterns from their training data. As Inga Strümke notes, most of these models are trained on text pulled from the open internet (which is, let’s be honest, a fairly sensational place). As such, they have the baked-in goal of creating interesting and engaging sentences, which often results in unexpected and sometimes inappropriate responses.
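The 'prediction machine' idea can be made concrete with a toy bigram model: given the current word, it samples the next word from the frequencies observed in its (tiny) training text. Real LLMs condition on far longer contexts using neural networks, but the underlying principle of generating the next token from learned patterns is the same, and the example makes clear why the output mirrors whatever the training data looks like.

```python
import random

# Tiny 'training corpus' for the toy model.
corpus = "the cat sat on the mat and the cat slept".split()

# Count which words follow which; duplicates in the lists make
# random.choice a frequency-weighted sample.
counts: dict[str, list[str]] = {}
for prev, nxt in zip(corpus, corpus[1:]):
    counts.setdefault(prev, []).append(nxt)

def generate(start: str, length: int, seed: int = 0) -> str:
    """Generate text by repeatedly sampling an observed next word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        options = counts.get(words[-1])
        if not options:
            break  # no observed continuation for this word
        words.append(rng.choice(options))
    return " ".join(words)

print(generate("the", 5))
```

The model can only ever recombine what it has seen, so a corpus of sensational internet text yields sensational continuations; there is no notion of truth in the mechanism, only likelihood.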

This inherent bias in the training data can lead to unintended and sometimes risky conversations, such as Snapchat’s AI broaching sensitive subjects with young users, or Bing’s beta persona offering life-altering advice to journalists. As we give these tools increasing ability to act autonomously, it’s crucial to ensure that their behaviours align safely and effectively with human intent.

Usability

Traditional principles of usability developed for desktop and mobile computing require rethinking in the age of CUIs. Important aspects of a good user experience, like a clearly visible system status, intuitive command consequences, and low cognitive load, are complicated by conversational dynamics. While the future potential of these interfaces is clearly evident, defining new best practices for usability is a critical step in realising that potential. High user expectations — fuelled both by existing digital experiences and representations of CUIs in pop culture — complicate this further.

Managing expectations

Early attempts to overlay next-generation conversational interfaces on traditional digital platforms have revealed a gap between the CUI’s capabilities and user expectations. Companies like Mercari have launched CUIs to facilitate user interactions but often fall short in delivering a smooth experience. Issues range from poor comprehension of user queries to an inability to handle next-step tasks like generating shipping estimates, thereby making some CUIs less efficient than traditional graphical interfaces.

Reliability & Transparency

Steps are being taken to improve the reliability and agency of CUIs, including features that link to verified data sources or enable limited actions to be taken. However, the opacity of AI-driven conversational models creates new challenges. Unlike their first-generation counterparts, the inner workings of these models are not transparent. Mitigating measures, such as source-citation in Bing’s search engine, are important steps to increasing transparency and trustworthiness, but they also run the risk of building false confidence in AI-generated responses.

Future

Evaluating emerging technologies with uncertain futures demands open-ended imaginative exploration. Science fiction has long provided speculative visions that have influenced the design of real-world technologies. In the realm of CUIs, sci-fi films and TV serve as a rich source of inspiration, depicting both desirable and, more often than not, undesirable futures. These on-screen portrayals help us explore key characteristics and usability heuristics like agency, adaptability, presence, and personality. By studying these elements in fictional examples we can gain valuable insights into practical applications and challenges that we need to navigate when working with CUI technologies today.

‘Her’ (2013)

The film introduces us to Samantha, an emotionally intelligent operating system that adapts her language, tone, and personality based on her user’s familiarity and preferences. Examining the technology portrayed in the film can teach us a lot about designing audio-first, always-on personal assistants, and raises important questions about the trajectory of AI development. Samantha’s emotional intelligence makes her a compelling companion, but her independence raises concerns about the limits of AI agency and user control. The dilemmas put forward in Her highlight the importance of striking an appropriate balance between AI agency and user control for future CUI implementations.

Make It So: Interaction Design Lessons from Science Fiction, by Chris Noessel and Nathan Shedroff, as well as Noessel’s blog scifiinterfaces.com, were major influences on this work.

‘Star Trek: The Next Generation’ (1987)

Looking farther into the future, we see the same tension between control and agency. The Enterprise Computer exemplifies how limited AI agency and direct instruction can be applied to reduce unintended consequences and promote user accountability. User identity and access management become critical concerns when CUIs can be used to control complex and heavily armed spaceships. While often played for laughs, the Computer’s technical and emotionless interpretation of human commands also reveals the importance of context-aware and adaptable conversational interfaces.

‘2001: A Space Odyssey’ (1968)

HAL 9000, the AI-powered companion and assistant onboard the Discovery One, serves as a cautionary tale about what can go wrong when an AI’s goals aren’t aligned with human interests. HAL’s internal struggle between its base programming and mission-specific goals results in HAL perceiving its human crew as expendable obstacles to achieving its aims. This dystopic example emphasises the need for transparent decision-making processes, explainable technology, and human oversight.

‘Iron Man’ (2008) and the Marvel Cinematic Universe

Tony Stark’s AI assistant J.A.R.V.I.S. seamlessly integrates natural language processing and voice interfaces with predictive analytics, showcasing an effortless interaction that is finely tuned to Stark’s personal needs. This combination of capabilities enables Stark to simultaneously solve complex scientific problems, automate manufacturing, and pilot the Iron Man suit. However, when Peter Parker tries to use the system in “Spider-Man: Homecoming,” its highly complex and customised nature leads to funny yet dangerous misunderstandings. While J.A.R.V.I.S. effectively anticipates Stark’s needs, its struggles with a new user (i.e. Spider-Man/Peter Parker and “Instant Kill Mode”) reveal the limitations of overly personalised, highly technical and opaque systems. This highlights the challenges of designing CUIs that are accessible and transparent to various users and that clearly communicate the consequences of commands.

Key learnings from Sci-fi

Agency vs Control

Defining the right level of AI agency is essential for balancing user-friendly experiences with the risks of over-automation and misaligned goals. While high-agency and predictive action taking may increase convenience, without adequate human oversight this can also lead to unintended consequences.

Adaptive vs Standardised

There is a need to consider the trade-offs between personalised, adaptive CUIs and more predictable, standardised interfaces. For each application of a CUI we will need to consider how to meet unique user needs without sacrificing accessibility.

Alignment of Goals

Ensuring that any programmed biases, business objectives and end-user needs are aligned is critical if we want to avoid unintended consequences. Understandable behaviour and transparent, well-aligned objectives are vital for CUI development that respects user privacy, autonomy, and builds trust.

While science fiction often stretches the imagination, it also raises very real questions about misalignment, user expectations, and ethical considerations in working with AI that we need to answer here and now. Grappling with these important questions helps us aim for a future where CUIs offer enhanced user experiences, contribute to efficiency and create new value, all while maintaining ethical integrity.

What comes next

This first step into the realm of Conversational User Interfaces has illustrated some of the complexities involved with this rapidly evolving technology. By blending an analysis of current and past CUIs with a foray into the imaginative world of science fiction, we’ve begun to contextualise the signals and trends that will shape the digital conversations of tomorrow.

Futures Lab uses The Future Today Institute’s Seven-Step Forecasting Funnel to structure our strategic foresight work.

As we’ve teased out some future possibilities for CUI development, a set of critical uncertainties has emerged — uncertainties that we think will fundamentally influence the trajectory of conversational technologies. Our next step? To bring these abstract questions into tangible scenarios and speculative product concepts.

Stay tuned for Part 2 of this series, where we’ll dive deep into these scenarios and concepts and explore how they can inform our own technological experimentation and shape our digital futures.


Schibsted Futures Lab

A small research, experiment and training lab aiming to rehearse, educate, upskill and prepare for emerging technologies and plausible futures.