The conversational future of prompts

Gerardo Sanz
Newspective

--

Or why it is easy to envision the success of any human-machine interaction that gets us closer to a real conversation.

Just Because It’s Yellow Doesn’t Mean It’s Gold

Conversations have always been the natural way for humans to get information from each other. This has been true since the evolution of language, which occurred somewhere between 200,000 and 70,000 years ago (quite a gap). Today, we still use conversations as our primary method of interacting with other humans. Conversations, speech, talking, language, and words are immensely important for humans.

Now, large language models (LLMs) provide interactions with humans that “look like” conversations. It may seem that we are already talking to machines and that they almost understand us. However, while we at newspective are excited about AI, we are also very aware of its current limitations. Let me give you an example.

An Example

In a recent chat with a friend, I happened to talk about a very futuristic scenario: a comprehensive conversation between myself and a machine, an AI assistant called Alex. It goes like this:

ME: “Hey Alex! I need some screws to put a shelf on the wall.”

ALEX AI: “Did you already buy the shelf?”

ME: “Of course I did!”

ALEX AI: “Aren’t the screws provided?”

ME: “They were, but I don’t know where I left them.”

ALEX AI: “Do you know the name or the ID of the shelf?”

ME: “Of course not! They have these stupid, unintelligible names!”

ALEX AI: “Can you tell me the screw number that the shelf is using?”

ME: “Let me check the instructions… Number 8!”

ALEX AI: “Will you need anchors?”

ME: “Mmm… Actually, yes, I will.”

ALEX AI: “Are you planning to put the shelf in your home office?”

ME: “Yes.”

ALEX AI: “There are two types of walls there: bricks and plasterboard. Which one will you use to put the shelf on?”

ME: “You know what? Forget it. I’ll leave the books on the floor.”

Can you tell how this conversation is different from the one you can actually have with an LLM today?

Where We Really Are

The real state of the art looks very different from the previous example, as many companies are starting to realize. McDonald’s is ending its test of employing artificial intelligence chatbots at drive-thrus. While they still see AI as the future of the industry, the technology has also resulted in viral videos of incorrect orders. Air Canada will refund a customer for a service that its chatbot promised would be available but wasn’t. And the list goes on. Why?

The short answer is that bots based on LLMs are not fully reliable for businesses. Their hallucination rate is reported at between 3% (a suspiciously optimistic minimum) and 20%. If that rate applies to your chatbot, 3% (if you are among the optimists) to 20% of your interactions could go wrong. If companies are accountable for the errors that their chatbots generate, they really need to be cautious with their implementation. Lawsuits over chatbot errors are starting to emerge, and for now, customers seem to be winning.
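To put those percentages in perspective, here is a minimal Python sketch that turns a hallucination rate into an expected number of faulty interactions. The volume of 10,000 interactions is a hypothetical figure chosen only for illustration, and the sketch assumes the rate applies uniformly per interaction, which is a simplification.

```python
def expected_failures(interactions: int, hallucination_rate: float) -> float:
    """Expected number of interactions that contain a hallucination."""
    if not 0.0 <= hallucination_rate <= 1.0:
        raise ValueError("hallucination_rate must be between 0 and 1")
    return interactions * hallucination_rate


# Hypothetical volume, not a figure from any real company.
INTERACTIONS = 10_000

low = expected_failures(INTERACTIONS, 0.03)   # the optimistic 3%
high = expected_failures(INTERACTIONS, 0.20)  # the pessimistic 20%

print(f"{low:.0f} to {high:.0f} faulty interactions out of {INTERACTIONS}")
```

Under these assumptions, even the optimistic end of the range leaves hundreds of conversations that someone may have to answer for.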

Glazing Features vs. the Incomparable Depth of Comprehension

If you ask people, like my friend, many will think that a comprehensible conversation with a bot is not only possible but is already happening. The general perception is that the technology is far more advanced than it actually is. And don’t get us wrong. It is amazing. But you really need to understand its limits.

We humans are easily impressed by things we struggle with, like complex calculations. That’s why features like an app measuring the size of your luggage with your phone camera to decide whether you can take it on the plane or augmented reality positioning your desired sofa in your empty living room impress us.

But having a conversation… that doesn’t seem like such a big deal, right? Anyone can do that. Well… guess what? Having a conversation is such a complex task that we are nowhere near having one with a machine. And now… let’s play “spot the difference.”

Spot the Difference

Let’s see how many differences we can find between the previous fictional conversation with a futuristic AI assistant and a real conversation with a chatbot. Let’s also explore the hidden implications.

ME: “Hey Alex! I need some screws to put a shelf on the wall.

ALEX AI: Did you already buy the shelf?”

Currently, you can go to the Leroy Merlin site or Amazon and search for screws. You will retrieve the first 25 items out of 1300 (300 in the case of Amazon). Then, you can filter them to reach your goal. Here is where the comparison between my fictional model and the actual models ends. In our fictional conversation, the machine knows that screws are typically provided with the purchase of a shelf and, before starting any search, it wants to make sure that you already have them.

ME: “Of course I did!

ALEX AI: Aren’t the screws provided?

ME: They were, but I don’t know where I left them.

ALEX AI: Do you know the name or the ID of the shelf?”

Now the machine knows that the screws are missing and, again, before starting a search for compatible screws, it will try to get the same screws from the same manufacturer. This is how a human would use logic to solve the issue in a face-to-face interaction at Leroy Merlin’s warehouse (or any other brand for that matter). A similar interaction performed by an AI would imply a deep knowledge of product categorization: the exact model of the shelf will provide an exact match for the screws needed.

ME: “Of course not! They have these stupid, unintelligible names!

ALEX AI: Can you tell me the screw number that the shelf is using?

ME: Let me check the instructions… Number 8!

ALEX AI: Will you need anchors?”

Next step: if you do not know the exact name of the shelf, the machine cannot access its specific components, so it looks for compatible screws instead. This opens a new kind of search in which the machine asks for specific details. The assumption is that the machine knows there are various types of screws, organized by numbers. Additionally, it knows that if you are planning to put a shelf on a wall, you need another equally necessary component called an “anchor.”

ME: “Mmm… Actually, yes, I will.

ALEX AI: Are you planning to put the shelf in your home office?

ME: Yes.

ALEX AI: There are two types of walls there: bricks and plasterboard. Which one will you use to put the shelf on?”

It gets harder still. Your AI assistant knows that anchors vary depending on the surface they will be used on. Not only that, it knows that your home office has one brick wall on the inner side of the facade and three others built from plasterboard. So, the details for the search still need to be refined with follow-up questions before showing the retrieved products.

ME: “You know what? Forget it. I’ll leave the books on the floor.”

This has nothing to do with AI. It’s just another example of us humans being impatient, which is the reason why so many searches are abandoned before getting a proper answer.

Conversations Are Not Easy…

The whole previous interaction is an example of prompt refinement. It’s a search directed by a futuristic AI assistant that is probably based not on LLMs but on, in Gary Marcus’ words:

Some sort of neural symbolic AI that tries to combine the best of the neural networks that are good at learning with the symbolic systems that are good at reasoning and fact-checking.

We are using verbs like “understand” and “know” to refer to the AI processing of our inputs, but in the case of LLMs, they do not “know” or “understand” anything. They just generate words in an order that makes sense to humans, replicating the way humans put words together regardless of whether the actual content is accurate or totally false.

“Meaning” is a layer that only humans can apply to an output. And meaning is what makes an output accurate or totally false.

…Unless You Are a Human

An ideal AI would simulate patterns of interactions based on flowcharts that provide a course of action. Human logic would look something like this:

“IF a screw is needed, obtain the exact name of the product.
IF the exact name of the product is not found, obtain specific details.
List of relevant details for screws: number, head, type of anchor…”
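The rules above can be sketched in a few lines of Python. This is only an illustration of the flowchart idea, not how any real assistant works: the catalog, the `ScrewRequest` class, and the `next_step` function are all hypothetical names, and the list of details is limited to the three the rules mention.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical catalog: shelf name -> the screws originally shipped with it.
SHELF_CATALOG = {"EXAMPLE-SHELF": "original screw kit for EXAMPLE-SHELF"}

# The relevant details for screws named in the rules above.
REQUIRED_DETAILS = ("number", "head", "anchor_type")


@dataclass
class ScrewRequest:
    product_name: Optional[str] = None
    details: dict = field(default_factory=dict)


def next_step(request: ScrewRequest) -> str:
    """Return the next action: a follow-up question or a search."""
    # Rule 1: an exact product name yields an exact match.
    if request.product_name in SHELF_CATALOG:
        return f"SEARCH exact: {SHELF_CATALOG[request.product_name]}"
    # Rule 2: without a known name, ask for the first missing detail.
    for detail in REQUIRED_DETAILS:
        if detail not in request.details:
            return f"ASK: {detail}"
    # Every detail is known, so search for compatible screws.
    criteria = ", ".join(f"{k}={request.details[k]}" for k in REQUIRED_DETAILS)
    return f"SEARCH compatible: {criteria}"


request = ScrewRequest()
print(next_step(request))      # asks for the screw number first
request.details["number"] = 8
print(next_step(request))      # then moves on to the next missing detail
```

Each call returns a single next action, so the assistant keeps asking follow-up questions until it has enough to search, much like the conversation with Alex.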

We could create a similar flowchart for every possible search, from screws to diapers, from cars to insecticides. That is actually what human brains do. The current approach for LLMs, however, is totally different. Their replies take the form of perfectly constructed sentences, but they can be true or false, because LLMs are simply not built for fact-checking.

What are the chances for a human to get things right from a conversation (once the learning is completed)? Pretty high. Conversations are imperfect, but they are good enough. What are the chances for an AI to get things right from a conversation (once the learning is completed)? Not that great, as McDonald’s and Air Canada (among others) already know.

The Takeaway

Do you want a non-AI summary? Here we go:

Human brains are awesome. And AI will be too. Future tense.

Tech has two sides. On one hand, companies want to be profitable and develop amazing features that are widely adopted. On the other hand, users will adopt the features that better serve their purposes, favouring the ones that provide a better experience. Conversations are the standard for human interactions. Hence, it is easy to envision the success of any experience that gets us closer to conversations.

You might expect that the future of prompting (and any human-machine interaction, for that matter) will be conversational.

Then… Let’s Burn All Those Damned Chatbots

Wait! Don’t! There are plenty of tasks that these chatbots can do very efficiently, relieving us of mechanical, mundane work. The important thing is to know which kinds of tasks can be reliably assigned to them and which ones cannot.

At newspective, we help our clients measure risks and tailor solutions. We aim to inspire with the open possibilities of tech. We strive to explore solutions that are as impressive as they are realistic and to implement them in a scalable way. Let’s take all the available value from LLMs and use it in a functional and reliable manner.

Technology is supposed to serve human interests and leverage human capabilities. And we are here to observe, analyze, and ponder its best use.

This article is available for download in a beautifully formatted PDF, carefully designed for a pleasant reading experience and highly suitable for presentations. Just send us an email to: hellothere@newspective.design

To learn more about newspective, what we do, and who we are, check out: newspective.design

--

Gerardo Sanz
Newspective

I am a digital designer. I write, read, draw and sing. I love and scarcely hate.