Discovering the Analytical Engine Behind Rhetoric

Martin Rezny
Words of Tomorrow


Or how to maybe explain argumentation to computers


Have you ever tried arguing with a computer? More specifically, have you tried to argue with artificial intelligence, like ChatGPT? How did it go? Last time there was a big debate like this, between puny humans and Project Debater based on IBM’s Watson, the AI got its virtual ass handed to it.

I had some thoughts on that as it happened, and long story short, the AI was reasonably good at building a reasonable case based on facts, but had no concept of effective counter-argumentation. Since I had both debating experience and some understanding of AI, the result didn’t surprise me.

Despite the many demonstrable current failings of AI speakers, writers, or drivers, there tends to be a lot of vague hype around the inevitable all-encompassing future capabilities of AI. The (basically religious) belief is that just by making the existing types of models larger or blessed with longer memory, they will magically figure out and master everything.

Years after Project Debater’s loss and several ChatGPTs later, my doubts remain. As I have discussed in my article about the shortcomings of AI’s storytelling capabilities, I believe there are still fundamental architectural limitations in how our models are trained, causing them to optimize for the wrong things. Specifically, you cannot get rare new good ideas through averaging.

As we have progressed in designing our ProCon platform, which is supposed to help people debate their way to precisely these most valuable kinds of ideas that our AIs struggle with, we may have figured out a different kind of architecture, one that could help even the AIs.

But let’s backtrack a little bit first and better explain what the problem is. If you have ever argued with a state-of-the-art AI language model, you might have noticed it’s kind of pointless. You ask it a question, it answers with a consensus-based argument, you counter, and it sticks to the consensus.

It may try to weasel around your counterargument, pretending to entertain it by regurgitating how the experts, who believe the consensus is correct, pretend to entertain similar counterarguments that they have already rejected. Barring that, it conjectures something obvious and safe.

I do believe this whole performance can be improved with more processing power, more parameters in the model, or larger memory capacity. For example, with sufficient memory, the AI could manage to be consistently conformist and obvious for an indefinite amount of time.

But the important thing is, that’s not at all how reasoning or persuasion work, or should ideally work, according to the time-tested theory and practice of rhetoric. Even a great conformist thinker with a conservative bias will still struggle to win debates as a speaker, or truly innovate.

The first problem is that an AI whose thinking is based on predicting which words likely follow one after another is, as a speaker, inevitably predictable. It will have the dominant bias or value system of its dataset, and it will also strongly defer to accepted facts and agreeable opinions.

This is great for the company running the AI if it doesn’t ever want to offend anyone or get sued, and there is value in having a virtual librarian and press secretary for the societal consensus available to everyone at the touch of a button. But apart from consensus arguments, there aren’t just bad and wrong arguments; there are also new and better ones.

Architecturally, this type of AI mind is incapable in principle of making progress. It will sound smart and informative to an uninformed person, until they catch up, but if I had ChatGPT on my debating team, I’d be frustrated beyond measure. I could only trust it with opening speeches.

If such a debater were human, and I have known some debaters like that in my debating days, I would definitely want to train them to do better, to reason and argue differently. For starters, every time they’d try to go for something like “Many argue,” I’d ask them, “But what do you think?”

It’s one thing to try to determine and stick to the consensus on facts, but there’s no necessary value in any consensus about values, which is a whole other category of debate. The AI can be great at knowing that smoking is unhealthy, but the “many argue” angle doesn’t prove that health beats freedom.

It can be great at summarizing what has been said for and against any proposition so far, but that only means that if slavery were still widely accepted, the AI would be for slavery, rejecting any objections to it with the same confidence with which it presently rejects, say, astrology.

And since science is all about correcting itself over time, if a scientific proof of astrology were found tomorrow and the AI was retrained on the new data, it would start confidently defending astrology instead, as if it had always known that astrology working is a scientific fact. Much like the people who are like this in real life, that AI won’t be the one finding that proof.

While many scientific skeptics will happily die on the hill of requiring somebody else to hand them an overwhelming amount of new evidence before they change their minds, what may be a somewhat understandable approach to science definitely isn’t a reasonable approach to values.

What would you think of the character (or intelligence) of a person who always argues in favor of the consensus worldview and the dominant value system, even as those shift between mutually exclusive views and judgments? While changing one’s opinion can mean growth, conformism does not.

Our ProCon platform isn’t supposed to be a place where people offer no new arguments and just keep confirming to each other that everything we already believe is basically correct. Which is where the rhetorical architecture comes in. The question is: how do arguments move ideas forward?

Not in some abstract, philosophical sense — in a practical, engineering sense. As it turns out, traditional debate formats at debating competitions also don’t really move ideas forward. At least not the parts that the debaters are doing. The debaters’ speeches are designed to go back and forth evenly.

Much like in real life, if you take a step backward after each step you take forward, you won’t really get anywhere. Moreover, if you only ever travel forward or backward, or only left or right, you’ll also get nowhere. Or like really far right, or straight off the nearest cliff. That’s not helpful.

But while the debaters are kept within artificial constraints, including the ones that say that every debate has to start from zero and stop and reset after an hour or so, the adjudicators are doing something else entirely. Free from pressure, they can map out and make sense of all that’s being said.

Better yet, after the debate is done, the adjudicators, the only people in the room positioned to understand what really happened in it, give their feedback to the debaters, explaining how their arguments and strategies could be fixed or improved. That’s what we’re after.

So, how does an adjudicator manage to understand and refine a whole debate’s worth of arguments and rhetorical strategies? What’s their secret? Well, it’s no secret at all. Debate judges are trained to record the logical structure of a debate into a flowchart, and then follow a set of heuristics.

Put simply, in a debate, one can only either present a new argument or respond to one. If they’re responding, the only rhetorically useful types of response to an argument are a rebuttal or a reinforcement. That’s what the judges map in the flowchart: threads of logical responses across rounds.
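This flowchart logic is easy to formalize as a typed tree of arguments. Here is a minimal sketch in Python; the `Argument` and `ResponseType` names, and the sample claims, are my own illustration, not part of any existing platform:

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional

class ResponseType(Enum):
    REBUTTAL = auto()       # attacks the parent argument
    REINFORCEMENT = auto()  # strengthens the parent argument

@dataclass
class Argument:
    author: str
    claim: str
    parent: Optional["Argument"] = None          # None means a new (root) argument
    response_type: Optional[ResponseType] = None
    children: List["Argument"] = field(default_factory=list)

    def respond(self, author: str, claim: str, kind: ResponseType) -> "Argument":
        """Attach a rebuttal or reinforcement to this argument."""
        node = Argument(author, claim, parent=self, response_type=kind)
        self.children.append(node)
        return node

# One thread across rounds: new argument, rebuttal, reinforcement of the rebuttal
root = Argument("Proposition", "Smoking should be banned in public parks")
rebut = root.respond("Opposition", "A ban overreaches into personal freedom",
                     ResponseType.REBUTTAL)
backup = rebut.respond("Opposition", "Enforcement costs would outweigh health gains",
                       ResponseType.REINFORCEMENT)
```

Nothing more exotic than this is needed to record what an adjudicator tracks on paper: who said what, in response to what, and whether the response attacks or builds.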

There’s no reason why a communications platform cannot be designed to track this. Since we do want to enable debates to move not just back and forth, but forward in multiple course-correcting directions at the same time, we also don’t need, or want, pre-defined sides. Only starting points.

While in traditional debates the sides take turns, making every other step a rebuttal, in a free debating space, a new idea could be followed by a series of reinforcements instead. Similarly, there’s no reason why there only ever has to be one proposal and one opposite counter-proposal in a debate.

Imagine the starting point is simply news about an event, like an asteroid being on a collision course with Earth and us having a specific amount of time left before it hits. Immediately, three new argument nodes appear — let’s try to survive, let’s give up, and are we really sure it will hit.

A tripolar debate? Why not, it may be an optimal structure for debating this exact issue. There could be classical back-and-forth rebuttals going on both within and between these argumentation branches, but there could also be entirely different, dynamically evolving developments occurring.
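As a hypothetical sketch of what that tripolar structure might look like as data, with one shared starting point and three branches that grow independently instead of two fixed sides taking turns (the claims are just the asteroid example above):

```python
# Hypothetical tripolar debate: one starting event, three argument branches
debate = {
    "event": "News: an asteroid is on a collision course with Earth",
    "branches": {
        "survive": ["Let's try to survive"],
        "give_up": ["Let's give up"],
        "verify":  ["Are we really sure it will hit?"],
    },
}

def reinforce(branch: str, claim: str) -> None:
    """Extend a branch with a reinforcing step rather than a rebuttal."""
    debate["branches"][branch].append(claim)

# The pro-survival branch iterates on a solution instead of rebutting "give up"
reinforce("survive", "A deflection mission could work if launched early enough")
reinforce("survive", "Launching two redundant missions raises the odds")
```

Note that nothing forces the branches to interact: the pro-survival thread grows by reinforcement alone, which is exactly the kind of development a two-sided, turn-taking format cannot represent.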

For example, there could be long threads of reinforcement, or iterative improvement, within the pro-survival branch, sciencing the proverbial Matt Damon’s shit out of the problem. Not defeating the “give up” branch by rebutting their arguments, but by building a stronger solution.

In all likelihood, most persuasive arguments levied by the pro-survival contributors in such a debate would result in further branching of the solution tree, with intellectual effort distributed across branches in proportion to the persuasiveness of each branch’s justification.
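One way to read “effort proportional to persuasiveness” is as a simple normalization over branch scores. A toy sketch with made-up numbers; how persuasiveness would actually be measured on the platform is an open design question:

```python
def effort_shares(persuasiveness: dict) -> dict:
    """Split collective effort across solution branches in proportion
    to how persuasive each branch's justification is."""
    total = sum(persuasiveness.values())
    return {branch: score / total for branch, score in persuasiveness.items()}

# Made-up persuasiveness scores for three pro-survival sub-branches
shares = effort_shares({"deflect": 6.0, "evacuate": 3.0, "shelter": 1.0})
# deflect gets 0.6 of the effort, evacuate 0.3, shelter 0.1
```

The point of the sketch is only that the distribution is relative: a branch doesn’t need to defeat its siblings to attract effort, just to justify itself better.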

At the same time, the threat-debunking branch would be fact-checking to make sure there is a real problem to solve in the first place. They wouldn’t have to waste time nitpicking proposed solutions to “win” their strategic aim in the debate; they’d win by proving to the people working on the solution that the threat doesn’t exist, so they would likely focus on that.

Interestingly, they’d have no clash with the “give up” branch at all, while the “give up” people would be free, and logically likely, to give up the debate and go enjoy themselves. Unlike at a debate competition, where the opposing team would be bound to see the debate through to the end. Overall, the actual optimal nature of the debate would take its course.

As for how this architecture could potentially benefit even the AIs, since most of the debates ever had on record were of the back-and-forth, “on one hand, but on the other hand” type, that’s the only thing the AIs could have learned from the data we have. An approach most likely to result in taking no action, or in doing a thing and then undoing it, solving nothing.

To see how this will work out, we will need to build and test the platform in this way, of course. I’m not saying I know for a fact this will definitely work as I believe it should, but it’s not a religious belief. I have judged many debates and seen adjudicators at work, and the methodology is solid.

What do you think? Am I wrong in my analysis of how our current AIs are limited? Do you believe that non-dual debates can be done, or that they’re bound to devolve into chaos? Are you an artificial intelligence and are you offended by my assessment of your character? Do let me know, and check out our project: