Language, Reason and Building a Machine that Can Spot Fallacies

Christopher Brennan
Deepnews.ai
Mar 10, 2021

Looking at the research of Dr. Henry Shevlin at Cambridge University

Editor’s note: This is one of the occasional posts we do speaking to someone with something to say on topics that we find interesting. If you are seeing this and are not yet signed up to Deepnews, sign up here to start receiving our blog posts every week, and a Digest of quality news on an important subject every Friday.

Human language is incredibly complex, with shades of meaning, varied contexts, and intricate grammatical structures. People use it to communicate emotions, share news, and express their own opinions through faculties like reason.

Language is also one of the rapidly advancing frontiers of artificial intelligence, and one of the realms where people are most likely to think of machines as “intelligent,” as algorithms are now capable of impressive tasks such as generating text (or distinguishing different levels of journalistic quality, as Deepnews does).

I have written on this blog about abstract concepts and algorithms, but there is also fascinating research that uses machine learning to explore and isolate even loftier ideas, such as elements of reasoning.

This week I spoke with Dr. Henry Shevlin, a research fellow at the Leverhulme Centre for the Future of Intelligence at Cambridge University, who is working on a project that could use machine learning to identify traditional logical fallacies such as non sequiturs and ad hominem attacks.

Shevlin comes from a philosophy background, and beyond his interest in reasoning, he has done work on cognition and how we understand intelligence in non-humans. The Leverhulme Centre’s Animal-AI Olympics, led by Dr. Matt Crosby, used tests designed for animals as a means of measuring AIs.

Part of that approach of looking at humans, animals and machines is distinguishing entities that are good at one task but not much else (such as a computer that plays chess) from things that are more complex and have the kind of “general intelligence” found in biology.

Henry Shevlin

“Think about something like nematode worms. Nematode worms live everywhere. In some sense they are very good at achieving their goals in a wide range of environments. But that’s because their goals are relatively invariant based on different climates. The world for nematode worms is pretty flat,” Shevlin said.

“I think that’s one reason we don’t attribute a great degree of intelligence to nematode worms. The notion of general intelligence we are probably working with implicitly when we talk about the need to build more intelligent AI systems is something like the ability to achieve a variety of different environmentally sensitive goals, across a wide range of environments.”

When it comes to the level of general intelligence found in animals, Shevlin says that humans haven’t created any machines that compare well to cats, or even bees.

One factor to consider, however, is language, which is closely tied to what many people view as intelligence when focusing on humans. I sometimes view language as a bridge between different bodies of knowledge as we communicate with each other.

A current project of Shevlin’s aims to create a benchmark for testing machine learning models’ ability to examine reasoning and pinpoint informal fallacies, such as equivocations or No True Scotsman arguments, after being fed a dataset that includes them.

“I think back to the Microsoft Word 97 grammar checker, which was terrible, but still not useless. It would generate a lot of false positives and a lot of false negatives, but maybe half the time it flagged a sentence you wanted to think twice about. Something like a fallacies checker could be a similar bit of functionality: Clippy, or the 2021 equivalent, that pops up and says, ‘You might be making an ad hominem attack,’” he said.
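To make the idea concrete, here is a minimal sketch of how such a fallacy spotter might be prototyped as an ordinary supervised text classifier. To be clear, this is not Shevlin’s benchmark or any real system: the example sentences, labels, and the simple scikit-learn pipeline are all assumptions for illustration.

```python
# A minimal sketch of a fallacy "spotter" as a supervised text classifier.
# Illustrative only: the tiny dataset and labels are invented, and Shevlin's
# actual benchmark is not described at this level of detail.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples labeled by fallacy type ("none" = no fallacy).
sentences = [
    "You can't trust his climate data; he failed high school chemistry.",
    "Everyone I know loves this policy, so it must be good.",
    "No true patriot would ever question the government.",
    "The experiment was repeated five times with consistent results.",
]
labels = ["ad_hominem", "ad_populum", "no_true_scotsman", "none"]

# Bag-of-words TF-IDF features plus logistic regression: a deliberately
# simple baseline, nothing like a large language model.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(sentences, labels)

# The imagined "Clippy" moment: flag a new sentence with a predicted label.
print(model.predict(["Ignore her argument; she's not even an economist."]))
```

A real version would need a large, carefully labeled dataset, and, as Shevlin notes below, it would still be matching patterns in language rather than doing anything we would confidently call reasoning.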

It is an interesting project, particularly given that a lack of reasoning and logical flow is one of the biggest areas for improvement in models such as GPT-3. The research also raises big, broad questions that connect to Shevlin’s other work on specific tasks versus general intelligence. If a machine can spot informal reasoning fallacies, does that mean it has isolated “Reason,” or that it is generally intelligent?

He cautions that large language models such as GPT-3 at this point are very good at finding patterns in language rather than understanding something like reason, and that “fallacies are best understood as a linguistic category.”

Machines are very good at “cheating” on tests even when they lack the core understanding or underlying abilities that Shevlin digs into. That may mean machines could become good at spotting where reason is present (or where it isn’t) even if you can’t have reason in a machine.

“I’d maybe suggest a certain pessimism about the ability of purely linguistic projects in AI to capture the deep structures of intelligence, insofar as the kind of reasoning we express in language is of a very different kind from the actual reasoning processes by which we optimize our behavior,” Shevlin said.

“I’m inclined to think that’s true, but I have to admit that the emergent capacities of language models have surprised me at every turn.”
