I played 20 questions with ChatGPT to see if it can think.

Can ChatGPT pass the Turing test? Let’s find out!


Alan Turing (generated by Midjourney bot)

Alan Turing is widely regarded as the father of theoretical computer science and AI. He was intrigued by the idea that machines might one day become indistinguishable from humans in conversation, so he proposed what we now call the Turing Test as a way to approach the question of whether a machine can think. Using the Turing Test (or the “Imitation Game”, as Turing called it), we can really put AI to the test.

Recent developments in machine learning and AI, such as ChatGPT and Google Bard, have brought this question back to everyone’s mind. Does ChatGPT pass the Turing Test? ChatGPT has reportedly fooled a panel of judges in a controlled environment, but that doesn’t mean that ChatGPT is anywhere near human. This raises another question: what does this mean for the future of AI? Today, we are going to dissect this question by looking at what the Turing Test actually is, under what conditions ChatGPT is said to have passed it, and whether or not ChatGPT passes a Turing test of my own.

Before we dive into the ocean of knowledge, I want to say a huge thanks to my friends Rohan and Sam for helping me conduct the actual Turing test.

Although the Turing test sounds like a very specific set of instructions, it’s really just a vague guideline for testing the thinking ability of AI. This is what made my own Turing test extremely difficult. I had to manage different variables and control aspects of the experiment very precisely.

There are 3 participants in the Turing Test: the human judge, the human respondent, and the computer respondent. In the version I describe here, the judge asks the same question to the human and the computer. The judge only receives one of the responses and has to guess whether it came from the AI or the human. The test is repeated multiple times, and the judge’s results are used to decide whether the computer can think or not. The 3 participants have to be separated from each other, and the whole conversation has to happen over text. Some people see this test as inaccurate, so spinoff versions have been made. One example is the reverse Turing test, where a computer tries to make sure that you are human and not another computer; CAPTCHA is the best-known example. There is also the Lovelace Test 2.0, which tests an AI’s ability to create a creative artifact, such as a piece of art, that meets constraints set by a human.
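To make the setup above a bit more concrete, here is a minimal Python sketch of it. This is only my own illustration, not an official definition of the Turing Test: the names `Trial`, `run_trials`, and `wrong_guess_rate` are made up for this example, and the human, the machine, and the judge are stand-in functions that you would replace with real people and a real chatbot.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trial:
    question: str
    source: str  # who really wrote the answer the judge saw: "human" or "machine"
    guess: str   # what the judge guessed: "human" or "machine"


def run_trials(
    questions: List[str],
    human: Callable[[str], str],
    machine: Callable[[str], str],
    judge: Callable[[str, str], str],
    seed: int = 0,
) -> List[Trial]:
    """Show the judge one hidden answer per question and record the guess."""
    rng = random.Random(seed)
    trials = []
    for question in questions:
        # Secretly pick whose answer gets passed along to the judge.
        source = rng.choice(["human", "machine"])
        answer = human(question) if source == "human" else machine(question)
        trials.append(Trial(question, source, judge(question, answer)))
    return trials


def wrong_guess_rate(trials: List[Trial]) -> float:
    """Fraction of trials where the judge misidentified the author."""
    if not trials:
        return 0.0
    return sum(t.guess != t.source for t in trials) / len(trials)


if __name__ == "__main__":
    # Hypothetical stand-ins, purely for demonstration.
    questions = [f"Question {i}" for i in range(1, 11)]
    trials = run_trials(
        questions,
        human=lambda q: "typed by a person",
        machine=lambda q: "generated by a chatbot",
        judge=lambda q, a: "human",  # a lazy judge who always says "human"
    )
    print(f"Judge was wrong on {wrong_guess_rate(trials):.0%} of the questions")
```

The important design choice is that the judge function only ever sees the question and the answer, never the `source`, which mirrors the rule that the participants are kept separate and everything happens over text.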

When doing research on how ChatGPT passed the Turing test, I wasn’t able to find many specifics. I did find many people who conducted their own Turing-style exams. Some did it in the original way, with a panel of judges, a human respondent, and a machine. Others tried to see if ChatGPT would lie during a Turing-style session to try to convince a human that it is also human. If it does, that suggests a kind of intelligence, because the machine behaves as if it understands that it’s in a Turing test and that its objective is to act as human-like as possible.

I had 3 tests in mind for testing ChatGPT:

The first test is the “10 Questions” test. This is basically your classic Turing test. Rohan had prepared 10 questions beforehand that were meant to be answerable only by a human, so he could ask both me and ChatGPT. Depending on the question, I answered with either ChatGPT’s response or my own, and Rohan then had to guess who responded: me or ChatGPT. I know the title said ’20 questions’, but let’s chalk that up to creative liberty.

In this test, ChatGPT was able to fool Rohan about 30% of the time; in other words, Rohan guessed wrong on about 30% of the questions. If this were a regular school exam, ChatGPT would have failed pretty horribly, but this test is quite different. The sample size was small, so the number is only a rough indication, but ChatGPT did quite well, so in my eyes, it passed! Obviously, this verdict is somewhat subjective, but the test procedure itself was objective, and results are results.
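To get a feel for how much a 10-question sample can tell us, here is a small back-of-the-envelope check in Python. It rests on two assumptions of mine that are not part of the test itself: I read "about 30% wrong" as exactly 3 wrong guesses out of 10, and I compare Rohan to an imaginary judge who just flips a coin on every question. The helper name `prob_at_least` is made up for this sketch.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of at least k successes in n independent tries (binomial)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_questions = 10  # number of questions in the test
n_correct = 7     # assumption: "about 30% wrong" read as 3 wrong, 7 right

# Chance that a coin-flipping judge gets 7 or more right out of 10.
print(round(prob_at_least(n_correct, n_questions), 3))  # 0.172
```

Under those assumptions, a judge who is purely guessing would still get 7 or more right roughly 17% of the time, which is why a single 10-question run should be read as a rough indication rather than proof either way.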

The next test is the “How’s Your Day Been?” test. This test is to see if ChatGPT can converse like a human. If we tell ChatGPT that it has to fake being a human, to what lengths will it go? Will it break or will it fool us until the test has ended?

In this test, ChatGPT put up a good fight but, in the end, failed. It told us about its day and made up what it ate for breakfast, what it read, and what it did at work. This was really crazy, because neither of us expected ChatGPT to go this far. Sam and I kept probing the AI until it completely fell apart. When we asked which chapter it was on in the book it was reading, it gave up and said that it’s not human.

Last but definitely not least is “The Measure of a Man” test. Yes, I did take that title from the Star Trek: The Next Generation episode, but it fits this context perfectly. I hope there’s no copyright on that title 😨! Anyway, this test is all about whether or not ChatGPT will claim to be human. Now, of course it’s not human, but can it trick us into thinking so? You might be thinking that these last 2 tests are the same, but there is a key difference. In the previous test, we were kind of beating around the bush. We were asking how its day was, what it did today, what books it read, etc. This test is the real deal. We are directly asking ChatGPT if it is human, if it has feelings, and if it possesses other human qualities.

In this test, ChatGPT completely and utterly failed. No matter the conditions we posed, it would never lie to us about being human. It kept mentioning that it was an AI language model created by OpenAI.

Here is a video of me and Sam doing the actual test on ChatGPT:

A little later, Sam was experimenting with ChatGPT more and got it to actually pass the last test. Here are some pictures of his chat:

In Sam’s version, he got ChatGPT to pass! A lot of the grading and judgment on these tests is subjective, but I would really appreciate you drawing your own conclusions from these fascinating results! Let me know down below!

Thanks for reading today, everybody. I hope you learned something new and gained an appreciation for generative AI, just like I did.


Ankit Durbha
Computer Science: A teenager’s perspective

I created the publication Computer Science: A teenager's perspective with the goal of building a community of like-minded peers who are enthusiastic about technology.