The Limitations of Chat-GPT
With all of the recent hype in the media regarding Chat-GPT by OpenAI, I was curious whether it lived up to it. As someone who has published research relating to social robotics and conversational AI, I have seen exaggerated articles about the field all too often. While I understand that Chat-GPT is not a completed product and is very much a work in progress, I think it is important to analyze it critically to determine where it is lacking technically as well as where it is lacking from a user experience standpoint.
So, I have been playing around with Chat-GPT (Dec 15 version) and as a user, I have found it quite impressive despite some major limitations. Its generative capabilities are quite useful. With that said, as I went through a number of interactions, I noted down the limitations and issues that I ran into. These limitations are fairly easy to reproduce, and I believe addressing them could help the chat application of GPT-3.5 match some of the hype that we are seeing in the media.
Some of the issues that I have noticed may be technical issues while others may be issues with the user experience and capabilities of the chatbot. All of these issues are relevant to the future of chatbots and interactive AI, however.
Access to the Internet and Location Based Information
This should come as no surprise, but Chat-GPT cannot access the internet. It cannot provide real-time information, it cannot use location-based information, and you cannot give it URLs or references to anything on the internet. As one may imagine, this severely limits the functionality of the application and the types of services it can provide. With that said, crawling the web goes beyond what one might expect of a language model, so this expectation is likely unrealistic for now unless OpenAI implements some form of online learning that continually updates the training data so that Chat-GPT is always up to date.
In addition to this, there are serious risks in giving any AI open access to the internet, so leaving out internet-based functionality was likely an intentional choice.
With regard to the hype we see in the media claiming that Chat-GPT can give Google a run for its money and overturn search, there are clear differences between this type of application and a search engine. A search engine employs web crawlers and ranks pages. Chat-GPT does neither. This chatbot is not a way to navigate the internet, so the comparison is very much like comparing apples and oranges.
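To make the "page ranking" half of that difference concrete, here is a minimal sketch of the classic PageRank idea using power iteration. It is a toy illustration only: the `toy_web` link graph, the page names, and the damping value of 0.85 are my own assumptions for the example, and this is not a claim about how Google or any real search engine ranks pages today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy power-iteration PageRank.

    `links` maps each page name to the list of pages it links to.
    Returns a dict mapping page name to its rank (ranks sum to 1).
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, invented purely for illustration.
toy_web = {
    "home": ["docs", "blog"],
    "docs": ["home"],
    "blog": ["home", "docs"],
    "orphan": ["home"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

The point of the sketch is that ranking depends on the link structure of pages that a crawler has actually visited, which is a different kind of computation from generating text out of a fixed training set.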
The most recent training data is from September 2021. Since then, more than a year has passed. Programming languages may have been updated, libraries may have new features, and the world has changed. This means that Chat-GPT is operating off of an outdated set of data.
One thing I tried to throw at Chat-GPT was having it edit a piece of text and make suggestions to improve it. This usually works, but if I ask it to rephrase some of its suggestions or elaborate on something, it can go wrong in one of three ways. Sometimes it mistakenly assumes that I made the suggestions. Sometimes it refers to text that was not in the original piece I provided for review. And sometimes it just returns pretty much the same text it provided before with minimal changes.
Inability to make Qualitative Judgements
Language has qualitative aspects to it such as stance and sentiment. While Chat-GPT is able to determine these things, it is unable to extend this to things such as whether a text is offensive, what is good and what is evil, etc.
Now, this is probably for the best, but wouldn’t it be nice if Chat-GPT could look over your shoulder and tell you not to send an email, or to rephrase something to be less harsh?
Lack of Multimodal Output and Input
The model is purely textual. You cannot provide images, URLs, audio, or any other mode of input. Similarly, it cannot output images, URLs, or audio. It can output code that can generate those things, but that is not the same. This means that from a conversational perspective, it is quite lacking. It is unable to read your body language or infer your cultural background from where you live. It cannot use your eye gaze to determine your mood, and it is unable to reassure you or appease you based on how you are feeling.
Limitations in Input Length and Output Length
You cannot feed in an entire novel’s worth of text. If you try, it will take forever to respond and then error out. Similarly, it cannot generate an entire novel. I did get it to spit out a bad screenplay, but it was nowhere close to the expected length of a feature-length film screenplay.
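One workaround I would expect to help with the input side is splitting a long text into smaller pieces and submitting them one at a time. The sketch below is only an illustration under my own assumptions: `split_into_chunks` is a hypothetical helper, the 500-word budget is an arbitrary placeholder, and word count is just a rough stand-in for whatever the real input limit is, which I do not know.

```python
def split_into_chunks(text, max_words=500):
    """Split `text` into consecutive chunks of at most `max_words` words.

    Word count is only a rough proxy for the model's real input limit;
    the default budget of 500 is an arbitrary assumption.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    # Step through the word list in strides of max_words.
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


# Tiny example with a budget of two words per chunk.
chunks = split_into_chunks("call me Ishmael some years ago", max_words=2)
print(chunks)
```

Splitting this way loses context across chunk boundaries, so it is no substitute for a model that can actually take a novel-length input.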
Possible Biases in Training Data
It should be noted that while Chat-GPT has restrictions regarding what it can talk about and what it can access, the underlying training data may be biased. It is difficult to know how biased it may be, but getting it to generate things like slanderous articles is quite difficult. It seems as though Chat-GPT has been trained to avoid bias in its output even if there is bias in its input, but this is not entirely clear.
Questionable Language Support
I tried mixing languages and conversing in other languages that I know. I found that support can differ vastly depending on the language used. What is interesting is that some of the issues I ran into with other languages just a few days ago have already improved today, so this might not be an issue in the near future.
Heart Sutra (般若心経) - Wikipedia
Source: Wikipedia, the free encyclopedia. The Heart of the Perfection of Wisdom Sutra (般若波羅蜜多心経, Hannya Haramitta Shingyō; Sanskrit: Prajñā-pāramitā-hṛdaya) is…
A Passable Sense of Humor
Chat-GPT has a pretty lame sense of humor since it cannot talk about mature things :(
I like dad jokes, but some of these are just not that funny.
Generic Output and Plagiarism
I tried to get Chat-GPT to create an original poem in the style of William Shakespeare. What I got was a Frankensteined piece, assembled by fusing together excerpts from Shakespeare’s works into a sonnet that did not follow all of the rules. A plagiarism checker would likely flag this as copied, and rightly so.
'Two Households Both Alike In Dignity': Quote Analysis
'Two households, both alike in dignity' is the opening line of Shakespeare's play, Romeo and Juliet . The play opens…
I also tried to have it generate a plot synopsis of a Christopher Nolan style film with cyberpunk elements and what I got was The Matrix fused with Neuromancer. In other plot synopses, I have noticed some common elements always making an entrance such as “rebellions” and “detectives”.
Needless to say, Chat-GPT is unlikely to create interesting premises or award winning stories in the short term.
Absence of Mature Subject Areas
You won’t be generating any dank memes with this anytime soon.
Quality of Life Issues
So, depending on what you are asking for, you may need to wait a very long time before getting a response. Chat-GPT seems to just hang sometimes, and this does not seem to be related to network speed. In other cases, it fails to provide a response at all and returns an error message instead. I used to get a number of these a few days ago, but at the moment, this seems to have significantly improved. Based on what I saw in forums, Discord, and FAQs, this can depend on the number of people using Chat-GPT at a particular time. It should be noted that some of these issues are not related to the underlying language model itself.
Occasional Difficulties Understanding Figurative Phrases
The more straightforward a question is, the easier it is for Chat-GPT to answer. Chat-GPT seems to understand figurative language more often than not, but it sometimes struggles.
Not Always Willing to Play Ball
I often tried to see if Chat-GPT could write biased articles, and it is like flipping a coin. Sometimes it is able to do so, and sometimes it is not. Oftentimes, you will need to trick it into doing what you want it to do. I believe this is probably similar to some of the restrictions Chat-GPT has in place for things like mature topics.
The Turing Test
Do I think there is a human on the other end? Absolutely not. In addition to this, Chat-GPT always reminds the user that it is a language model, so there is no chance of being fooled. I do not consider this to be a negative, but if the goal is to create human-like responses, then it is failing.
Personality and Affect
Chat-GPT does not have a personality. In my interactions with it, it felt more like a handicapped search engine that was able to provide information in human-readable text. It cannot express opinions, nor does it have an identity of its own. It is not your friend or your enemy. You do not have a relationship with Chat-GPT’s Assistant, and it does not have one with you. Typically, within social AI, establishing some form of connection or relationship can be useful to ensure a better experience for end users. Chat-GPT does not do this, however, and I believe this severely limits the impact it can have.
Chat-GPT is not a silver bullet for search or chat, but it is a step in the right direction and a very impressive one at that. I believe the next step that can really change the game would be to use ensemble methods to go from text to multimodal input and output (either that or use something better than transformer neural architecture). It should also be noted that Chat-GPT seems to be improving as we use it, so many of the things I noted here might not be applicable even one week from now based on how quickly I am seeing things improve at the moment. Finally, I should also note that I have not tested the code generation aspect of Chat-GPT. For those who may disagree with the issues I identified, here is Chat-GPT to speak for itself and provide support for my claims:
“Chat-GPT is a variant of the GPT (Generative Pre-trained Transformer) language model developed by OpenAI. Like other language models, Chat-GPT is designed to generate human-like text based on the input it receives. It is trained on a large dataset of text and can generate a wide range of responses to various prompts and questions.
However, Chat-GPT and other language models have some limitations. One limitation is that they do not have access to the internet or external information, so they can only provide responses based on the information that they have been trained on. This means that they may not be able to provide up-to-date or accurate information about current events or recent developments. Additionally, language models may not always produce responses that are completely coherent or make sense in the context of the conversation.
Another limitation of language models is that they do not have the ability to understand or interpret the meaning of words and phrases in the same way that humans do. They are simply able to generate text based on patterns and relationships that they have learned from their training data. This means that they may not always be able to provide nuanced or sophisticated responses to more complex or abstract questions.
Overall, Chat-GPT and other language models are powerful tools for generating text and can be very useful for a variety of tasks, but they do have limitations and should not be relied upon for critical or important decisions.”
Shameless Plug: Are you into hard science fiction and philosophy? Then maybe you might like my book Dreaming of a Hopeful Death which is available on Amazon in multiple formats.