Is Unoriginal AI-Generated Content Our Inevitable Future? The Dangers of Using Half-Baked Generative AI and the Dead Internet

The rise of generative AI like ChatGPT has revolutionized the way we interact with technology, providing us with sophisticated tools for creating and generating content. However, the growing popularity of this technology has also raised concerns about its potential dangers and its impact on the internet as we know it. From the proliferation of fake news and misinformation to the displacement of human creativity, generative AI has the potential to fundamentally alter the face of the internet. In this story, I would like to explore the dangers of generative AI and its potential to shape the future of online content. I will examine its impact on the online landscape, discuss the risks of this emerging technology, and highlight some of the steps and techniques that can mitigate its negative effects.

Photo by JJ Ying on Unsplash

Examples of Generative AI

Generative AI has been making waves in various industries, creating new opportunities for innovation and creativity. From generating text to producing images and videos, generative AI has the potential to transform the way we interact with technology. Here are some current examples of generative AI that highlight its potential to revolutionize different fields:

1. OpenAI’s GPT-3: This is a state-of-the-art language generation model that has gained significant attention for its impressive ability to produce natural language responses to prompts. It has been used to generate articles, essays, and even tweets. With its ability to understand context and language nuances and to generate content in different languages, GPT-3 has the potential to transform the field of natural language processing, opening up new possibilities for content creation and communication. ChatGPT uses GPT-3.5 and offers a chat-based interface (a short code sketch after this list shows what calling these models can look like). For those interested in its limitations, check out the story I wrote about the limitations of ChatGPT.

For those interested in similar chat-based approaches, check out the story I wrote about ChatGPT’s upcoming competition.

I should also note that although ChatGPT is primarily text-based, with the use of certain techniques it can also take images as input and produce images as output. This can further amplify the generative capabilities of chat-based generative AI. If you are interested in how this could work, check out the story I wrote about how a chatbot can expand its modality from mere text to also include images.

2. NVIDIA’s GauGAN: This is a generative AI model that can transform simple sketches into photorealistic images. It uses machine learning algorithms to recognize patterns in images and generates new images based on the input provided. GauGAN has been used in various applications, including design, gaming, and even fashion. It has the potential to revolutionize the way artists and designers create and collaborate, providing them with new tools to bring their ideas to life.

http://gaugan.org/gaugan2/

3. DeepMind’s AlphaFold: This is a generative AI model that can predict the structure of proteins, a critical step in drug discovery and disease research. AlphaFold uses deep learning algorithms to analyze and predict the three-dimensional structure of proteins, which can help researchers understand their function and develop new treatments. By accelerating drug discovery and aiding disease research, AlphaFold has the potential to transform the field of medicine and improve the lives of millions.

4. Image Generators like OpenAI’s DALL-E: DALL-E is a generative AI model developed by OpenAI that uses deep learning algorithms to generate high-quality images from textual descriptions. The model receives a textual prompt, which can be anything from a simple sentence to a detailed description of an object or scene, and generates a corresponding image using a neural network architecture trained on a vast dataset of images and textual descriptions. While DALL-E’s capabilities are still being explored and refined, its potential applications are far-reaching. The model has the potential to transform various industries, from design and art to advertising and e-commerce.
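To make the examples above more concrete, here is a minimal sketch of what calling a text model and an image model can look like in Python. It assumes the pre-1.0 `openai` package and a valid API key; the model names, parameters, and response handling are illustrative and may differ depending on the SDK version you use.

```python
# Minimal sketch: chat-style text generation and DALL-E-style image generation
# using the pre-1.0 openai Python SDK. Model names and sizes are assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; never hard-code real keys

# Chat-based text generation, the interface ChatGPT popularized.
chat = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize the dead internet theory in two sentences."}],
)
print(chat["choices"][0]["message"]["content"])

# Text-to-image generation in the spirit of DALL-E.
image = openai.Image.create(
    prompt="a photorealistic landscape painted from a rough sketch",
    n=1,
    size="512x512",
)
print(image["data"][0]["url"])
```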

Originality

Generative AI models like ChatGPT are capable of producing new and unique output, but the quality of that output can vary, and it may not always be original. In many cases, this can even result in plagiarism. One reason is that these models rely heavily on the patterns and structures they have learned from the data they were trained on. As a result, the output they generate may closely resemble existing text or data encountered during training, leading to unoriginal content.

The output of generative AI models can also be influenced by their training and the data they are exposed to. For example, if a model is trained on a specific type of data, it may be more likely to produce output that is similar to that data, even if it is not technically plagiarism.
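As a rough illustration of how one might flag potentially unoriginal output, the sketch below compares word n-grams in a generated passage against a reference text. The n-gram size and threshold are arbitrary assumptions for demonstration; real plagiarism detection relies on much larger indexes and more robust similarity measures.

```python
# Toy originality check: what fraction of the generated text's n-grams
# also appear in a reference text? High overlap suggests near-copying.
def ngrams(text: str, n: int = 4) -> set:
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(generated: str, reference: str, n: int = 4) -> float:
    gen, ref = ngrams(generated, n), ngrams(reference, n)
    return len(gen & ref) / len(gen) if gen else 0.0

generated = "the quick brown fox jumps over the lazy dog near the river"
reference = "a quick brown fox jumps over the lazy dog every single day"
score = overlap_ratio(generated, reference)
print(f"overlap = {score:.2f}")  # a high value (say > 0.3) might deserve a closer look
```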

Adding to the Noise

The use of generative AI has the potential to create a lot of noise that humans will need to filter out, which can be both time-consuming and frustrating. The output generated by generative AI models can be repetitive, off-topic, misleading, inaccurate, or even harmful, making it difficult for humans to sift through and identify the relevant information.

As generative AI becomes more widely used, the volume of content generated by these models is likely to increase, exacerbating the noise problem. This makes it imperative to develop new tools and strategies for filtering out irrelevant and inaccurate content generated by generative AI.

Photo by @chairulfajar_ on Unsplash

Humans Outnumbered By AI

As generative AI continues to advance, there is a growing concern that human-generated content could easily be outnumbered by AI-generated content in the future. This could have significant implications for the way we consume and interact with information.

One of the main reasons why AI-generated content may outnumber human-generated content is the sheer speed and efficiency with which generative AI models can produce text, images, and videos. Unlike humans, who are limited by factors such as time, resources, and creativity, AI can generate content around the clock, without getting tired or burnt out.

Moreover, as generative AI models become more sophisticated and better at mimicking human creativity and style, they are likely to produce output that is indistinguishable from that created by humans. This means that users may not even be aware that they are consuming content generated by AI, further blurring the line between human and machine-generated content.

As AI-generated content continues to proliferate, there is a risk that it could drown out human-generated content, making it increasingly difficult for individuals and organizations to stand out and be heard. This could have significant implications for everything from marketing and advertising to journalism and social media.

Photo by note thanun on Unsplash

Can We Train Generative AI for Originality?

There are a number of ways to train AI for originality, but what I would suggest is training a discriminator that can judge how original a given output is. Humans would label outputs at different levels of originality and quality, and the discriminator would learn from those labels to score how good and how original new outputs are. For this to work, both the discriminator and the generator would need an up-to-date understanding of what is original and what is not; how can a model know what has already been done if it is operating on antiquated data?
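A minimal sketch of that discriminator idea follows, assuming human-labeled examples and an off-the-shelf scikit-learn pipeline. The tiny dataset, features, and model choice are placeholders; a real system would need far more labeled data and a stronger model.

```python
# Sketch: learn to predict human originality labels from text features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical human-labeled examples: 1 = original, 0 = derivative/rehashed.
texts = [
    "A first-hand account of debugging a rare kernel panic on ARM hardware.",
    "Top 10 tips to be more productive (you won't believe number 7).",
    "An original proof sketch connecting two previously unrelated lemmas.",
    "A listicle restating the same advice found on every productivity blog.",
]
labels = [1, 0, 1, 0]

discriminator = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
discriminator.fit(texts, labels)

candidate = "Another roundup of generic productivity tips."
# Estimated probability that the candidate passage is original.
print(discriminator.predict_proba([candidate])[0][1])
```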

Photo by Enric Moreu on Unsplash

Copyright

Generative AI models can pose significant challenges when it comes to issues of copyright and stolen work. This technology’s ability to create content that is similar to existing work has the potential to cause copyright infringement, unintentionally or otherwise. Additionally, generative AI models can be trained on copyrighted material, which can further exacerbate the issue.

These challenges raise important questions around ownership, attribution, and the use of copyrighted material. Who owns the copyright to the content generated by a generative AI model? How can we ensure that the use of generative AI models does not infringe on the rights of copyright owners? These are complex issues that require careful consideration.

Can Generative AI have an Opinion or Personality?

Generative AI models can be trained to simulate opinions or personalities, but it is a challenging task that requires careful design and a lot of data. One approach is to train the model on a corpus of text that exhibits a particular opinion or personality trait. The model can then be fine-tuned on a smaller set of data that is specific to the task at hand.

Another approach is to use techniques such as style transfer or sentiment analysis to modify the output of the model to exhibit a particular opinion or personality. For example, a generative AI model that is trained to generate news articles can be modified to produce articles that are biased towards a particular political perspective.
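As a sketch of the first approach, the snippet below assembles fine-tuning examples from a small corpus written in the desired voice. The prompt/completion JSONL format matches OpenAI's older fine-tuning flow; the field names, file name, and example pairs are assumptions and would differ for other providers or newer APIs.

```python
# Sketch: build a fine-tuning dataset from text that already exhibits
# the target persona (here, a skeptical technology columnist).
import json

corpus = [
    ("What is your view on smart home devices?",
     "Convenient, sure, but I would not hand my front door to a firmware update."),
    ("Is the metaverse the future of work?",
     "Only if the future of work involves more headaches and fewer windows."),
]

with open("persona_finetune.jsonl", "w") as f:
    for prompt, completion in corpus:
        # Leading space in the completion follows the older fine-tuning convention.
        f.write(json.dumps({"prompt": prompt, "completion": " " + completion}) + "\n")
```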

However, it is important to note that these approaches are still relatively nascent and often require significant amounts of data and fine-tuning to produce convincing results. Additionally, there are ethical considerations around using generative AI models to simulate opinions or personalities, as it may be difficult to distinguish between genuine and artificial content.

It should also be noted that generative AI like ChatGPT does exhibit political and racial biases, so in a sense, every AI model already has a personality. It is simply that these aspects of the underlying model are likely deemphasized using rule-based, pattern-based, or AI-based filters.

Fighting Fire with Fire

Generative AI can exacerbate disinformation by making it easier to create and spread fake content. One of the key challenges with disinformation is that it can be difficult to distinguish between genuine and fake content, particularly when it is spread rapidly through social media channels.

Generative AI models can be used to create fake images, videos, and text that are difficult to distinguish from genuine content.

AI can be used to fight disinformation by developing algorithms that can distinguish between genuine and fake content. This involves training AI models to recognize patterns and features that are unique to different types of content, including images, videos, and text.

AI models can be used to analyze images and videos to identify signs of manipulation or tampering. This can involve looking for inconsistencies in lighting, shadows, and reflections, as well as analyzing metadata to determine whether the content has been edited or manipulated.
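As a small example of the metadata angle, the sketch below reads an image's EXIF tags with Pillow and flags values that mention common editing software. This only catches naive cases, since metadata is trivial to strip or forge, and the tag choice, editor list, and file path are assumptions for illustration.

```python
# Sketch: flag images whose EXIF "Software" tag hints at manual editing.
from PIL import Image
from PIL.ExifTags import TAGS

SUSPECT_EDITORS = ("photoshop", "gimp", "lightroom")  # illustrative list

def editing_hints(path: str) -> list:
    exif = Image.open(path).getexif()
    hints = []
    for tag_id, value in exif.items():
        tag = TAGS.get(tag_id, tag_id)
        if tag == "Software" and any(e in str(value).lower() for e in SUSPECT_EDITORS):
            hints.append(f"Software tag mentions an editor: {value}")
    return hints

print(editing_hints("suspect_photo.jpg"))  # hypothetical file path
```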

Once these models have been trained, they can be used to analyze new content in real-time, flagging content that is likely to be fake or manipulated. This can help to prevent the spread of disinformation and improve public trust in online content.

Photo by Mark Chan on Unsplash

Search That Takes Originality and Quality into Account

When so much of the noise is AI-generated, it becomes necessary to actively promote quality content. Here, techniques similar to those used to determine originality can be applied to optimize for higher quality and originality when deciding which pages a search engine, or a search-engine-based chatbot, should display.

One way that search engines can prioritize human-generated content over AI-generated content is by implementing ranking algorithms that prioritize originality and quality. This can involve looking for characteristics commonly associated with high-quality content, such as coherence, relevance, and readability. We already do this to some extent, but using discriminators that can quickly determine whether a page is AI-generated and whether it is original would be a good next step if it is not already in use.

For example, search engines could use natural language processing (NLP) algorithms to analyze the content of a webpage and determine its quality based on factors such as grammar, sentence structure, and vocabulary. They could also look for indications of originality, such as unique insights or perspectives, or citations and references to external sources.
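A toy sketch of how such a ranking might blend signals is shown below. The weights and the individual scoring functions are assumptions; real search engines combine hundreds of signals.

```python
# Sketch: re-rank pages by a weighted mix of relevance, quality, and originality.
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    relevance: float    # e.g. BM25 or embedding similarity to the query
    quality: float      # e.g. readability / grammar score in [0, 1]
    originality: float  # e.g. 1 - overlap with previously indexed content

def rank_score(page: Page, w_rel=0.6, w_qual=0.2, w_orig=0.2) -> float:
    return w_rel * page.relevance + w_qual * page.quality + w_orig * page.originality

pages = [
    Page("https://example.com/original-analysis", 0.72, 0.9, 0.85),
    Page("https://example.com/ai-rehash", 0.80, 0.6, 0.15),
]
for p in sorted(pages, key=rank_score, reverse=True):
    print(f"{rank_score(p):.2f}  {p.url}")
```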

Photo by Christian Wiediger on Unsplash

Legislation will be Unable to Keep Up

Technological advancements are rapidly transforming our society, but legislation often struggles to keep up. There are a number of reasons why this is the case, and understanding these reasons is crucial if we want to create a legal system that is capable of effectively regulating new technologies. New innovations are being developed every day, and it can be difficult for lawmakers to keep up with the latest trends and developments. This is especially true given the complex nature of many modern technologies, which can involve highly specialized knowledge and expertise.

Lawmakers are often constrained by political pressures and the need to build consensus across a wide range of stakeholders. This can lead to lengthy debates and compromises that slow down the legislative process and make it difficult to keep pace with rapidly evolving technologies.

In addition, there may be resistance to change from entrenched interests who stand to lose from new technological developments. These interests may lobby against new regulations or seek to shape legislation in ways that benefit their own interests, even if this means that the legislation is not fully effective or does not keep up with the latest technological trends.

Finally, there is a lack of expertise or understanding among lawmakers themselves, who simply do not have the technical knowledge necessary to fully grasp the implications of new technologies. Have you ever seen senators and representatives question the CEOs of tech companies? It’s fairly disappointing how uneducated and behind the times many politicians are. This can make it difficult to craft effective legislation that balances the benefits of new technologies with the potential risks and downsides.

Conclusion

We are going to see the escalation of an arms race that has already started. Bots have already been used to create fake profiles and fake followers and to artificially inflate and modify metadata. The use of AI will only make it more difficult to differentiate between what is real and what is not. We will be unable to count on our governments to help, as they are currently unable to keep up with changes in technology. The explosion of information that comes with the rise of AI content will only leave them further in the dust.

It will be up to tech companies to properly filter out the noise so that the internet remains consumable by people. However, if the past is anything to go by, they too will be unable to rise to the challenge unless their bottom line is at stake. Just look at how difficult it is for YouTube to handle copyright issues, and how much Meta fumbles with moderation.

At the end of the day, we will be faced with the question: do we want to improve the content that we are consuming at the cost of reduced “productivity”? Furthermore, if the content we are producing is not that great, are we really being more productive, or are we actually wasting time and resources?

Shameless Plug: Do you like hard science fiction and philosophy? Then maybe you might be interested in my novel Dreaming of a Hopeful Death which is now available on Amazon in multiple formats.

