Deep Dive into the Amazing World of Generative Artificial Intelligence

4 min read · Jan 6, 2023

Type some text and get everything you want…

It’s a wonderful world that we are living in right now. It’s not The World of Oz, it’s not Alice’s Wonderland, it certainly isn’t The Amazing Spider-Man, and it’s not even a dive into Avatar: The Way of Water. But we can create our own world by using generative artificial intelligence.

Photo by Kritsada Seekham on Pexels

You simply input some words describing what you want, perhaps adding some style (control), and you can get a plethora of different outputs: an essay or other text, music or audio, an image, a video, a 3D image, an animation, a shopping list, and even software code. That’s why this disruptive process is called text-to-anything or text-to-everything.

Let’s enumerate some applications by group:

  1. Text-to-text: GPT-3 and ChatGPT from OpenAI; LaMDA 2 and CALM from Google.
  2. Text-to-audio: AudioLM from Google and AudioGen from Meta.
  3. Text-to-3D: DreamFusion from Google.
  4. Text-to-image: DALL-E 2 from OpenAI, Midjourney from Midjourney, Stable Diffusion from Stability AI, Imagen from Google, and Make-A-Scene from Meta.
  5. Text-to-video: Transframer from DeepMind (with a little help from its friend Google), NUWA-Infinity from Microsoft, and Make-A-Video from Meta.
  6. Text-to-code: derived from text-to-text.
  7. Text-to-shop: Walmart.

GPT-3 is sometimes said to have passed the Turing test, since its behavior can be very similar to that of a human being. It is not perfect, though: one of its pitfalls is that it handles common sense poorly, pairing impressive narrow skills with surprising gaps. Microsoft helped a lot in its development by building a supercomputer to accelerate the training phase and the model’s deployment. This partnership is a win-win: the monetization model relies on Azure infrastructure to cope with the millions of users of this helpful application.

On the other side of the dispute, we have Google’s LaMDA 2 (Language Model for Dialogue Applications) as its competitor.

ChatGPT and CALM are chatbots that will compete in the future; at the time of this publication, only ChatGPT was available.

But of course, the input doesn’t have to be limited to text. You can use an image as input and generate another image as output, as TikTok’s Manga AI filter does. You can even generate your own piece of art based on the classic painters, or transfer their style to whatever input image you want. As a matter of fact, one of these computer-generated beauties was awarded a prize, which led human artists to complain and raised copyright issues and the like.

So, in the future, you will have the anything-to-anything paradigm: you will be able to choose both your input and the output you want.

Generative techniques have grown so much in a relatively short amount of time that I’m not afraid to say they have carved out their own space in the artificial intelligence world, one that can be called Generative Artificial Intelligence.

You may say I’m a dreamer, but I’m not the only one.

Deep learning, convolutional neural networks, and generative adversarial networks have come to stay and to revolutionize the way we create, not to replace our abilities but to enhance them. It’s like having an assistant, but you always give the final touch. Humans will always be in control of their creations.

Final disclaimer: Google must OpenAI (open an eye), or better, both eyes, and KEEP CALM development and deployment going ASAP.

Image by author

I would like to thank my professor, Luiz Velho, for being so generous in offering a course on Image Processing using GANs at IMPA, which opened my mind to this amazing world of generative artificial intelligence.

Futuristic postscript: In the sixties, Arthur C. Clarke predicted not only the internet and the PC (personal computer) but also remote work. When the interviewer asked him about the dependency that could result from these future inventions, Clarke pointed out the benefits, including the liberty to work anywhere.

Back to the future, that is, the year of our Lord 2023, I may say that working from home, if planned cleverly, can be beneficial to society and the environment alike. Governments and companies can alleviate the pressure on big cities by encouraging people to move back to the countryside: less pollution, a better quality of life, less time spent in traffic jams, and more liberty for the people.

Written by Alberto Kopiler

Researcher, electronic engineer, and data and computer scientist who loves technology, innovation, and predicting the future.