Artificial Media

2022 will go down in history as the year when Artificial Intelligence was recognized for its creative capabilities.

Nelson Zagalo
Dec 25, 2022

On November 30th, OpenAI released an AI assistant, ChatGPT, free on the web; in just 5 days it gained more than one million users, drawn by curiosity about its extraordinary conversational capabilities. Amid the many flaws we found in it (factual errors, data fabrication, overconfidence, lack of “voice”), we all had to acknowledge that we had never seen anything like it. One can enter into dialogue with the assistant, talk about the broadest range of subjects, and experience moments of deep empathic sharing, a self-delusion built on the assistant’s naturalness, eloquence, and discursive understanding.

But it wasn’t just the conversation that was taken by storm. Months earlier, on July 12, another assistant, Midjourney, was released; then, on August 22, the University of Munich’s Stable Diffusion was also released, and on September 28, OpenAI made Dall-E 2 available to everyone. These three AI assistants share the same image drawing skills, with the particularity of presenting original results and at the same time very human as if they had been created by human beings.

Conversation and drawing are closely related, deeply creative activities. Just as we improvise with a pencil on paper, we improvise our responses, in real time, in interactions with others. It is therefore important to understand what gave rise to these new machine capabilities, considering that AI has been around for almost 70 years.

Until a few years ago, AI was built from hand-crafted algorithms, as if we were sculpting a model of thinking, hoping to arrive at something capable of recognizing the world and itself. But AI development began to change with Deep Blue, which, to beat Garry Kasparov in 1997, studied thousands of chess games beforehand. Since then the approach has shifted to learning, teaching, and self-teaching AI systems, known as Machine Learning (ML), to which the internet has contributed access to ever-larger databases. This training process was transformational, not only because it allowed assistants to find similar patterns and thus discover the key to semantics by mere comparison, but mainly because it is how the so-called artificial neural networks are built, in which each concept or idea is linked to hundreds of others through parameters learned by the AI itself.
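To make “parameters learned by the AI itself” concrete, here is a minimal sketch in Python, a toy under my own assumptions rather than any production system: a single parameter is nudged, example after example, to reduce prediction error. This gradient-descent loop is, at vastly larger scale, the basic mechanism behind ML training.

```python
# Toy sketch of ML training: learning one parameter from examples.
# Systems like GPT-3 adjust billions of parameters in this same spirit.

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (input, expected output) pairs

w = 0.0                # the parameter, starting with no knowledge
learning_rate = 0.05

for step in range(200):
    for x, y in data:
        prediction = w * x
        error = prediction - y
        # Nudge the parameter in the direction that reduces the error.
        w -= learning_rate * error * x

print(f"learned parameter w = {w:.2f}")  # settles near 2.0, i.e. y ≈ 2x
```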

Simplistically, when we look at a pen, we can’t establish more than 4 to 6 parameters to define it, e.g. colour: blue; effect: metallic; glow: golden; length: 7cm; width: 0.5cm. An assistant, on the other hand, armed with a set of ML algorithms, can, through comparisons made across these huge databases, find and define not just a few dozen but hundreds of parameters for that same pen (see Figure 1). These are parameters that we humans may not consciously be able to see, hear, feel, or even understand. But it is these parameters that give access to an immensely fine-grained reality, probably inaccessible to the human brain, and enable the assistant to predict the best word to write, one after another, as well as the best colour, stroke, shadow, volume, and scale to assume when drawing an image.

Figure 1: Multiple layers of parameters, and their interconnections, in a deep neural network. GPT-3, the most advanced system at the moment, consists of 96 layers of artificial neurons, totalling 175 billion parameters [1].
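To give a feel for what hundreds of learned parameters enable, here is a hypothetical sketch in which each object is represented as a short vector of parameter values (the numbers are invented for illustration; real systems use hundreds or thousands of dimensions). Similarity between objects then falls out of simple vector comparison:

```python
import math

# Hypothetical parameter vectors describing objects (values invented).
pen_a = [0.9, 0.1, 0.8, 0.7, 0.2]   # e.g. blue, metallic, golden glow...
pen_b = [0.8, 0.2, 0.7, 0.6, 0.3]   # a similar pen
mug   = [0.1, 0.9, 0.2, 0.1, 0.8]   # a very different object

def cosine_similarity(u, v):
    """How aligned two parameter vectors are (1.0 = identical direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity(pen_a, pen_b))  # ≈ 0.99: very similar objects
print(cosine_similarity(pen_a, mug))    # ≈ 0.33: clearly different
```

It is comparisons of this kind, across many learned dimensions, that let an assistant judge which word, colour or stroke best fits next.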

This process of learning from large databases created by humans presents major problems, ranging from embedded biases to the copyright of their contents. In text, bias becomes apparent quickly, with assistants regurgitating many of the moral biases that pervade the internet; with images, however, even more problems have arisen. Assistants use image databases, such as LAION, that were created for scientific research purposes. Because the goal was to build the most advanced assistants possible, and thus contribute to human innovation, these databases were allowed to catalogue almost everything they could find on the net.

Figure 2: Stable Diffusion “dreaming according to”, from left to right, Gustave Doré, Sebastião Salgado and Loish.

Thus, hundreds of millions of pictures belonging to everyone have been catalogued, from photos taken by ordinary citizens and posted on Flickr, to elaborate photos by professional photographers, to highly detailed illustrations and paintings by artists, found in newspapers, magazines, and museums, as well as in online portfolios such as DeviantArt, ArtStation, Behance, etc. So if we ask one of these assistants for an image in the style of Gustave Doré, Sebastião Salgado or Loish, it has no problem fulfilling our wish, as can be seen in Figure 2.

If this creative ability of AI is remarkable, two inevitable questions underlie it: 1) who gave permission to use these authors’ images?; 2) who gave permission to copy them and create versions of them? The answers we have at the end of 2022 are not good. In the case of Stable Diffusion, despite Professor Björn Ommer saying “we did not go through the Internet and find the images ourselves”, the model does in fact use LAION. At Midjourney, David Holz admits that they used images taken from the internet without any consent, and that they have no idea whose images they used.

Figure 3: Images created by Midjourney.

Despite these problems, which will fuel much discussion in the coming years and probably new legislation, we can say that in the creative field we have reached a point of no return. Note that creativity is defined as the human ability to make “unfamiliar combinations of familiar ideas” [2], which is exactly what these AI systems offer us, as we can see in Figure 3. But this point of no return is not confined to text and images: from this “simple” computational process, the whole world of media creation is being transformed into a completely new artificial world. Today, through simple prompts, anyone can ask AI assistants to create stories, news, drawings, titles, paintings, photographs, graphics, music, voices or animations. This new Artificial Media (see Figure 4) can even win festivals and contests [3]. Right now, anyone without any programming knowledge can ask an AI assistant for guidance in developing a new app and have it create all the necessary code in the programming language they need, as sketched below.

Figure 4: Artificial Media comprises the AI assistants that enable communication and artistic creation. For an exhaustive list of AI assistants, visit Futurepedia: https://www.futurepedia.io.
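As an illustration of how little is needed to drive such an assistant, here is a minimal sketch of requesting code through a prompt. It assumes the openai Python package as it stood at the end of 2022 (the Completion endpoint with the text-davinci-003 model); the API key is a placeholder, and model names change over time:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

# Ask the assistant to write code from a plain-language prompt.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a Python function that reverses the words in a sentence.",
    max_tokens=150,
    temperature=0.2,  # low temperature: more deterministic output
)

print(response["choices"][0]["text"])
```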

In this new artificial world, human beings will be less and less the doers. The computer can make, create, mix, and repeat dozens or hundreds of times an hour until it offers us what we are looking for. Humans will thus no longer have to worry about the practicalities of art, or about the material complexities of each medium. Anyone will be able to create anything, though this will not mean that everyone will be able to create what many of us want to experience. But we have already had this same conversation about Web 2.0 (e.g. [4], [5]). In this new world made of Artificial Media, the human will be an art and communication director, and for that he or she will need not less but much more literacy about the media arts being used.

Notes:

[1] Brown, T. B., Mann, B., Ryder, N., Subbiah, M., et al. (2020). Language models are few-shot learners. arXiv:2005.14165. https://arxiv.org/pdf/2005.14165.pdf.

[2] Boden, M. (2007). Creativity in a nutshell. Think, 5(15), 83–96. doi:10.1017/S147717560000230X.

[3] Glenn Marshall’s AI-generated short “The Crow” (2022) won the Jury Award at the Cannes Short Film Festival, while Jason Allen won the digital arts category of the Colorado State Fair’s art competition with “Théâtre D’opéra Spatial”.

[4] Keen, A. (2007). The Cult of the Amateur: How blogs, MySpace, YouTube and the rest of today’s user-generated media are killing our culture and economy. Hachette UK.

[5] Anderson, C. (2006). The Long Tail: Why the Future of Business is Selling Less of More. Hyperion.

Published in Portuguese in Virtual Illusion.

Nelson Zagalo

Full Professor of Multimedia at the University of Aveiro, Portugal.