This article was written in Spring 2022, and the world has moved on: Generative AI now produces stunning images from a user's word prompts. Does Generative AI produce art, and who, or what, is the artist?
In 2022 I said that I was the artist and the computer was a tool. Computers could not be artists until they were able to initiate a creative process. For that, I said the computer would need to have self-knowledge. But Generative AI has found a different way.
There is a saying: “there is nothing new under the sun”. Creativity is a process that finds a new way of looking at existing stuff. That’s what Generative AI does. Therefore, Generative AI produces art and it is the artist. The human user is merely a client going through a prompt engineering process to persuade the computer to create something valuable.
That’s fine. But where does that leave the human artist? Well, it is still possible to do interesting stuff with NST and Stable Diffusion. And there is always paint…
I like to paint. My favourite materials are acrylics. I like the way you can move the paint around the canvas until the magic happens. More about that later.
I like to program. For fifteen years, I was heavily into computer games and we produced many of the award-winning flight sims of that golden era for the PC.
Bringing painting and programming together is a natural for me and I have been using a technique called Neural Style Transfer (NST).
With NST, you take an image, typically a photograph, and impose on it a new style, typically the style of a famous artist. It was made famous by Gatys and others around 2015, and since then there have been a number of improvements, mainly around doing the job more quickly.
I have included some examples of NST in action in this article. The image at the top is a sunset on the north coast of Tenerife in the style of Van Gogh. The other images are small samples and take a couple of minutes to produce on a powerful machine with a 12GB graphics card. But bigger images, say 600mm square, would take a couple of hours to produce, and that is with the biggest machine I could hire from Amazon on this side of the Atlantic last year!
Is it Art? And who is the artist?
In a nutshell, I think it is art and I am the artist. Not the computer. Someday, computers will become artists, but not yet. For now they are tools for an artist, just like paintbrushes and knives.
Right now, it is possible to get computers to appear to be artists, but it is a sham. They are just pretending. You might disagree, and that is fine; it just depends on your idea of what art is. If you agree with my idea of what art is, I think you will agree that computers are not artists, yet.
For me, a piece of art is the result of a creative process that is appreciated by human beings for its emotional impact and perhaps its beauty. So, the beauty of nature is not art because it is not the result of a creative process. Well, I don’t think so anyway. If a computer could initiate a creative process to deliver a thing of beauty or emotional impact, then it could be called an artist.
This will happen. After all, as Brian Greene said in a 2021 podcast, ‘a computer is a collection of particles just like us’. I am not so sure about the particles bit but I understand his point.
Crucially, though, computers cannot currently initiate a creative process. They have no inner self or self-knowledge. There is nothing there to get things going. Programmers can simulate it to an extent, but programmers cannot give a computer life!
This will happen. When it does, the computer may not produce art that human beings value for its beauty and emotional impact. Really, it should produce art that moves it and its fellow computers. Anything else would be to enslave it, wouldn’t it? So actually, computers may never produce art when art is defined as something that human beings find beautiful or produces an emotional response in them. Computer art would be for computers. That is very fanciful and perhaps deep. But why shouldn’t a creative computer come with emotional baggage like we do?
NST: the painting process
Currently then, the computer can be a tool for artists to explore their creativity.
With paint, brushes, knives and canvas, I have an idea of what I want to produce and explore it on the canvas. It doesn’t always work but you know when it does. Artists get to know what works for them and as a result they develop a style.
I expect experienced artists explore less than me. But I think everyone starts with this trial-and-error process. In the computer world we would call this an iterative and interactive process: you develop a solution to a problem by trying things out on the computer, with the computer giving you instantaneous feedback.
This iterative and interactive method is the most productive and satisfying way of using computers. Humans do what they are good at and computers do what they are good at.
The alternative of the human providing the data and the computer doing all the work can be dangerous, is certainly not satisfying and provides no learning or development for the people using the process.
We know that using machine learning delivers tremendous benefits to society, for instance in medical diagnostics. But these systems are susceptible to ugly biases. Let’s find ways of putting a human being in the loop, designing systems that are interactive rather than systems that make humans redundant.
NST uses deep learning techniques and, although it isn’t really an example of machine learning, it is a great example of the iterative and interactive process.
Start with an image, say a photograph, where you can see the essence of something powerful in terms of beauty or emotional impact. Something that could communicate, make contact, share a feeling, perhaps tell a story. Feed the computer with different styles, including your own, to try to amplify the essence you see. Then tweak the many variables that go to make the picture and try again. The whole cycle can take just a couple of minutes.
Art is a human emotional response. That is what I meant earlier when I said, “the magic happens”. When I am painting I have an idea about what I want to produce but it is difficult to verbalise it. So I move my paint until the idea appears. It is the same with NST.
NST: the computer process
Over the last few years there have been enormous strides in the branch of artificial intelligence called machine learning. Most of the development has been in what is called deep learning.
Deep learning structures are artificial neural networks with many layers. An artificial neural network attempts to recognise underlying relationships in data by mimicking the way the human brain works.
Deep learning makes use of the Universal Approximation Theorem, which states that, given any continuous function, no matter how complicated it is, it is always possible to find an artificial neural network (ANN) that approximates that function as closely as you want. There is a great explanation here.
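Stated a little more formally, the classic one-hidden-layer version of the theorem says roughly this (the notation here is mine, not the article’s; σ is any suitable non-linear activation):

```latex
% Universal Approximation Theorem (one-hidden-layer form):
% for any continuous function f on a compact set K and any tolerance epsilon,
% some finite network of N neurons stays within epsilon of f everywhere on K.
\forall \varepsilon > 0 \;\; \exists \, N,\, a_i,\, w_i,\, b_i : \quad
\sup_{x \in K} \left| f(x) \;-\; \sum_{i=1}^{N} a_i \, \sigma\!\left(w_i^{\top} x + b_i\right) \right| \;<\; \varepsilon
```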
This means, more or less, that there is always a set of equations with loads of variables that will fit your data. But you have to find the values of those variables. The computer does this by guessing, seeing how good the guess is, modifying the guess and trying again. This process is called training and can involve thousands of iterations. It doesn’t always work, but with a bit of experience it is possible to set things up so that the computer converges to a useful solution. For instance, with a set of cat and dog photos, you can train a network to tell you whether a previously unseen photo is of a cat or a dog. In fact, these days, convolutional neural networks (CNNs) are so good that they can tell you what kind of cat or dog it is.
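For readers who like to see code, here is a minimal sketch of that guess-measure-adjust loop in PyTorch. The `model` and `train_loader` names are stand-ins for whatever network and labelled cat/dog images you actually have; this is an illustration, not the exact setup I use.

```python
import torch
import torch.nn as nn

# A minimal sketch of the "guess, measure, adjust" training loop.
# `model` and `train_loader` are placeholders for a real network and
# a real set of labelled cat/dog images.
def train(model, train_loader, epochs=10, lr=1e-3):
    criterion = nn.CrossEntropyLoss()                      # how good was the guess?
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    for epoch in range(epochs):                            # thousands of iterations in practice
        for images, labels in train_loader:
            optimizer.zero_grad()
            outputs = model(images)                        # the guess
            loss = criterion(outputs, labels)              # how far off it was
            loss.backward()                                # which way to adjust each variable
            optimizer.step()                               # modify the guess and try again
```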
So, to solve a problem you need loads of data. The process involves splitting your data into three parts. Use the first part to train the network to find the underlying relationship. Use the second part to see how well the ANN has learnt that relationship, then tweak the ANN and repeat. When you think you have a good model, use the third part to test it. Using this held-back test data ensures that your solution is general and not specific to the training data. This whole process, in which you train the network with your data, is called machine learning.
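In code, the three-way split is only a few lines. This sketch uses PyTorch’s random_split on a hypothetical `full_dataset` of labelled photos; the 70/15/15 proportions are a common choice, not a rule.

```python
from torch.utils.data import DataLoader, random_split

# Split a hypothetical `full_dataset` into the three parts described above.
n = len(full_dataset)
n_train, n_val = int(0.7 * n), int(0.15 * n)
n_test = n - n_train - n_val
train_set, val_set, test_set = random_split(full_dataset, [n_train, n_val, n_test])

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)  # to find the relationship
val_loader   = DataLoader(val_set, batch_size=32)                  # to tweak the model
test_loader  = DataLoader(test_set, batch_size=32)                 # touched once, at the end
```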
With NST, we don’t do any of this. I do use a CNN, but I don’t train it: I use one that has already been pre-trained on thousands of images. This network is deep and holds representations of images at many levels. It can recognise the shapes, lines and colours that make up an image, and the data at different levels of the network give us different kinds of information.
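In practice, “using a pre-trained CNN” looks something like the sketch below. I have assumed torchvision’s VGG19 (the network Gatys used) and a set of layer indices commonly chosen for style transfer; the exact choices are illustrative rather than a recipe.

```python
import torch
import torchvision.models as models

# Load a CNN that has already been trained on thousands of images (ImageNet).
# We never train it here; its weights stay frozen.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Different depths describe the image at different levels: early layers pick up
# colours and brush-like texture, deeper layers pick up shapes and structure.
STYLE_LAYERS = {0, 5, 10, 19, 28}    # illustrative indices into vgg.features
CONTENT_LAYERS = {21}

def get_features(image):
    """Push an image through the frozen CNN, keeping the activations we care about."""
    feats, x = {}, image
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS or i in CONTENT_LAYERS:
            feats[i] = x
    return feats
```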
As I said, with the basic version of NST, I don’t train the CNN. Instead I use the trained CNN to optimise the final image. Starting with a random set of pixels, the final image is obtained by iteratively moving closer and closer to an optimum solution. That is all a bit vague but imagine that in our final image each pixel is the closest it can be to both the starting photograph and the style image.
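Concretely, the basic optimisation looks roughly like this. It reuses `get_features()` from the sketch above and a `gram_matrix()` helper shown in the next snippet (that is the “one more trick” discussed below); the loss weights and step count are examples of the variables I tweak between runs, not definitive values.

```python
import torch

# A compressed sketch of the basic (Gatys-style) pixel optimisation. The CNN is
# never changed; only the pixels of `image` are.
def style_transfer(content_img, style_img, steps=500,
                   content_weight=1.0, style_weight=1e6):
    content_targets = get_features(content_img)
    style_targets = {i: gram_matrix(f)
                     for i, f in get_features(style_img).items()
                     if i in STYLE_LAYERS}

    # Start from a random set of pixels and nudge them towards the optimum.
    image = torch.rand_like(content_img, requires_grad=True)
    optimizer = torch.optim.Adam([image], lr=0.02)   # optimising pixels, not weights

    for step in range(steps):
        optimizer.zero_grad()
        feats = get_features(image)

        # Stay close to what the photograph depicts...
        content_loss = torch.mean((feats[21] - content_targets[21]) ** 2)
        # ...while taking on the texture, colour and brushwork of the style image.
        style_loss = sum(torch.mean((gram_matrix(feats[i]) - style_targets[i]) ** 2)
                         for i in STYLE_LAYERS)

        loss = content_weight * content_loss + style_weight * style_loss
        loss.backward()
        optimizer.step()

    return image.detach().clamp(0, 1)   # keep pixel values in a displayable range
```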
There is one more trick in the process. Gatys found that optimising the style against the Gram matrix of the CNN features, rather than against the pixels themselves, produces great results. Why? That is complicated and I haven’t seen a good explanation. However, when I produced a few example Gram matrices, I saw how good they are at generalising an image, and hence a style.
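The Gram matrix itself is only a few lines: it throws away where features occur in the image and keeps only how strongly they occur together, which is one intuition for why it captures style rather than layout. This is the helper assumed in the sketch above.

```python
import torch

def gram_matrix(features):
    """Gram matrix of a CNN feature map: correlations between feature channels.

    Spatial positions are flattened away, so what remains says which textures,
    colours and strokes co-occur, not where they are -- a crude summary of style.
    """
    b, c, h, w = features.size()             # batch, channels, height, width
    f = features.view(b, c, h * w)           # flatten the spatial dimensions
    gram = torch.bmm(f, f.transpose(1, 2))   # channel-by-channel dot products
    return gram / (c * h * w)                # normalise so layer sizes are comparable
```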
If you want to know more of the detail of the process, there are plenty of Medium articles to read, and example code is available on GitHub.
The point here is that I use the basic pixel-optimising process rather than the quicker approach where a CNN is actually trained. I use the basic method because it gives me more control, so I can produce the art I want rather than being limited to what the computer offers. There is a drawback, though: computer memory and speed.
Imagine we want to produce a 20 inch square image at magazine quality of 300 dpi. We would need to handle 36 million pixels. There are three colour channels (red, green and blue), so about 100 million data points. We have to store these as floating point numbers, so that is nearly a billion bytes, or a Gigabyte (GB), just to hold the picture. The neural network will have many layers of information for each of these pixels, say 30, so we are up to 30GB, and then there is the intermediate data. To optimise, that is to make the next guess, we measure gradients just like in high school maths. Each point has many gradients associated with it, so you can see how the memory gets eaten. The biggest computer I could hire from Amazon on this side of the Atlantic has 96GB, so I have a problem.
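The back-of-envelope arithmetic, written out (assuming 8 bytes per number to match the figures above; 32-bit floats would halve them):

```python
# Rough memory arithmetic for a 20-inch-square image at 300 dpi.
side_px = 20 * 300                         # 6,000 pixels along each side
pixels = side_px ** 2                      # 36 million pixels
values = pixels * 3                        # ~108 million numbers (red, green, blue)

picture_bytes = values * 8                 # ~0.9 GB just to hold the picture
layers = 30                                # say 30 layers of network information
activation_bytes = picture_bytes * layers  # ~26 GB before intermediate data and gradients

print(f"picture:     {picture_bytes / 1e9:.1f} GB")
print(f"activations: {activation_bytes / 1e9:.0f} GB, plus gradients on top")
```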
I did try storing the numbers in fewer bytes (reducing their precision), but that didn’t work.
I am going to try super-resolution, where the computer takes an image and makes it bigger by guessing what the extra pixels should be. Yes, it uses machine learning for this.
All this, though, is getting in the way of what I want to do: produce art. I said earlier that with NST people tend to use the styles of famous artists. I would like to try mixing styles, trying different brushwork and using some invented styles to capture and share what I see in the source image. Move the paint around and see the magic appear.