(Sur)realistic generative art with neural networks
And how is making neural generative art different from traditional generative art?
Visual artists are always looking for new forms and adapting the tools created by technological progress to artistic purposes. Generative algorithms, which existed long before computing devices, developed enormously once computers arrived. From then until the invention of generative neural networks, a generative algorithm was usually written by hand. Generative neural networks learn the rules of generation from data, so the generative algorithm is found automatically. They create complex, drawing-like images rather than the geometric abstractions typical of conventional, manually programmed generative algorithms. The images generated by neural networks are also much more diverse and unpredictable than those obtained by other algorithms. In this article, I will show some of the differences between traditional generative algorithms and neural networks, and the possibilities that the neural approach opens up.
How traditional generative algorithms work
To create a generative work, you need to describe the laws by which the work will be generated. Any algorithm that produces a certain, often infinite, number of variants can serve as such a law, for example:
- drawing given elements (squares, circles, arbitrary shapes…) with variable parameters (coordinates, sizes, colors…);
- formulas that map a value to a coordinate;
- dynamic algorithms that check the specified conditions at each step and change the positions of the elements following them;
- And so on…
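The first kind of rule in the list above can be sketched in a few lines. This is only a minimal illustration, not a reference implementation: the function name `generate_composition`, the palette, and the size ranges are all made up for the example. It places circles and squares with random coordinates, sizes, and colors and returns the picture as an SVG string.

```python
import random

def generate_composition(seed, n_shapes=20, size=400):
    """Return an SVG string with randomly placed circles and squares."""
    rng = random.Random(seed)  # the same seed always gives the same picture
    palette = ["#e63946", "#f1faee", "#a8dadc", "#457b9d", "#1d3557"]  # arbitrary colors
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">']
    for _ in range(n_shapes):
        # Variable parameters: position, size, color, and shape type.
        x, y = rng.uniform(0, size), rng.uniform(0, size)
        r = rng.uniform(5, size / 8)
        color = rng.choice(palette)
        if rng.random() < 0.5:
            parts.append(f'<circle cx="{x:.1f}" cy="{y:.1f}" r="{r:.1f}" fill="{color}"/>')
        else:
            parts.append(
                f'<rect x="{x:.1f}" y="{y:.1f}" width="{2 * r:.1f}" '
                f'height="{2 * r:.1f}" fill="{color}"/>'
            )
    parts.append("</svg>")
    return "\n".join(parts)

svg = generate_composition(seed=1)
```

Every seed gives a different picture, but all of them are recognizably produced by the same hand-written rule. Writing `svg` to a file with the `.svg` extension lets you open the result in a browser.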
Algorithmic art fascinates with its order, the complexity that emerges from the sum of simple parts, and the play of infinite variations. And even if, after seeing one generated example, it often becomes clear what the rest will look like, each of them still remains unique.
Why are traditional generative algorithms limited?
The aesthetics of generative art are mostly based on geometric abstractions and patterns. The reason lies in the nature of the algorithms. Manually written algorithms cannot reflect the variety of forms and accidents of the complex world: for that, you would have to find and program a practically countless number of features.
This approach is used to design game-world environments, for example to create variants of an object: useful, but probably too repetitive and boring. Even the endless variations of the real world (the people, buildings, and landscapes that we see every day) are each unique, yet they rarely excite us. To make realistic generative art interesting, we must not only copy the world but also combine its parts in unexpected ways.
Neural network-based algorithms free you from setting features, formulas, and constraints manually: they are found automatically during training, as the network looks through the data. Neural networks are capable of generating realistic images while leaving room for unexpected deviations from the familiar.
How generative neural networks work
There are several types of generative networks, but in this article, I will look at generative networks that do not require other images as input. Specifically, I will look at the StyleGAN architecture.
As a basic intuition, a neural network can be pictured as a black box in which the numbers arriving at the input are multiplied by and added to the parameters of the box, converting the input data into other numbers at the output. Training a neural network means finding parameters such that feeding certain values to the input yields useful processed numbers at the output. In the case of generative networks, the input can be random numbers and the output an image. Different random numbers correspond to a different image.
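The black-box intuition can be made concrete with a toy example. This is a rough sketch and not StyleGAN: it is a single layer with hypothetical names (`make_parameters`, `generate`), and its parameters are random numbers standing in for trained ones. It only shows the mechanics: input numbers are multiplied by and added to parameters, and the result is read as pixel values.

```python
import math
import random

LATENT_DIM = 8    # how many random numbers go in (arbitrary for this toy)
IMAGE_SIDE = 16   # the output is a 16x16 grayscale image

def make_parameters(seed=0):
    """Stand-in for trained parameters: here they are just random numbers."""
    rng = random.Random(seed)
    n_pixels = IMAGE_SIDE * IMAGE_SIDE
    weights = [[rng.gauss(0, 1) for _ in range(LATENT_DIM)] for _ in range(n_pixels)]
    biases = [rng.gauss(0, 1) for _ in range(n_pixels)]
    return weights, biases

def generate(z, params):
    """Map a list of input numbers z to a flat list of pixel values in [0, 1]."""
    weights, biases = params
    pixels = []
    for w_row, b in zip(weights, biases):
        s = sum(w * x for w, x in zip(w_row, z)) + b  # multiply and add
        pixels.append((math.tanh(s) + 1) / 2)          # squash into [0, 1]
    return pixels

params = make_parameters()
rng = random.Random(42)
z_a = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
z_b = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
image_a = generate(z_a, params)  # one set of random numbers gives one image
image_b = generate(z_b, params)  # another set gives another image
```

With untrained parameters the output is just noise. Training is what turns the same mechanics into a source of meaningful images.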
To generate images, we do not set the network parameters manually, but find them during training: we show the images that we would like to see, and automatically adjust the parameters so that the images at the output of the network are structurally similar to the original ones. The trained network will produce unique images that look like the original but do not duplicate them, and the variety and type of images will depend on the training data.
Making generative art with neural networks
A trained generative neural network is a complex function that maps a set of random numbers to an image. All possible sets of random numbers form a multidimensional space and, unlike the usual three-dimensional space, in which length, width, and height are independent of each other and have an understandable meaning, the components of the learned space are difficult to interpret.
For generative art, the main thing is to understand how to move in this space. In the simplest case, we take random points from this space, feed them to the input of the network, and get a generated image at the output. The closer the points are in the space, the more similar the generated images will be.
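Moving through the space can be sketched as a straight-line walk between two random points. The example below is an illustration under assumptions: `toy_generator` is a hypothetical single-layer stand-in for a trained network, and the straight line is just the simplest possible path. It measures how far each generated image is from the starting image as we move away from the starting point.

```python
import math
import random

def toy_generator(z, weights):
    """Hypothetical stand-in for a trained network: one layer, one value per pixel."""
    return [math.tanh(sum(w * x for w, x in zip(row, z))) for row in weights]

def lerp(z0, z1, t):
    """Point on the straight line from z0 (t=0) to z1 (t=1)."""
    return [(1 - t) * a + t * b for a, b in zip(z0, z1)]

def image_distance(img_a, img_b):
    """Mean absolute difference between two images."""
    return sum(abs(a - b) for a, b in zip(img_a, img_b)) / len(img_a)

rng = random.Random(0)
latent_dim, n_pixels = 8, 64
weights = [[rng.gauss(0, 1) for _ in range(latent_dim)] for _ in range(n_pixels)]
z_start = [rng.gauss(0, 1) for _ in range(latent_dim)]
z_end = [rng.gauss(0, 1) for _ in range(latent_dim)]

start_image = toy_generator(z_start, weights)
steps = [i / 10 for i in range(11)]
# Distance of each frame from the first one: it grows as we walk away.
distances = [
    image_distance(start_image, toy_generator(lerp(z_start, z_end, t), weights))
    for t in steps
]
```

Rendering the frames along such a path one after another gives the smooth morphing animations typical of neural generative art. A real network is far less regular than this toy, so nearby points giving similar images is a tendency there, not a guarantee.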
The most interesting works are obtained when the network is trained well enough, but at the same time can create impossible combinations and artifacts. The network expects to receive random numbers in a certain range, so changing this range may confuse the network and make it generate less accurate images.
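What "changing the range" means can be sketched with numbers alone, without any network. The example assumes the input is 512 numbers drawn from a standard normal distribution, as in StyleGAN. The sample count and the scale factor of 2 are arbitrary choices for the illustration. Such vectors almost always have a length close to the square root of 512, so a scaled vector lands in a region the network never saw during training.

```python
import math
import random

LATENT_DIM = 512  # assumed input size, as in StyleGAN

def sample_latent(rng):
    """One input vector of independent standard normal numbers."""
    return [rng.gauss(0, 1) for _ in range(LATENT_DIM)]

def norm(z):
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in z))

rng = random.Random(0)
norms = [norm(sample_latent(rng)) for _ in range(1000)]
mean_norm = sum(norms) / len(norms)  # close to sqrt(512), about 22.6
max_norm = max(norms)                # even the longest sample is not much longer

z = sample_latent(rng)
z_scaled = [2.0 * x for x in z]      # "changing the range": stretch the vector
scaled_norm = norm(z_scaled)         # far beyond anything in the 1000 samples
```

Feeding `z_scaled` to a trained network in place of `z` is one simple way to push it toward the distorted, less accurate images described above. Scaling by a factor below 1 does the opposite and pulls the input toward the center of the space.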
Thus, from setting the generation rules, we move on to choosing a topic and selecting data and, after training the network, to exploring the space of random numbers. This kind of generation is like leafing through an album by an unknown artist obsessed with one idea. The work is not created but found.
With the manipulation of a random vector, the network generates dream-like abstractions that are no longer strictly limited to the colors and shapes present in the training data: colors become bright and rich, shapes vague and arbitrary. The network uses palettes with complementary colors: colors that sit opposite each other on the RYB color wheel and are perceived as harmonious.
Even though neural networks are capable of generating complex and realistic structures, it can be difficult to obtain high-quality and interesting results. There are several reasons for this:
- The difficulty of manually specifying formulas is replaced by the difficulty of finding a large amount of data and processing it into a uniform structure. The StyleGAN2-ADA architecture requires approximately 1–10 thousand examples, depending on the complexity of the dataset, to obtain satisfactory results.
- Training one network takes from a few days to a few weeks on top-end GPUs, depending on the dataset size and on the quality and resolution of the generated images.
- It is difficult to control the images: standard implementations do not let you, for example, change some parts of an image without changing the others. Achieving this requires serious changes to the network architecture. It is also difficult to find the significant components of the random space that are responsible for specific qualities of the output.
It is practically impossible to create images as complex as those of a neural network with traditional generative algorithms. We move much further from developing an algorithm toward exploring its capabilities. A neural network becomes a kaleidoscope of the data it was trained on. An artist, like a child, looks at the images and, like a critic, chooses among them. As a result, the neural network allows you to look into a surreal world, the journey through which is as interesting as its images.