Is Cave Art Cave Art, and What’s in the Box

Cave art is one of the most representative and richest forms of prehistoric art. Chauvet Cave in France, one of the most famous sites discovered so far, holds some of the oldest known cave art, dating back about 36,000 years. When we talk about “cave art”, most of the time we mean prehistoric examples such as Chauvet Cave. These prehistoric cave paintings are, in a sense, the origin and prototype of later cave art and wall painting. Cave art is thought to bring together imitation, games, religious events, and other forms of expression practiced by prehistoric humans while they sheltered in caves. Its exact origin may be unknown, but its influence clearly lasted into later eras and was reproduced in almost all major civilizations. The Mogao Caves in Dunhuang, China, are a magnificent part of this inheritance.

Chauvet Cave wall painting
Possibly hunting scenes painted on a wall of Chauvet Cave

The Mogao Caves, also known as the Thousand Buddha Grottoes or Caves of the Thousand Buddhas, form a system of about 500 temples containing countless art treasures of ancient Chinese Buddhism. Compared with primitive cave art, which focuses on recording and imitating scenes of hunting and gathering, ancient Chinese cave art reads more like a religious collection and a vehicle for cultural dissemination. Exactly why ancient people built these grand cave systems is currently unclear, but their aesthetics, like those of their prehistoric predecessors, strike the viewer’s heart regardless of how much time has passed. The Mogao paintings are more complex and resplendent than prehistoric art, which is simple and abstract. Fine yet strong lines outline people and objects, which are then filled with gorgeous colors. Combined with the distinctive texture of the stone walls, the paintings have a unique and quirky beauty. Today the colors have faded and the walls remain mottled even after dedicated restoration, yet to some extent this only adds to their strange and magnificent character.

Mogao Caves painting of the immortals (Feitian)

My parents were born in Gansu, the province where the Mogao Caves are located, so I had real chances to visit them during vacations. I was struck by the caves’ beauty at first sight. That is also why I wanted to use StyleGAN2 to generate some Mogao Caves art of “my own”.

For data collection, I crawled images from several Mogao Caves-themed boards on Pinterest.com. Because some paintings are still being restored, and because cameras may damage the paintings, not much data is available on the internet. I merged the pictures obtained from different sources and, after a series of preprocessing steps and manual selection, deleted low-quality, irrelevant, and duplicate pictures. The final dataset contains 809 relatively decent pictures. Overall, however, the usability of the dataset is still limited because raw data is so difficult to collect.
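The duplicate-removal part of that cleanup can be partly automated. The following is only a minimal sketch, not the procedure actually used for this dataset: it assumes the crawled images sit in one local folder, and it only catches byte-for-byte identical files (a resized or re-encoded copy of the same painting would still need manual review). The function names and the folder name in the usage comment are hypothetical.

```python
import hashlib
from pathlib import Path

# Assumed set of image file types; adjust to whatever the crawl produced.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def split_duplicates(folder: Path) -> tuple[list[Path], list[Path]]:
    """Split the image files in `folder` into (kept, duplicates).

    The first file seen for each digest (in sorted name order) is kept;
    any later file with identical bytes is reported as a duplicate.
    """
    seen: set[str] = set()
    kept: list[Path] = []
    duplicates: list[Path] = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip subfolders and non-image files
        digest = file_digest(path)
        if digest in seen:
            duplicates.append(path)
        else:
            seen.add(digest)
            kept.append(path)
    return kept, duplicates


# Hypothetical usage (folder name is an assumption):
# kept, duplicates = split_duplicates(Path("mogao_raw"))
# print(len(kept), "kept;", len(duplicates), "exact duplicates found")
```

Judging image quality and relevance still has to be done by eye; the hash check only removes the most mechanical part of the work.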

Dataset
The collected dataset

Runway ML, a packaged machine learning GUI application, was used to generate the images. In the first stage, all 809 images were fed into StyleGAN2 and trained for 3,000 iterations (epochs).

Sample images could be viewed throughout the training process. Because the model was fine-tuned via transfer learning from a model pre-trained on a bird illustration dataset, the early samples were birds onto which the cave art style was gradually transferred. At iteration 450, the birds in the samples were becoming creepy, but they could still be recognized as birds (creepy birds, maybe). After 1,000 epochs, almost no bird features remained. By 3,000 epochs, not a single region still resembled a bird; in their place, the output had become some unknown representation. Because the FID score looked as if it could decrease further, I trained the model for 3,000 more epochs. However, the score stopped changing at about 1,200 iterations and settled at 73. To prevent overfitting, I took the model from the checkpoint at 1,000 epochs.
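The "stop once the score no longer improves" decision described above can be written down as a simple plateau check. This is a sketch under stated assumptions, not what Runway ML does internally: the score is assumed to be one where lower is better, `patience` and `min_delta` are thresholds I picked for illustration, and the numbers in the usage comment are made up rather than taken from the actual training log.

```python
def best_checkpoint_before_plateau(history, patience=3, min_delta=0.5):
    """Pick the checkpoint to keep once the score has plateaued.

    history:   list of (iteration, score) pairs in training order,
               where a lower score is better.
    patience:  number of consecutive checkpoints without a meaningful
               improvement that counts as a plateau (assumed value).
    min_delta: smallest decrease that counts as an improvement
               (assumed value).

    Returns the iteration of the best checkpoint seen before the
    plateau, or None if the score never plateaus (or history is empty).
    """
    if not history:
        return None
    best_iteration, best_score = history[0]
    stale = 0  # checkpoints seen since the last meaningful improvement
    for iteration, score in history[1:]:
        if best_score - score > min_delta:
            best_iteration, best_score = iteration, score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return best_iteration
    return None


# Hypothetical usage with made-up scores (not the real training log):
# log = [(200, 150.0), (400, 110.0), (600, 90.0), (800, 82.0),
#        (1000, 80.0), (1200, 80.2), (1400, 79.9), (1600, 80.1)]
# best_checkpoint_before_plateau(log)
```

Keeping the last checkpoint that still improved, rather than the very last one trained, is one common way to limit overfitting on a small dataset.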

Epochs 450
Samples at 450 epochs
Epochs 1080
Samples at 1,080 epochs
Epochs 3000
Samples at 3,000 epochs
Epochs 4000
Samples at 4,000 epochs
Latent space transition

The results were not that satisfying. Compared with the images generated at 3,000 epochs, they did not differ much. This outcome was not unexpected. I already knew that the quality of the dataset was poor, and transferring directly from a model pre-trained on bird illustrations was never going to make the process smoother. The principle of transfer learning is to start from the most similar pre-trained model, but on Runway ML only the bird illustration model was available.

generated
Some of the generated images

In terms of how neural networks function, images are considered to have hierarchical features, from shallow to deep. Taking the cave art images as an example, the shallow features correspond to the general atmosphere, such as texture or hue; the middle layers are responsible for, say, lines and strokes; and the deepest features may capture the detailed content of the paintings. We can also discuss this in terms of StyleGAN2. The StyleGAN family similarly describes styles at different hierarchical levels[1], which for cave art I think of as texture/hue, line/stroke, and subtle content. Judging from the results, the network evidently failed to extract the fine style, or in other words the high-level (deeper) features. The generated images resemble generic paintings on rocky walls rather than capturing a specific type or exact style of painting.

Styles in StyleGAN
Hierarchical styles in StyleGAN

Synchronicity of NFT

After generating the images, and although they may not be of fair quality by the standards of art, I still tried to mint one of the outputs as an NFT and publish it on hicetnunc.xyz.

Published image
The generated image I chose to publish

Admittedly, the environmental issues related to digital currencies (e.g. Bitcoin) and digital tokens (e.g. NFTs) have been controversial in recent years. This could be a real problem that is worth caring about. However, I am inclined to say that the solution should be more than banning or opposing these novelties.

For example, in China, most Bitcoin mining companies are based in Sichuan Province, which has an overcapacity of renewable energy (no substantiated reference has been published, for policy reasons). Large differences in terrain elevation and well-developed river systems let hydropower generate far more energy than people need in daily life. If this excess electricity is not used sensibly, it is simply wasted. Those involved in Bitcoin are gradually becoming aware of the high energy consumption of the blockchain industry, and they are consciously using green energy or the excess renewable electricity mentioned above, even if they do so to reduce costs. In other words, when we compare the environmental cost of digital tokens such as Bitcoin and NFTs with the carbon emissions of a city, a country, or even the whole earth, we should not be one-sided, but should look dialectically at factors such as what kind of energy these digital currencies use, their scale, and their industry value.

DeepDream
The strong embodiment of high-level features

Correspondingly, people should pay more attention to the benefits of NFTs rather than blindly blaming them. NFTs once again broaden how the value of art can be expressed, which to some extent complements GAN or AI art.

Let’s talk about AI first. AI technology opens up more possibilities for art, from who creates it to how it is created, and from what humans can do to what humans have never thought of doing. The operations inside a neural network are like a black box. For a long time, people did not know what was going on inside and only knew that the output was satisfactory. Even now, research on the concrete effect of each layer in a neural network (e.g. Google’s DeepDream[2]) is still young. Take GANs as an example. In the process of extracting features from the training set, a GAN often shows unexpected, weird features. Our own thinking is a black box in the same way: people almost never understand what goes on between seeing a piece of art and understanding it. This innovation is undoubtedly expanding the possible combinations of art.

As for NFTs, in the social and market dimensions, artworks minted as NFTs are no longer limited to traditional art forms like paintings. Anything created and circulating on the Internet can now be given value by this novel means. It can be a meme or even a picture you generated with a GAN. Some ethical controversies are inevitable, and at this stage simply using GANs to generate pictures without any required skill may not truly be regarded as art. But given time to develop, AI art can certainly grow systematically into a unique aesthetic and become a form that keeps pace with other major art forms.

And if you are not willing to believe it, just look at photography in the 19th century.

References

[1] Karras, Tero, et al. “A Style-Based Generator Architecture for Generative Adversarial Networks.” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, doi:10.1109/cvpr.2019.00453.

[2] “Inceptionism: Going Deeper into Neural Networks.” Google AI Blog, 17 June 2015, ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html.
