The Shape of Art History in the Eyes of the Machine
(The blog is co-authored with Professor Marian Mazzone, Art History, College of Charleston)
Several studies have shown the ability of the machine to learn and predict style categories, such as Renaissance, Baroque, Impressionism, etc., from images of paintings. This implies that the machine can learn an internal representation encoding discriminative features through its visual analysis. However, such a representation is not necessarily interpretable by humans. How does the machine classify styles in art? And how does that relate to art historians’ methods for analyzing style?
At Rutgers’ Art and Artificial Intelligence Laboratory, in collaboration with the College of Charleston, we studied this problem. Our study’s emphasis is on understanding how the machine achieves classification of style, what internal representation it uses to achieve this task, and how that representation is related to art history methodologies for identifying styles.
To achieve such understanding, we utilized one of the key formulations of style pattern and style change in art history, the theory of Heinrich Wölfflin (1864–1945). Wölfflin’s comparative approach to formal analysis has become a standard method of art history pedagogy. Wölfflin chose to separate form analysis from discussions of subject matter and expression, focusing on the “visual schema” of the works, and how the “visible world crystallized for the eye in certain forms”. Wölfflin identified pairs of works of art to demonstrate style differences through comparison and contrast exercises that focused on key principles or features. Wölfflin used his method to differentiate the Renaissance from the Baroque style through five key visual principles: linear/painterly, planar/recessional, closed form/open form, multiplicity/unity, absolute clarity/relative clarity. Wölfflin posited that form change has some pattern of differentiation, such that style types and changes can only come into being in certain sequences. We chose Wölfflin’s theory because of its emphasis on formal, discriminative features and the compare/contrast logic of his system, qualities that make it conducive to machine learning. Today, art historians use a wide variety of methods that are not only focused on form, but for the type of analysis in this study Wölfflin’s approach is useful.
Deep convolutional neural networks have recently played a transformative role in advancing artificial intelligence. We evaluated a large number of state-of-the-art deep convolutional neural network models, and variants of them, trained to classify styles. We focused on increasing the interpretability of the learned representation by forcing the machine to achieve classification with a reduced number of variables without sacrificing classification accuracy. We then analyzed the resulting representations through linear and nonlinear dimensionality reduction of the activation space, visualization, and correlation analysis with time and with Wölfflin’s pairs. We used a collection of almost 80K digitized paintings to train, validate and test the models.
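To make the analysis step concrete, here is a minimal sketch of linear dimensionality reduction of an activation space followed by a correlation check against time. It is an illustration under stated assumptions, not the code used in the study: the names `pca_project` and `pearson` are hypothetical, the activations are synthetic stand-ins for the outputs of a trained style classifier, and the creation years are used only after the fact for analysis, never for training.

```python
import numpy as np

def pca_project(activations, n_components=2):
    """Project rows onto the top principal components (linear reduction)."""
    centered = activations - activations.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic stand-in for real data (assumption): 500 "paintings" with 16
# activation units, where one hidden direction happens to vary with time.
rng = np.random.default_rng(0)
n_paintings, n_units = 500, 16
years = rng.uniform(1400, 2000, size=n_paintings)
t = (years - years.mean()) / years.std()
direction = rng.normal(size=n_units)
activations = 3.0 * np.outer(t, direction) + rng.normal(size=(n_paintings, n_units))

projected, explained = pca_project(activations, n_components=2)
# The sign of a principal component is arbitrary, so report the magnitude.
corr = abs(pearson(projected[:, 0], years))
print(f"variance explained by first mode: {explained[0]:.2f}")
print(f"|correlation| of first mode with year: {corr:.2f}")
```

With real data, `activations` would be replaced by the activations of the layer being studied and `years` by the known creation dates of the paintings.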
One of the main findings of our study is that the machine encoded art history in a smooth chronology, without being given any notion of time. The machine was trained to predict styles, based only on noisy discrete style labels, with no information provided about when each painting was created, when each style took place, which artist created which painting, or how styles are related (such as style x is similar to style y, or came after or before style z). Despite the lack of all this information, the learned representations are clearly temporally smooth and reflect a high level of correlation with time. For example, the plot above shows the images arranged radially and clockwise around the center, forming a complete circle in this 2D projection that starts with Renaissance and ends with Abstract Art. Following the plot clockwise, we can see the progression from Italian and Northern Renaissance at the bottom, to Baroque, to Neo-classicism and Romanticism, reaching Impressionism at the top, followed by Post-Impressionism, Expressionism and Cubism. The loop completes with Abstract and Pop Art.
Another interesting finding, which explains the closed loop we just saw, is that the learned representation can be explained by a handful of factors. The first two modes of variation are aligned with the concepts of linear vs. painterly and planar vs. recessional suggested by Heinrich Wölfflin, and quantitatively explain most of the variance in art history, where temporal progression correlates radially across these modes. We can clearly see the smooth transition from linear form in Renaissance at the bottom towards more painterly form in Baroque, to the extreme case of painterly in Impressionism at the top. Then we can see the transition back to linear form in Abstract and Pop Art styles. Projecting the data onto these two dominant modes of variation, which are aligned with planar vs. recessional and linear vs. painterly, explains why this representation correlates with time in a radial fashion.
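One simple way to quantify a radial correlation of this kind is to convert each point of the 2D projection to a clockwise angle around the center, measured from the bottom of the plot, and correlate that angle with the creation year. The sketch below is only an illustration under stated assumptions: `clockwise_angle_from_bottom` is a hypothetical helper, the points are synthetic and laid out on a noisy arc, and the study itself may have measured this differently.

```python
import numpy as np

def clockwise_angle_from_bottom(points, center=None):
    """Angle in [0, 2*pi) of each 2-D point, measured clockwise from straight down."""
    points = np.asarray(points, dtype=float)
    if center is None:
        center = points.mean(axis=0)  # assumption: use the centroid as the center
    dx = points[:, 0] - center[0]
    dy = points[:, 1] - center[1]
    # arctan2 is counter-clockwise from the +x axis; "down" is at -pi/2,
    # so subtracting from -pi/2 gives a clockwise angle starting at the bottom.
    return np.mod(-np.pi / 2 - np.arctan2(dy, dx), 2 * np.pi)

# Synthetic stand-in (assumption): years mapped onto 90% of a unit circle,
# starting just past the bottom and running clockwise, plus small noise.
rng = np.random.default_rng(1)
years = rng.uniform(1400, 2000, size=300)
theta = 2 * np.pi * (0.05 + 0.9 * (years - 1400) / 600)
points = np.column_stack([
    np.cos(-np.pi / 2 - theta),
    np.sin(-np.pi / 2 - theta),
]) + rng.normal(scale=0.03, size=(300, 2))

angles = clockwise_angle_from_bottom(points)
radial_corr = float(np.corrcoef(angles, years)[0, 1])
print(f"correlation between clockwise angle and year: {radial_corr:.2f}")
```

A plain Pearson correlation works here only because the synthetic arc does not wrap past the starting angle; data that closes the full loop would need a circular statistic instead.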
Visualizing the different representations shows that certain artists were consistently picked by the machine as the distinctive representatives of their styles, as they were the extreme points along the dimensions aligned with each style. This is visible in the first three modes of variation of the representation learned by VGGNet. We can see the Northern Renaissance in the yellow ellipse, with the majority of the paintings that stick out being by Van Eyck and Albrecht Dürer. The Baroque in the black ellipse is represented by Rubens, Rembrandt, and Velázquez. The orange ellipse is Impressionism; at its base are Pissarro, Caillebotte, and Manet as the least painterly of the type, and at the end of the spike are Monet and Renoir as the most painterly. The two red circles are Post-Impressionism: one is dominated by Van Gogh, and the other by Cézanne, who forms the base for the spike of Cubism in the light blue ellipse. This spike is dominated by Picasso, Braque, and Gris, and goes out to the most abstract Cubist works. Most interestingly, the representation separates Rousseau, as marked in the green ellipse, which is mainly dominated by his work.
The representations learned by the machine also highlighted interesting connections. Most notably, as can be seen in Figure Z, Cézanne’s work acts as a bridge between Impressionism on one side and Cubism and Abstract art on the other. Art historians consider Cézanne to be a key figure in the style transition towards Cubism and the development of abstraction in 20th-century art. This bridge of Cézanne’s paintings in the learned representation is quite interesting because it is a quantifiable connection in the data, not just a metaphorical term. We can see branching at Post-Impressionism, where Cézanne’s work clearly separates from the other Post-Impressionist and Expressionist works towards the top. This branch continues to evolve until it connects to early Cubist works by Picasso and Braque, as well as abstract works by Kandinsky.
Another interesting connection is the link between the Renaissance and modern art as captured by the learned representation. Although the structure reflects smooth temporal progression, it is interesting to see outliers to this progression. In particular, there are some High Renaissance, Northern Renaissance and Mannerist paintings that stick out of the Renaissance cluster to the left and connect to art from the late 19th and early 20th centuries. This is because frequent similarity between artworks across time pulled influential works of art out of order and placed them closer to the art they may have influenced. We can see in the figure that the works that stick out of the Renaissance cluster at the left and connect to modernity are mainly paintings by El Greco and Dürer. Among the paintings by El Greco that significantly stick out are Laocoön, Saint Ildefonso, View of Toledo, and Pietà. We can also see works by Raphael, Mantegna, and Michelangelo in this group.
The results of this study highlight the potential role that data science and machine learning can play in the domain of art history by approaching art history as a predictive science to discover fundamental patterns and trends not necessarily apparent to the individual human eye. The study also highlights the usefulness of re-visiting the formal methods in art history pioneered by art historians such as Wölfflin, in the age of data science using tools from computer vision and machine learning. Finally, the study offers insights into the characteristics and functions of style for art historians, confirming existing knowledge in an empirical way, and providing machine-produced patterns and connections for further exploration. The results also show that style, which appears to be a subjective issue, can be computationally modeled with objective means.
The work is published in a paper titled “The Shape of Art History in the Eyes of the Machine”, which appeared at the 32nd AAAI Conference on Artificial Intelligence, held in New Orleans on February 2–7, 2018. The paper can be accessed at https://arxiv.org/abs/1801.07729