How I Designed a Hidden Art Extraction Tool with Siamese Networks | Part 4. GSoC ’24
Following my previous posts about the ‘Art Extract’ project for Google Summer of Code (GSoC) 2024, this is the final part of how I designed a hidden art extraction tool inspired by a Siamese Network. In this stage of the project, which focuses on uncovering hidden paintings beneath the canvas using multispectral images, I’ll share my architectural decisions and outline potential areas for future improvement.
Ideation
Given two sets of inputs, multispectral images and the ground truth (GT) image of the painting, how would you design a Hidden Art Extraction model?
The first step I took was to redefine the model’s purpose in technical terms. “Hidden” art refers to unseen, unrecognizable artwork beneath the GT image, which is why we use multispectral images to reveal clues about what lies underneath.
Hidden Art = Dissimilarity between the multispectral image and the GT image
= | GT image − Multispectral image |
In other words, calculating the difference between these two inputs serves as a solid foundation for uncovering hidden art. The task of comparing dissimilarity between two images is well-known in the field of image verification, where the goal is to determine if two images are the same or different.
A Siamese Network, also known as a twin network, is one of the most popular neural architectures for this task. The structure is quite simple: you pass two images through identical convolutional layers that share the same weights, flatten the outputs into feature vectors, and then compare them using a distance or similarity measure (such as L1 distance, L2 distance, or cosine similarity). The network outputs a similarity score (or probability), depending on the activation function used.
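To make the twin structure concrete, here is a minimal NumPy sketch rather than the project's actual network. It assumes single-channel images and uses one hand-set convolution kernel as the shared "branch"; the names embed and compare are hypothetical and only serve the illustration.

```python
import numpy as np

def embed(image, kernel):
    """Toy shared branch: one valid 2D convolution, ReLU, then flatten."""
    kh, kw = kernel.shape
    # Every kh x kw window of the image, shape (H-kh+1, W-kw+1, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(image, (kh, kw))
    features = np.maximum((windows * kernel).sum(axis=(-2, -1)), 0.0)
    return features.ravel()

def compare(a, b, kernel):
    """Run both images through the SAME branch and compare the embeddings."""
    fa, fb = embed(a, kernel), embed(b, kernel)
    diff = fa - fb
    return {
        "l1": np.abs(diff).sum(),                  # L1 distance
        "l2": np.sqrt((diff ** 2).sum()),          # L2 distance
        # Cosine is a similarity: higher means more alike
        "cosine": fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-8),
    }
```

The key point is that both inputs go through the same weights (here, the same kernel), so any difference in the embeddings comes from the images and not from the branches.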
But how do we modify the Siamese Network to uncover hidden paintings? There were two key challenges I had to address. First, we need to compare a single GT image with eight multispectral images, transforming the task into a 1:8 comparison. Second, instead of outputting a numerical difference, the result should be an image.
Methodology
To address these challenges, I approached it in the following way. First, I generated a dissimilarity map for each multispectral image by iteratively calculating its difference from the ground truth. Then, I normalized each dissimilarity map to ensure a fair comparison and applied thresholding to ignore extreme or meaningless differences. After the normalization and thresholding processes, I created a final collage by selecting, for each pixel position, the highest remaining dissimilarity value across the maps, constructing an image that highlights the hidden art.
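The steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the project's code: images are single-channel arrays scaled to [0, 1], normalization is per-map min-max, and the two cut-offs (threshold and upper) are placeholder values chosen for the example. The function name hidden_art_collage is hypothetical.

```python
import numpy as np

def hidden_art_collage(gt, bands, threshold=0.2, upper=0.95):
    """gt: (H, W) array; bands: (N, H, W) array; values assumed in [0, 1]."""
    # 1. One dissimilarity map per multispectral band
    maps = np.abs(bands - gt[None, :, :])
    # 2. Min-max normalize each map independently so they are comparable
    lo = maps.min(axis=(1, 2), keepdims=True)
    hi = maps.max(axis=(1, 2), keepdims=True)
    maps = (maps - lo) / (hi - lo + 1e-8)
    # 3. Drop differences that are too small (noise) or too extreme
    #    (both cut-offs are assumed values, not the project's settings)
    keep = (maps >= threshold) & (maps <= upper)
    maps = np.where(keep, maps, 0.0)
    # 4. Collage: per pixel, keep the largest surviving value across maps
    return maps.max(axis=0)
```

With eight multispectral images, bands would have shape (8, H, W) and the result is a single (H, W) image, which addresses both the 1:8 comparison and the need for an image-valued output.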
Currently, I’ve designed a simple sequence of convolutional layers with a residual connection inspired by ResNet, aimed at enhancing the discovery of hidden details. This allows the multispectral images to pass through additional feature extraction layers. Depending on the complexity of the images, you can adjust the depth or intricacy of the convolutional layers for better results.
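For readers unfamiliar with residual connections, the sketch below shows the idea in plain NumPy: two 3x3 convolutions whose output is added back to the input before the final activation. It is a simplified, untrained, single-channel illustration; the function names, the edge padding, and the two-layer depth are my assumptions here and do not reproduce the project's actual layers.

```python
import numpy as np

def conv3x3_same(x, kernel):
    """3x3 convolution that keeps the input size (edge padding is an assumption)."""
    padded = np.pad(x, 1, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    return (windows * kernel).sum(axis=(-2, -1))

def residual_block(x, k1, k2):
    """conv -> ReLU -> conv, then add the input back (the skip connection)."""
    out = np.maximum(conv3x3_same(x, k1), 0.0)
    out = conv3x3_same(out, k2)
    return np.maximum(out + x, 0.0)
```

Because the input is added back, the block only has to learn a correction on top of the original image, so fine details in the multispectral input are not lost as layers are stacked.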
For easier analysis by users, such as art historians, the output is visualised as above. Areas with the greatest difference between the two input images can be highlighted in blue, offering clear insights into the hidden layers of the artwork.
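One simple way to produce such a blue highlight is to blend a blue layer over a grayscale version of the painting, weighted by the dissimilarity value at each pixel. The sketch below assumes both inputs are (H, W) arrays in [0, 1]; the function name highlight_blue and the alpha parameter are hypothetical, and the project's own rendering may differ.

```python
import numpy as np

def highlight_blue(gray, dissimilarity, alpha=0.6):
    """Blend blue over a grayscale image; returns an (H, W, 3) RGB array."""
    rgb = np.repeat(gray[:, :, None], 3, axis=2).astype(float)
    blue = np.array([0.0, 0.0, 1.0])
    # Per-pixel blend weight: 0 leaves the pixel untouched, alpha is the maximum tint
    weight = (alpha * np.clip(dissimilarity, 0.0, 1.0))[:, :, None]
    return (1.0 - weight) * rgb + weight * blue
```

Lowering alpha keeps more of the original painting visible under the tint, which can help when the dissimilarity map is dense.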
Possible Improvements
As demonstrated, the tool successfully highlights areas of interest, enabling users to make assumptions about hidden figures. However, there are some potential improvements to consider for future iterations.
1. Optimization
If we had a dataset of hidden images corresponding to the ground truth (GT) dataset, we could optimize the tool using backpropagation. This would allow us to find optimal weights, not only maximizing dissimilarity but also focusing on capturing the hidden figure itself. In the current state, without a “hidden painting answer key,” optimizing the model only leads to highlighting areas of dissimilarity, often resulting in a saturated blue output.
2. User-Friendly Guidance
The tool could become more practical and user-friendly by incorporating feedback from targeted users, such as art historians. By interviewing art restoration experts, we could better understand the visual factors and processes involved in uncovering hidden paintings. This feedback would help us refine the tool and make it more valuable for art research and reconstruction efforts.
If you have any ideas for a better approach, please feel free to leave a comment below. Visit this GitHub link to find out more about the project!