Looking at Sarjeant Gallery’s collection through robot eyes
We’ve launched an online collection site for Sarjeant Gallery Te Whare o Rehua Whanganui. It’s the result of a few months of experimentation and collaboration with the Gallery team. This first phase includes options to explore the collection by colour and image orientation, automated subject tagging using Google Vision API, basic statistics on the breakdown of the collection, and some natural language text which we build up from the raw collection management data and then display on the artwork detail page.
We’ve been working on a new website to showcase the collection of the Sarjeant Gallery in Whanganui, New Zealand. The Gallery’s collection consists of over 8,000 artworks and archival items spanning four centuries of European and New Zealand art history. The collection contains works in all media, including paintings by old masters and contemporary artists, drawings, photographs, sculptures and ceramics.
Until now only a small portion of the collection has been available online, with the goal for this project being to publish almost the entire collection. A major challenge was to see what was possible without spending significant time reworking the existing cataloguing records and images. Sarjeant Gallery has a small team, so any steps that required revisiting large sections of the collection data were going to be unrealistic.
The Gallery wanted the site to be easy to explore without assuming any prior knowledge of the collection. How could the collection be presented so that general visitors could make the most use of it? With the new website the Gallery had three broad aims: introduce innovative features, make the site as accessible as possible, and meet current technical best practice.
The project wouldn’t have been possible without the enthusiasm and flexibility of the Gallery staff. The Sarjeant Gallery were open to prototyping and experimentation and worked closely with Vernon Systems (particularly our web developer Shinoy Sam) to see what might be possible with their collection data. Automated analysis of the collection images provided us with a range of options.
The site is built on top of our online collection application, Vernon Browser, so we started with a working wireframe: unbranded templates with no customisation. This allowed us to get the data onto a test site right from the beginning of the project. We then went through many iterations of exporting and exploring the data to decide what content was the most usable. We also decided where the Gallery’s limited time would be best spent on improving the data.
For example, we found that by adding nationalities to the 40 countries of birth recorded in the system we were able to display nationality as a search option for the majority of the artists. We also selected the artists whose works represented the majority of the collection’s works (approximately 100 artists) and added links to the Wikipedia and Te Ara (the encyclopedia of New Zealand) pages about these artists.
Explore by colour
Extracting the dominant colours from the images was straightforward. However, building navigation based on the colours proved much more difficult than we expected. The challenge was that there are more than 16 million possible colours, so any precise colour detected in an image was rarely present in any other artwork in the collection. We looked at reducing this to a basic 16 colour palette (like the search filter on Flickr), but after testing we settled on the 140 named CSS colours supported by modern browsers. Magenta and fuchsia are the same colour code, so our final palette has 139 colours.
We used Sven Woltmann’s Java port of Color Thief, an open source colour extractor, to grab up to five of the most dominant colours in the primary image for the artwork. We then wrote our own decision tree to find the best match from our 139 named colours for each raw colour detected. The palette size is a trade-off: a large palette provides closer matches to the original colours in the images, but a smaller palette allows more images to be grouped together.
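The matching step can be illustrated with a minimal sketch. This is not our decision tree: it swaps in a simple nearest-neighbour match on squared RGB distance, and it uses only a handful of the named CSS colours. The function name `nearest_named_colour` and the sample dominant colours are assumptions made for the example.

```python
# Illustrative subset of the named CSS colours (the site's palette has 139).
CSS_COLOURS = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "gold": (255, 215, 0),
    "saddlebrown": (139, 69, 19),
    "steelblue": (70, 130, 180),
}


def nearest_named_colour(rgb, palette=CSS_COLOURS):
    """Return the palette name with the smallest squared RGB distance to rgb.

    Hypothetical helper: a plain nearest-neighbour match, standing in for
    the decision tree described in the text.
    """
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))

    return min(palette, key=distance)


# Made-up dominant colours, as a colour extractor might return them.
dominant = [(250, 250, 245), (60, 120, 170), (130, 75, 30)]
names = [nearest_named_colour(colour) for colour in dominant]
print(names)
```

With a larger palette each raw colour lands closer to its match, but fewer artworks end up sharing a name, which is the trade-off described above.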
We display the CSS colour names on the site and the colour name and colour groups (red, green, etc.) are indexed as part of the text for the artwork.
Filter by image orientation
One element of metadata we already had was the orientation of the image: landscape, portrait or square. We’ve added this as one of the filter options and index the orientation as a text keyword.
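For collections that only store pixel dimensions, the orientation keyword could be derived along these lines. This is a sketch under assumptions: the `orientation` function and the 5% tolerance for treating near-square images as square are ours for illustration, not how the Gallery’s metadata was recorded.

```python
def orientation(width, height, tolerance=0.05):
    """Classify an image as 'landscape', 'portrait' or 'square'.

    `tolerance` (an assumed value) is how far the aspect ratio may drift
    from 1.0 before the image stops counting as square.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    if abs(ratio - 1) <= tolerance:
        return "square"
    return "landscape" if ratio > 1 else "portrait"


print(orientation(1200, 800))
print(orientation(800, 1200))
print(orientation(1020, 1000))
```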
Subject tags from Google Vision
Automated image recognition is moving forward quickly. Cogapp have an excellent site presenting their trials of three of the main tagging APIs: Clarifai, Microsoft Azure Computer Vision API, and Google Cloud Vision API. Based on Cogapp’s tests and our own internal testing we settled on the Google Cloud Vision API. At this stage we’re only using the subject tagging feature, but the API provides other features such as detecting text within an image.
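As a rough illustration of subject tagging, the sketch below builds the JSON body for a label detection request to the Cloud Vision REST endpoint (`images:annotate`) and pulls the tag text out of a response. It makes no network call. The helper names, the example image URL, the `min_score` cut-off and the sample response (including its scores) are all assumptions written by hand for the example, not output from the API or code from our site.

```python
import json


def build_label_request(image_uri, max_results=10):
    """Build the JSON body for POST https://vision.googleapis.com/v1/images:annotate."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


def extract_labels(response, min_score=0.6):
    """Return lower-cased label descriptions scoring at least min_score (assumed cut-off)."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [
        a["description"].lower()
        for a in annotations
        if a.get("score", 0) >= min_score
    ]


# Hand-written stand-in for a response; the scores are invented.
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Pasture", "score": 0.93},
                {"description": "Herd", "score": 0.88},
                {"description": "Goat", "score": 0.41},
            ]
        }
    ]
}

body = build_label_request("https://example.org/artwork.jpg")  # hypothetical URL
print(json.dumps(body, indent=2))
labels = extract_labels(sample_response)
print(labels)
```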
The original plan was to make the tags a private field to aid the curators when they create artwork sets to share with the public. However, we’ve been surprised by how good the results are. The tags aren’t always perfect, but the new connections between the artworks are almost always interesting.
Many of the works in the Gallery’s collection do not have subject descriptions. Adding automated subject tags has provided a rich range of new terms to find related artworks. “Cattle on a Beach” is a typical example. The tags are mostly correct and now enable us to jump to all of the related artworks that depict “pasture” or “herd”. However, we can see that the work is incorrectly tagged as “goat” and “goats”. We’ve decided that the additional navigation options the tags provide are still worthwhile, but we’ll be adding an option for the curators to disable specific tags if they want to permanently hide individual dubious tags.
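The planned curator override could be as simple as a per-artwork list of hidden tags that is applied before display and indexing. The following is a sketch only: the `visible_tags` function and the way disabled tags are stored are assumptions, since this feature has not been built yet.

```python
def visible_tags(tags, disabled):
    """Return tags in their original order, dropping any a curator has disabled.

    Matching is case-insensitive; `disabled` is a hypothetical per-artwork
    collection of tag strings.
    """
    hidden = {tag.lower() for tag in disabled}
    return [tag for tag in tags if tag.lower() not in hidden]


# Tags from the "Cattle on a Beach" example in the text.
cattle_tags = ["pasture", "herd", "goat", "goats"]
shown = visible_tags(cattle_tags, disabled=["Goat", "goats"])
print(shown)
```

Keeping the raw tags and filtering at display time means a curator’s decision can be reversed without calling the tagging service again.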
Once you combine the text keywords provided by subject tags, named colours and image orientation, the site provides new opportunities for exploring the collection. Need a set of bearded men for Movember? How about pictures of a house with the colour white? We want users to be engrossed in their browsing of the collection, serendipitously discovering the connections between the works.
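To show why combining the three keyword sources is useful, here is a minimal sketch of a keyword index that answers queries such as “house” plus “white”. The record layout, the artwork identifiers and the function names are all hypothetical; the real site relies on Vernon Browser’s own text indexing.

```python
from collections import defaultdict


def build_index(artworks):
    """Map each lower-cased keyword to the set of artwork ids that carry it."""
    index = defaultdict(set)
    for art_id, record in artworks.items():
        keywords = record["tags"] + record["colours"] + [record["orientation"]]
        for keyword in keywords:
            index[keyword.lower()].add(art_id)
    return index


def search(index, *keywords):
    """Return the ids of artworks matching every keyword given."""
    matches = [index.get(keyword.lower(), set()) for keyword in keywords]
    return set.intersection(*matches) if matches else set()


# Made-up records for illustration only.
artworks = {
    "A1": {"tags": ["house", "tree"], "colours": ["white", "green"], "orientation": "landscape"},
    "A2": {"tags": ["house"], "colours": ["saddlebrown"], "orientation": "portrait"},
    "A3": {"tags": ["beard", "man"], "colours": ["white"], "orientation": "portrait"},
}

index = build_index(artworks)
print(search(index, "house", "white"))
print(search(index, "portrait"))
```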
An online collection as a microsite
We looked at the pros and cons of building a microsite for the online collection vs an online collection fully integrated into the main site. A microsite is independent from the main organisation website, so can be built in different technology and have its own navigation options, layout and branding. We’ve gone for a microsite as this has allowed us to base the site around an existing working product, and to limit the amount of customisation required. In total we’ve spent around seven days altering the branding and layout of Vernon Browser for this project.
I’ll write a follow-up post about accessibility, but for now I’ll summarise the key considerations. The site has been tested for compliance with web accessibility standards; responsive templates provide layouts for mobile, tablet and desktop screens; images that are out of copyright are marked to allow for re-use by visitors; buttons are provided for sharing via social media; metadata on the pages improves the search engine optimisation; the search results page has clear options for results filtering; and we provide links to navigate easily from each record to related information. We have achieved a site where there are no dead ends.
The website is built on top of an API to provide potential options in the future for data re-use and new interfaces. The API will be used to share the records with New Zealand’s cultural heritage aggregator — DigitalNZ.
We’re closely following the use of the site, primarily through Google Analytics and AddThis. We’re particularly interested in what routes the visitors take on the site, what features get the most use, and what content is being shared.
The site is now in a high care phase where we’re making small corrections and enhancements as things are found on the live site. Sarjeant Gallery is continuing to work through copyright clearance from the rights holders, so more images are being continuously added. We are also investigating generating graphs of acquisitions over decades to show how the collection has grown over time.
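The data behind an acquisitions graph could be prepared roughly as follows. This is a sketch of one possible approach, not something we have built: the function name and the sample acquisition years are invented for the example.

```python
from collections import Counter
from itertools import accumulate


def acquisitions_by_decade(years):
    """Return (decade, acquired, running_total) tuples in decade order."""
    counts = Counter((year // 10) * 10 for year in years)
    decades = sorted(counts)
    totals = accumulate(counts[decade] for decade in decades)
    return [(decade, counts[decade], total) for decade, total in zip(decades, totals)]


# Invented acquisition years, for illustration only.
sample_years = [1912, 1919, 1926, 1984, 1988, 1989]
growth = acquisitions_by_decade(sample_years)
for decade, acquired, total in growth:
    print(f"{decade}s: {acquired} acquired, {total} in total")
```

The running total is what shows the collection growing; the per-decade count shows when that growth happened.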
Lastly, the Gallery staff will be using the new subject, colour and orientation keywords to help them create interesting sets and stories to share on the site.
We hope you enjoy using the site as much as we’ve enjoyed working on it.