Using the Hive Mind: WikiData Integration and Artist Pages

Terry Gould
nationalgalleries-digital
May 29, 2018

It’s now been almost nine months since the launch of artist pages on the new National Galleries of Scotland website. In that time, they have proven popular with users, with approximately 20% of visitors checking out an artist page, using it as a route to explore the artworks in our collections.

Latest Version of the Artist Browser (https://nationalgalleries.org/art-and-artists/artists)

With the new incarnation of the website, a key objective for us was to enrich the information that was available to the user. Previously, artist information was incorporated into our artwork record pages. This design meant that the amount of space for this type of content was at a premium, so biographical information tended to be pretty short text summaries, of about 100–200 words.

An artwork page on the old nationalgalleries.org, with the biography box showing on the right

With the new site design, we have a lot more room to play with and can present much more information to the user than we could before. Some of the content you can now find on artist pages includes:

* Highlight artworks by that artist in our collection.
* Related Features, blogs and long-form content.
* Media, including relevant audio and video.
* A record of past exhibitions where the artist has featured (this currently goes back to 2009).

As well as this internal information, you may see an option on some pages to view biographical information from two external sources: Wikipedia and the Getty’s Union List of Artist Names. This information is pulled into our site dynamically, thanks to the magic of Linked Open Data. This use of information outside of the organisation’s control marked a drastic change in thinking for the galleries, and there was a lot to consider before we decided to go down this particular path.

From Producer to Consumer — Motivations

One major issue with the previous site was the prerequisites for publishing artworks and information, which included the need to have a bespoke artist biography written from within the organisation. This created a natural ‘bottleneck’ for the number of items we were able to publish, and a lengthy backlog of biographies to be researched, written, fact-checked and proofread. This backlog would only continue to grow as our 2D digitisation project rapidly expanded the number of artworks available on the new website (at last count there are about 15,000 people registered as artists, or ‘makers’, across our databases).

This problem was a crucial factor in the initial decision to change our practice of holding back records with reduced interpretation. We wanted our users to have access to as much of the collection as possible, rather than a restricted set of curated artworks. While we still have this well-curated set, it now acts as the tip of a digital iceberg, beneath which people can dive in and explore Scotland’s art for themselves.

That being said, we knew it was still essential to provide some information around the broader collection, both to signpost these works to users and to improve their visibility to search engines. One of our key objectives was to maximise users’ dwell time, and having more information available on the pages helps to motivate users to stay.

Another consideration is the amount of information that already exists about many key artists. If you consider a significant European artist like Gauguin, there are hundreds of existing biographies of him from a wealth of reputable, knowledgeable and trustworthy sources (and many others that lack those particular qualities). Rather than reinvent the wheel for every artist, we can use trusted sources to give our pages the necessary contextual information, allowing our interpretation to focus more closely on the artist’s processes and approaches to the specific works we have in the collection and the stories that we are best placed to tell.

Selecting Sources, Making Links

Once we had decided to take this approach, we investigated some resources that offered their biographical data in a linked, open manner. In the end, we selected Wikidata (the dataset that powers Wikipedia and its sister projects) and the records from the Getty Research Institute’s Union List of Artist Names (ULAN).

We see these resources as complementary to each other. The Getty carries a certain amount of weight within the heritage sector thanks to its reputation as an art historical research organisation. Wikipedia, meanwhile, works well as a piece of foundation content and gives us the scaffolding on which to build our biographical entries and focus storytelling around our collections.

Both providers make their data accessible through Open Licensing (Creative Commons and the Open Data Commons license), which means we can make use of this information freely (provided that we always attribute the source).

Our major technical challenge here was to get the information needed to make the links in the first place (the ULAN IDs for each artist represented in the Getty database, and the Wikidata item IDs). We managed to achieve this with some Python programming, which took the list of artists to publish and ran a SPARQL query for each name against the endpoints run by each provider. The results were then quality assured to ensure we’d got the right ID, which threw up problems around some famous names such as Robert Burns (in this case, we wanted the 20th-century artist rather than the 18th-century poet). With the IDs to hand and recorded in our person authority files, development work was undertaken on the website to allow us to pull the information, and any later updates to it, across into the pages.
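To give a flavour of the lookup step described above, here is a minimal Python sketch of a name-based SPARQL query against Wikidata. It is an illustration rather than the script we actually ran: the function names, the User-Agent string and the exact shape of the query are assumptions made for this example. The Wikidata properties it relies on are P31 (instance of), P245 (ULAN ID) and P569 (date of birth). A name that returns anything other than exactly one match is flagged for manual checking, which is roughly where a case like Robert Burns would surface.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(name):
    """Build a SPARQL query for humans whose English label matches `name` exactly."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?ulan ?birth WHERE {\n"
        f'  ?item wdt:P31 wd:Q5 ; rdfs:label "{escaped}"@en .\n'
        "  OPTIONAL { ?item wdt:P245 ?ulan . }\n"
        "  OPTIONAL { ?item wdt:P569 ?birth . }\n"
        "}"
    )


def parse_candidates(payload):
    """Turn a SPARQL JSON result into a list of simple candidate records."""
    candidates = []
    for row in payload["results"]["bindings"]:
        birth = row.get("birth", {}).get("value")
        try:
            birth_year = int(birth[:4]) if birth else None
        except ValueError:
            # Unknown or unusual date values are left for a human to check.
            birth_year = None
        candidates.append({
            # The item comes back as a full URI; keep only the trailing Q-number.
            "qid": row["item"]["value"].rsplit("/", 1)[-1],
            "ulan": row.get("ulan", {}).get("value"),
            "birth_year": birth_year,
        })
    return candidates


def needs_review(candidates):
    """Anything other than exactly one match goes to manual quality assurance."""
    return len(candidates) != 1


def fetch_candidates(name):
    """Run the query over HTTP (needs network access; the User-Agent is a placeholder)."""
    params = urllib.parse.urlencode({"query": build_query(name), "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_ENDPOINT + "?" + params,
        headers={"User-Agent": "artist-linker-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_candidates(json.load(response))
```

Calling `fetch_candidates("Robert Burns")` would return one record per matching person, each with a Q-number, a ULAN ID where Wikidata holds one, and a birth year to help tell namesakes apart. Exact label matching is a deliberately simple choice here; it will miss artists recorded under a variant spelling, so unmatched names would also need a second look.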

Our New Artist Pages, featuring Getty and Wiki Contributions!

Consumption to Contribution: Identifying Gaps and What Next?

So far, the artist pages have proved to be a valuable resource on the site, with feedback being positive. One exciting result of this work, though, has been identifying the potential gaps that exist in the common knowledge: artists whom we have information about but who are not represented in ULAN or Wiki. This leaves us with an opportunity to embark on new projects going forward, where we can encourage users to help contribute the knowledge we hold to these sources, improving the common, open knowledge around the broader story of art.

The new artist pages can be browsed and viewed by users at https://www.nationalgalleries.org/art-and-artists/artists
