When we use Google or another major search engine, it is often remarkable how reliably it pulls up results for almost anything asked of it. On many other websites, say a hastily assembled department store site with a search bar, a simple search practically crashes the site. Part of the difference comes down to page ranking in high-profile search engines like Google, but well-defined metadata and controlled vocabularies also play a large role.
Controlled vocabularies allow data to be organized through three kinds of relationships: equivalence, hierarchical, and associative. Equivalence relationships deal with the variety of ways to name an object. For example, a Fencai vase and a Famille Rose vase are essentially the same thing. Hierarchical relationships place terms in broader and narrower categories, so that a specific term sits beneath the more general term it belongs to. Associative relationships connect terms that are closely related without being equivalent or hierarchical. Controlled vocabularies are often used to organize digital assets, research data, collection items, and born-digital works in educational and cultural institutions such as museums, heritage sites, and cultural repositories. The end goal is to make this data, which is often hidden or seen mainly in person, more widely accessible.
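To make the three relationship types concrete, here is a minimal Python sketch of a toy vocabulary. Everything in it is an assumption made for illustration: the `Concept` class, the `VOCAB` table, the helper functions, and the sample terms and links are hypothetical and are not taken from any published vocabulary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One entry in a toy controlled vocabulary (illustrative data only)."""
    preferred: str                                   # label the catalogue displays
    variants: set[str] = field(default_factory=set)  # equivalence: other names for the same thing
    broader: str | None = None                       # hierarchical: the parent term, if any
    related: set[str] = field(default_factory=set)   # associative: linked but distinct terms


# Hypothetical sample entries, keyed by preferred label.
VOCAB = {
    "ceramics": Concept("ceramics"),
    "porcelain": Concept("porcelain", broader="ceramics"),
    "famille rose": Concept(
        "famille rose",
        variants={"fencai"},
        broader="porcelain",
        related={"famille verte"},
    ),
    "famille verte": Concept(
        "famille verte", broader="porcelain", related={"famille rose"}
    ),
}


def resolve(term: str) -> str | None:
    """Equivalence: map any known name to its preferred label."""
    key = term.strip().lower()
    for concept in VOCAB.values():
        if key == concept.preferred or key in concept.variants:
            return concept.preferred
    return None


def ancestors(term: str) -> list[str]:
    """Hierarchy: follow the broader-term chain upward from a term."""
    current = resolve(term)
    chain = []
    while current is not None and VOCAB[current].broader is not None:
        current = VOCAB[current].broader
        chain.append(current)
    return chain


def related(term: str) -> list[str]:
    """Association: list terms linked to this one, sorted for stable output."""
    current = resolve(term)
    return sorted(VOCAB[current].related) if current else []
```

With this sketch, `resolve("Fencai")` returns the preferred label `"famille rose"`, `ancestors("Fencai")` walks up through `"porcelain"` to `"ceramics"`, and `related("Fencai")` lists `"famille verte"`. A real vocabulary would use stable identifiers rather than labels as keys, but the three relationship types work the same way.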
There are many controlled vocabularies for a variety of disciplines, and some institutions may even opt to create their own vocabulary for internal use. In the art world, the Getty Vocabularies are the most robust vocabularies available. The Getty Research Institute has created four distinct vocabularies, which are constantly being updated, amended, and translated into multiple languages. AAT (Art & Architecture Thesaurus) is the Getty’s oldest vocabulary and consists of generic terms for describing art and material culture. TGN (Getty Thesaurus of Geographic Names) lists the names of historical places, architecture, and geographic features. ULAN (Union List of Artist Names) lists artists, collectors, patrons, and groups involved in art or material culture. Lastly, the newest vocabulary, CONA (Cultural Objects Name Authority), deals with built and movable works. The vocabularies can link to one another. For example, an artist in ULAN may be located in an area described in TGN and use a material listed in AAT. This web of links between the vocabularies allows researchers to navigate seamlessly through related data sets. The vocabularies are built through a variety of partnerships with libraries, indexes, and databases. Contributors submit new terms in batches for the Getty to add to its vocabularies.
The Getty Research Institute has made all of its vocabularies available for download as linked open data to anyone with the ability to support them. With linked open data, items that are “linked together conceptually” are open to use by anyone. For many institutions, this means that there is an opportunity to use the Getty’s controlled vocabularies free of charge, provided their systems can support them. Combining controlled vocabularies with existing metadata creates a much more accessible repository, with common relationships and variant names already built in. To create a digital collection that responds to the queries of the majority of visitors, a controlled vocabulary is a must.
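As a rough illustration of what combining a controlled vocabulary with existing metadata can look like, the Python sketch below expands a visitor's query through an equivalence table before matching it against catalogue titles. The `PREFERRED` table, the `RECORDS` list, and both functions are hypothetical; they do not use the Getty data or any real collection system.

```python
# Hypothetical equivalence table: every known name points to one preferred term.
PREFERRED = {
    "fencai": "famille rose",
    "famille rose": "famille rose",
}

# Hypothetical catalogue records, titled the way different cataloguers might write them.
RECORDS = [
    {"id": 1, "title": "Fencai vase with peony design"},
    {"id": 2, "title": "Famille rose vase with bird design"},
    {"id": 3, "title": "Blue and white bowl"},
]


def index_terms(record):
    """Return the preferred terms whose names occur in a record's title."""
    title = record["title"].lower()
    return {preferred for name, preferred in PREFERRED.items() if name in title}


def search(query):
    """Return ids of records that share the query's preferred term."""
    key = query.strip().lower()
    target = PREFERRED.get(key)
    if target is None:
        # The query is outside the vocabulary: fall back to plain substring matching.
        return [r["id"] for r in RECORDS if key in r["title"].lower()]
    return [r["id"] for r in RECORDS if target in index_terms(r)]
```

Here a search for either "Fencai" or "Famille Rose" returns records 1 and 2, even though each title uses only one of the two names. A plain keyword search would have found just one of them.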
Article by — Christopher Rahmeh