Turbo Charge Your Keywording with LightRocket’s Magic Keyword Thesaurus

Yvan Cohen
LightRocket Enterprise
Oct 17, 2023
Photo by lil artsy

A good keyword thesaurus is every archive manager’s dirty little secret. It is the unsung hero of your archive, the beating heart of a well-managed searchable collection.

To extend the physiological analogy: if your pictures are the flesh and bones of your archive, tags are the cells that bring your collection to life, making files easily and accurately retrievable, which is, after all, the ultimate goal of digital asset management.

Given the recognized importance of effective keywording (some people call it tagging), it is strange that the value of a thesaurus, a tool designed to ‘turbo charge’ your keyword selections, is so often overlooked.

Now that I’ve got your attention, you’re probably wondering what exactly a keyword thesaurus looks like and how it works.

What is a keyword thesaurus for digital archive management?

Put simply, a thesaurus is a database of terms that you can select from to keyword your files. So far so unimpressive, I hear you say.

The beauty of a thesaurus, however, lies in how all those precious terms are organised. The magic of a well-structured thesaurus is derived from the hierarchical relationship between terms, which cascade downwards from general categories (like health or conflict) towards ever more specific terms. In thesaurus-speak, a broader category like health would be the ‘parent’ term of more specific ‘child’ terms like diseases, which in turn cascade down to their own ‘child’ terms: the names of specific diseases.

As unexciting as it may sound, it is the hierarchical connections between terms which form the basis of an effective thesaurus, making keywording not only faster but more comprehensive.

In practice, this means that if you select a specific term, your thesaurus will offer up all its related parent terms. For example, if you pick a disease name like malaria (say your picture shows a patient with malaria), the thesaurus will automatically add ‘parent’ terms like ‘diseases’ and ‘health’.
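For readers who like to see the mechanics, here is a minimal Python sketch of that parent lookup. It is only an illustration, not how LightRocket implements it: the `PARENT` map and the `expand_with_parents` function are hypothetical names, and the three terms are simply the ones from the malaria example.

```python
# Hypothetical child -> parent map using the terms from the example above.
PARENT = {
    "malaria": "diseases",
    "diseases": "health",
}


def expand_with_parents(term, parent_map=PARENT):
    """Return the selected term followed by its ancestors, nearest first."""
    keywords = [term]
    seen = {term}
    while term in parent_map:
        term = parent_map[term]
        if term in seen:  # guard against an accidental loop in the data
            break
        keywords.append(term)
        seen.add(term)
    return keywords


print(expand_with_parents("malaria"))  # ['malaria', 'diseases', 'health']
```

One selection walks up the tree and collects every broader term along the way, which is why a single click can yield several keywords.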

Added benefits of a good keyword thesaurus

It doesn’t stop there. A powerful keyword thesaurus will not only add the keyword you selected and its parent terms, it will also contain a database of alternate terms, ‘use-fors’ and synonyms. These either have almost identical meanings or are alternative forms of the same word (so in the case of a disease name, the thesaurus will also add common and scientific names). This means that with just one term selection you will be adding multiple relevant keywords (synonyms and alternate forms) to your file. That’s a lot of value for a single click of your mouse!
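As a rough sketch of how synonyms and parent terms might combine, the Python below stores alternates alongside each term. Everything here is assumed for illustration: the `Term` class, the `THESAURUS` dictionary, the `keywords_for` function and the sample alternates are placeholders, and the choice to add the alternates of parent terms as well as those of the selected term is an assumption rather than a description of any particular product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    name: str
    parent: Optional[str] = None
    alternates: List[str] = field(default_factory=list)  # synonyms, 'use-fors'


# Illustrative entries only; a real thesaurus would hold thousands of terms.
THESAURUS = {
    "health": Term("health"),
    "diseases": Term("diseases", parent="health", alternates=["illnesses"]),
    "malaria": Term("malaria", parent="diseases", alternates=["paludism"]),
}


def keywords_for(selection, thesaurus=THESAURUS):
    """Collect the selected term, its alternates, and the same for each parent."""
    keywords = []
    visited = set()
    current = selection
    while current in thesaurus and current not in visited:
        visited.add(current)
        entry = thesaurus[current]
        for word in [entry.name, *entry.alternates]:
            if word not in keywords:  # keep the list free of duplicates
                keywords.append(word)
        current = entry.parent
    return keywords


print(keywords_for("malaria"))
# ['malaria', 'paludism', 'diseases', 'illnesses', 'health']
```

Here one selection produces five keywords: the term itself, its alternate form, and the broader terms with their own alternates.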

The other rather neat thing about a good hierarchical thesaurus is that by forcing you to browse terms in a ‘tree’ structure (terms literally branch out from the main trunk of general categories), you will be prompted to think of other related and relevant terms — further enhancing and enriching the keywording process.

As valuable as it is to the management and effective indexing of digital assets, the keyword thesaurus remains a surprisingly rare beast among digital asset management (DAM) systems.

Photo by Tara Winstead

Trying to fill the gap with AI just doesn’t cut it

Many platforms try to paper over this gaping hole with the promise of automated tagging, driven by artificial intelligence (AI). As promising as AI tagging may sound, however, the reality of keywording is that we still need a human to determine the meaning of a picture and thus the relevant terms that should be associated with it.

So, while a thesaurus will certainly make keywording easier and more comprehensive, it can’t automate the process. You still need a good old human to decide which terms to add.

Keywording is something of an art that requires judgement. It is a process that you will always be refining and improving as you learn which keywords work best for your collection. A good thesaurus is also very much an organic entity that should always be evolving and growing as you discover new terms to add.

You will sometimes hear a thesaurus referred to as a controlled vocabulary. This is because the relationship between terms needs to be carefully balanced and, yes, controlled. While it may be tempting to open up your thesaurus to a free-for-all of suggestions from your staff, it is wise to remember that an effective thesaurus is a structured database of vocabulary that needs controlling. Each new term should be reviewed and approved before it is added to your thesaurus.
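To make the idea of ‘controlled’ concrete, here is a small Python sketch of a review step in which suggestions wait in a queue until someone approves them. The `ControlledVocabulary` class and its method names are hypothetical, and the lower-casing of terms is just one possible normalisation rule assumed for the example.

```python
class ControlledVocabulary:
    """Approved terms plus a queue of suggestions awaiting review."""

    def __init__(self, approved=()):
        self.approved = {t.strip().lower() for t in approved}
        self.pending = {}  # suggested term -> who suggested it

    def suggest(self, term, suggested_by):
        """Queue a term for review; return False if it is already approved."""
        term = term.strip().lower()
        if term in self.approved:
            return False
        self.pending.setdefault(term, suggested_by)
        return True

    def approve(self, term):
        """Move a reviewed term from the queue into the vocabulary."""
        term = term.strip().lower()
        if term not in self.pending:
            raise KeyError(f"{term!r} has not been suggested")
        del self.pending[term]
        self.approved.add(term)

    def reject(self, term):
        """Drop a suggestion without adding it."""
        self.pending.pop(term.strip().lower(), None)


vocab = ControlledVocabulary({"health", "diseases", "malaria"})
vocab.suggest("Dengue", suggested_by="picture desk")
print("dengue" in vocab.approved)  # False: still waiting for review
vocab.approve("dengue")
print("dengue" in vocab.approved)  # True
```

Staff can suggest freely, but nothing becomes a keyword option until a reviewer signs it off.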

Lastly, it’s worth remembering that thesauri tend to be built for a specific purpose. At LightRocket, for example, we have developed a thesaurus with over 40,000 terms (including alternate forms and synonyms) that has been built up over almost a decade with the classification of pictures in mind. There are other thesauri which are focused, for example, on medical terms, or you could develop your own thesaurus to meet the needs of your industry or organisation.

Interested in getting to know more?

Get in touch with us today to find out how we can help improve your digital asset management.

Yvan Cohen
LightRocket Enterprise

Yvan has been a photojournalist for over 30 years. He’s a co-founder of LightRocket and continues to shoot photo and video projects around South East Asia.