Carbine. The Image Tagging Horse.

Adam Moriarty
AMLabs
Feb 25, 2019 · 7 min read

Using a Facebook chatbot to crowdsource user tags for Collections Online.

I created a simple Facebook Messenger chatbot that shows users images from the collection alongside captions generated by the Microsoft Vision image recognition service. Users were asked to check each caption and provide tags. Unsurprisingly, humans were much better at captioning images than the machine vision service.

This forms part of a talk I gave last year on how museums can tackle the crisis of capacity that we are currently facing.

“The biggest issue facing the business of museums and public history is not dropping visitation numbers, but a crisis of capacity.”

Taylor Stoermer

Background

I've been using machine vision to help auto-caption images from the backlog for a while now. While the process is relatively simple, I quickly discovered that the captions are often inaccurate, and we needed some form of quality control. At first I looked into the gig economy as a solution, but ran into a range of ethical issues, so I decided to look for an alternative.
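For context, the auto-captioning itself boils down to a single API call per image. Below is a minimal sketch using the Azure Computer Vision "describe" endpoint; the endpoint URL, subscription key and image URL are placeholders rather than values from our actual pipeline.

```python
# A minimal sketch of auto-captioning an image with the Azure Computer
# Vision "describe" endpoint. Endpoint, key and image URL are placeholders.
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<subscription-key>"

def describe_image(image_url):
    """Ask the Vision service for a best-guess caption and tags for one image."""
    response = requests.post(
        f"{ENDPOINT}/vision/v3.2/describe",
        params={"maxCandidates": 1},
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/json"},
        json={"url": image_url},
    )
    response.raise_for_status()
    description = response.json()["description"]
    best = description["captions"][0]  # caption text plus a confidence score
    return {"caption": best["text"],
            "confidence": best["confidence"],
            "tags": description["tags"]}

# Example: print(describe_image("https://example.org/collection/12345.jpg"))
```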

Last year chatbots seemed to be all the rage, the latest tech buzzword. In 2017 the number of chatbots on Facebook Messenger grew to 100,000. For museums, chatbots could offer new ways to enhance the audience experience: Objectphone, Send Me SFMOMA and The Voice of Art have all allowed serendipitous exploration of collections and provided innovative ways to engage new audiences.

The Carnegie Museum released research on the use of mobile devices in museums. It showed that 93% of people have a smartphone on them when they visit and 92% have Facebook Messenger installed on their phones.

This research presents an opportunity to use an existing, well-established technology as a fast and convenient way to extend our cataloguing capability.

After reviewing several systems, I decided on Chatfuel: it was free, simple, and easy to integrate with Facebook Messenger. There were essentially six steps I needed the system to complete (a code sketch of this flow follows below):

  1. Greet the user.
  2. Show the user an image from the collection together with the caption auto-generated by the Microsoft Vision service.
  3. Ask the user if the caption matches the image.
  4. If it does, move on to the next image.
  5. If the caption doesn't match, ask for three keywords that describe the image, then move on to the next image.
  6. When the user is finished, thank them for their effort.
A sample interaction.
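The pilot itself was built in Chatfuel without writing any code, but to make the six steps above concrete, here is a rough console-based sketch of the same flow in Python. The image filenames and captions are invented placeholders, not items from the collection.

```python
# Illustrative only: the real bot was built in Chatfuel with no code.
# This sketch expresses the six-step flow as a plain console conversation,
# using hypothetical image/caption pairs.
IMAGES = [
    ("photo-001.jpg", "a group of people standing next to a horse"),
    ("photo-002.jpg", "a large building with a clock tower"),
]

def run_session():
    print("Hi, I'm Carbine! Help me check some captions.")        # 1. greet the user
    collected = {}
    for image, caption in IMAGES:                                  # 2. show image + caption
        print(f"\nImage: {image}\nCaption: {caption}")
        answer = input("Does the caption match the image? (yes/no) ")  # 3. ask
        if answer.strip().lower().startswith("y"):
            continue                                               # 4. caption fine, next image
        keywords = input("Please give three keywords for this image: ")  # 5. ask for tags
        collected[image] = [k.strip() for k in keywords.split(",")]
    print("\nThanks for your help!")                               # 6. thank the user
    return collected

if __name__ == "__main__":
    all_tags = run_session()
    print(all_tags)  # tags collected for the captions the user rejected
```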

The Chatfuel UI allows you to drag and drop content, quickly add rich media, and run simple tests. I ran a pilot for two weeks using the staff and volunteers in the museum tea room as guinea pigs. To keep it simple I only used the very basic natural language processing that Chatfuel provides, and relied on buttons and quick replies to keep the conversation moving (this did make the pilot feel more like ‘filling in a form’ than a conversation). Every user was shown the same images in the same order so that I could aggregate and compare the results, which were exported to a Google Sheet for review.

I had read several articles recommending that chatbots have a personality, so I chose Carbine, a stuffed horse in the collection. This allowed me to make a few equestrian jokes within the chat and gave users a character to talk to.
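As an illustration of the review step, here is a short sketch of how tags exported to the Google Sheet could be aggregated per image. The column names ("image" and "tags") are hypothetical; the real export simply followed the layout of Chatfuel's plugin.

```python
# A sketch of aggregating the exported results, assuming the Google Sheet
# has been downloaded as CSV with hypothetical columns "image" and "tags"
# (comma-separated user tags).
import csv
from collections import Counter, defaultdict

def tag_counts(path):
    """Count how often each tag was suggested for each image."""
    counts = defaultdict(Counter)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            tags = [t.strip().lower() for t in row["tags"].split(",") if t.strip()]
            counts[row["image"]].update(tags)
    return counts

# Example: the five most common tags per image across all users
# for image, counter in tag_counts("carbine_export.csv").items():
#     print(image, counter.most_common(5))
```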

Results

The tags added by the Microsoft Image Recognition System.

In the two weeks running in the staffroom, we had 41 users tag items. Almost everyone completed the first two images, 90% did three, and then we saw a steep drop-off, with only 25% of users completing five images. The users added rich tags: Latin names for plants in images, names of subjects, and even the address of one location (as expected from a group of professional cataloguers!). What was fantastic was seeing the different disciplines within the organisation tag the photographic collection with their own vocabularies (e.g. the difference between a botanist and a teacher) and wondering just how rich the content might be if we were to open this tool up to the public.

The tags from the chatbot

I didn’t expect the system to be so simple and easy to use, or its integration with Facebook to be so seamless. It didn’t require any coding or technical skills to make it work, and realistically it took me a day of work (or less) to get the pilot up and running.

After the pilot, we spoke with many of the participants and we kept getting the same feedback:

“I’m feeling a bit used …”

Perhaps the simple, easy-to-use ‘cookie cutter’ system does have its limitations. It creates a form-filling design that only works to a point, and we need to think of ways to make the conversation more engaging, or to provide some form of reward for participants. Maybe we could include rich content or stories from the collection, or somehow show users how they have contributed to the cataloguing effort? Still, as a first step this was remarkably quick and simple to get up and running, and it provides an interesting insight into how we might use a new technology to keep pace with our increasing workload.

What's Next?

Obviously the target group of museum staff means these results are perhaps not a great representation of how the bot would work in the wild. However, with this pilot completed, we are looking at other crowdsourcing platforms, such as Zooniverse, that already have an existing group of passionate and enthusiastic users who might enjoy the challenge of this work. Having seen the content generated by museum specialists from different disciplines, I would also like to consider how we could use the chatbot for highly targeted crowdsourcing (maybe at events, or within a particular location) to see how it would cope with a larger, knowledgeable and equally engaged audience.

As with all the projects we have conducted so far, the aim is to open up the uncatalogued archives and share them with the widest possible audience — meeting people where they already are, using technology they already have.

The details below this point are slightly nerdy and explain how the Chatfuel system works… you have been warned.

How does it work?

There are four main components of the chatbot: Cards, Blocks, System Attributes and Custom Attributes.

Cards are the most basic building components of a bot. They can contain anything from a simple text message or image to plugins that allow more advanced actions.

Blocks are containers for one or more cards. When a block is triggered, the user receives all the cards it contains. Blocks can be linked together to create the flow of the conversation. I used three blocks for Carbine, each containing multiple cards: a welcome block with instructions; a block containing the images, captions and user-created tags; and a thank-you block.

Left: The welcome block showing the multiple cards in Chatfuel. Right: The same Block as seen by the user.
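To make the block-and-card idea concrete, here is a rough sketch of that three-block structure expressed as plain Python data. This is purely illustrative: it is not Chatfuel's actual data model, and the card fields and texts are invented for the example.

```python
# Not Chatfuel's real data model - just an illustrative sketch of the
# three-block structure used for Carbine, with cards as simple dicts.
welcome_block = {
    "name": "welcome",
    "cards": [
        {"type": "text", "text": "Hi, I'm Carbine the stuffed horse!"},
        {"type": "text", "text": "I'll show you images and their auto-generated captions."},
        {"type": "quick_replies", "text": "Ready to start?",
         "replies": [{"title": "Let's go", "go_to": "images"}]},
    ],
}

images_block = {
    "name": "images",
    "cards": [
        {"type": "image", "url": "photo-001.jpg"},  # placeholder image reference
        {"type": "quick_replies",
         "text": "Caption: 'a group of people standing next to a horse'. Does it match?",
         "replies": [{"title": "Yes", "go_to": "next_image"},
                     {"title": "No", "save_input_to": "user_tags"}]},
    ],
}

thank_you_block = {
    "name": "thank_you",
    "cards": [{"type": "text", "text": "Thanks for helping catalogue the collection!"}],
}
```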

To help personalise the experience, the system provides access to System Attributes: the information the bot receives about a user from Facebook. You can’t edit this data, but you can use information such as the user's name, gender, location or Facebook profile picture within cards.

The user data that Facebook provides (from the Chatfuel manual)

Custom Attributes allow you to save inputs from users; in my case, these were the suggested tags. Using an existing plugin, these custom attributes were added to a Google Sheet as soon as the user entered them.
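I used Chatfuel's built-in Google Sheets plugin for this, so no code was needed, but for anyone curious what that step looks like done by hand, here is a minimal sketch using the gspread library. The credentials file and sheet name are hypothetical.

```python
# A sketch of appending one row of custom attributes (user, image, tags)
# to a Google Sheet with gspread. The credentials file and sheet name
# are placeholders; the pilot used Chatfuel's built-in plugin instead.
import gspread

def save_tags(user_name, image, tags):
    """Append a single user's tags for one image as a new row."""
    client = gspread.service_account(filename="credentials.json")
    worksheet = client.open("Carbine tags").sheet1
    worksheet.append_row([user_name, image, ", ".join(tags)])

# Example: save_tags("Adam", "photo-001.jpg", ["horse", "taxidermy", "museum"])
```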

Easy!


Museums, Digital stuff, Linked Data, Open Access, Head of Information + Library @aucklandmuseum