On artificial intelligence, museums, and garbage

Ariana French
May 7, 2018

Since the 1960s, “garbage in, garbage out” (GIGO) has been a familiar concept to technologists, scientists, and mathematicians. It’s a simple idea: The quality of output is directly related to the quality of input. For example, if you want to predict the price of cars and use inconsistent date formatting (MM/DD/YY vs. YYYY/MM/DD etc.) to describe historical trends, you’re going to get bad output mud instead of good output honey. It’s a problem that plagues everyone from NASA to Coca-Cola.
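To make the date example concrete, here is a minimal sketch in Python of cleaning mixed date formats before they reach a model. The two formats, the sample values, and the `normalize_date` helper are assumptions made up for illustration. Note that a value like 05/07/18 is read here as May 7, which is itself a guess a real project would have to confirm against the data's source.

```python
from datetime import datetime

# Formats assumed for this illustration; a real dataset may mix others.
KNOWN_FORMATS = ("%Y/%m/%d", "%m/%d/%y")


def normalize_date(raw: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD), or raise ValueError."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # Not this format; try the next one.
    # Failing loudly beats silently passing garbage downstream.
    raise ValueError(f"Unrecognized date format: {raw!r}")


# Hypothetical sale dates recorded in inconsistent formats.
sale_dates = ["2018/05/07", "05/07/18", "12/31/17"]
clean = [normalize_date(d) for d in sale_dates]
print(clean)
```

Anything that matches neither format raises an error instead of slipping through, which is the point: garbage gets stopped at the door.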

In AI and machine learning*, the link between the quality of input and the quality of output is magnified on a scale that’s hard to imagine. It takes a lot of data for AI to create insights superior to those made by humans, on topics like housing sales, cancer detection, or whether Oasis will ever really, truly get back together. And then it takes a lot of computing power to create weird and wonderful and impressive results, all directly informed by the quality of the data used in the process.

https://xkcd.com/1875/

While AI’s hunger for big data is getting smaller all the time, sampling bias (“garbage in”) continues to dog the AI community and propagate GIGO. A limited set of images led to bias in Google’s hugely popular Arts and Culture app, where the selfie “lookalike” feature returned results that weren’t so good for people of color. In another example, FaceApp (a selfie-driven mobile app) used facial image filters labeled “Hot” that lightened skin. The CEO blamed the racially insensitive feature on the AI model used, describing it as an “unfortunate side-effect of the underlying neural network caused by the training set bias.”

The impact of bias in these apps (and others) isn’t good, but it’s not fundamentally changing lives, right? This isn’t the case elsewhere. AI is used as a healthcare diagnostic aid, in judicial criminal sentencing, and as a toolkit in community policing. The number of emerging use cases for AI across civic and healthcare sectors is booming. Yet even here, in these platforms born of intense research and multimillion-dollar budgets, bias lingers. The results affect lives, families, and communities…and too often for the worse.

OK fine, you might be thinking. Let’s get to the point. Why should museums care about AI and GIGO?

Two reasons:

  1. Collections
  2. Constituents

Collections and constituent management — two operations at the heart of museum missions — are increasingly augmented by AI. Some food for thought:

  • Voice-enabled everything. Building informative, engaging websites and digitizing collections have been popular museum tech topics for years. Guess what! All of this data is the foundation for voice-assisted tech, like what’s offered through Amazon’s Alexa or Apple’s Siri (or your vacuum cleaner or TV). Voice-assisted tech is popping up everywhere and growing rapidly. If a museum’s online information is haphazardly structured or inaccurate, voice queries might bring poor results to the audiences museums want to reach. (GIGO!)
  • Art authentication. Traditional methods of verifying maker attribution are fragmented and costly. But forgeries that aren’t caught can lead to an organizational crisis. Through a combination of AI techniques, a team of researchers and art restoration experts recently created a model that could distinguish between forgeries and the real deal.
  • Identification of specimens in natural history collections. Specimen collection is a foundational practice for natural science research. It takes many beetles and butterflies for scientists to generate insights into everything from climate change to biopharmaceuticals to molecular evolution. (The sheer volume of specimens, along with time, limited resources, and methodologies, has created digitization and identification backlogs that hurt my soul just thinking about them.) Brighter days may be ahead with some help from AI. In 2017, a team of researchers at the Smithsonian used machine learning to dramatically speed up the process of identifying botanic specimens with over 90% accuracy.
  • Growing a visitor and donor base. Organizational constituent data is a rich source for AI-assisted insights, at a time when it’s needed most. Cultural orgs are grappling with the fact that attendance is not keeping up with population growth, along with questions about museum identity, engagement, and reinvention. Building new audiences and donors through marketing and prospect research is part of (though not the whole solution to) meeting this challenge into the future. AI can help identify new donor profiles and other prospect areas, cutting down the time and resources it would take to do this manually. Not surprisingly, AI-supported tech is the new hotness for marketers and institutional advancement strategists.

When it comes to GIGO and the perils of bad output, how can we mitigate current and future risk? Some more food for thought:

  • Structured content is AI’s trusty friend in voice-assisted queries. AI uses online content as fodder for voice-generated search (and SEO nirvana!). The more structured it is, the better AI will be at understanding your museum’s information in voice-assisted queries.
  • If you have a public collections API — especially one offering up images of people — check it for diversity and inclusion of under-represented populations. And if you contribute human or demographic data to a larger repository, think about screening your data for bias. It’s not like you had anything else going on this Saturday…
  • Consider AI-assisted tech (including tools you already use) and data use policies as part of museum strategic planning. Museums can play a key role by supporting transparency in data use and by examining our public data sets for bias. There are promising efforts underway to support fairness and standards of transparency in machine learning.
  • Do you use an AI-assisted software platform or partner with others who do? (Like this one, this one, or this one.) You can research the AI policies and algorithms they use to understand potential impact on organizational goals.
http://knowyourmeme.com/photos/1070702-im-not-a-robot
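
For the first two suggestions above, here is a rough sketch in Python of what the work might look like. The first function describes a museum using schema.org’s `Museum` type as JSON-LD, one common way of giving web content structure. The second tallies how often each value of a metadata field appears in a set of collection records. Everything specific here is hypothetical: the museum name, the URL, the `sitter_group` field, and the sample records are invented for illustration, and a real audit would need far more care in deciding what to measure.

```python
import json
from collections import Counter


def museum_jsonld(name, url, opening_hours):
    """Build a schema.org 'Museum' description as a JSON-LD string."""
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "Museum",
            "name": name,
            "url": url,
            "openingHours": opening_hours,
        },
        indent=2,
    )


def representation_shares(records, field):
    """Share of records per value of `field`; missing values count as 'unknown'."""
    counts = Counter(record.get(field) or "unknown" for record in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


# Hypothetical museum details.
print(museum_jsonld("Example Museum", "https://example.org", "Tu-Su 10:00-17:00"))

# Hypothetical portrait records, as a collections API might return them.
portraits = [
    {"id": 1, "sitter_group": "A"},
    {"id": 2, "sitter_group": "A"},
    {"id": 3, "sitter_group": "B"},
    {"id": 4},  # No metadata recorded, which is worth knowing too.
]
shares = representation_shares(portraits, "sitter_group")
print(shares)
```

A lopsided tally doesn’t prove bias on its own, but it’s a cheap first look at who is and isn’t in the data. The “unknown” share matters as well, since missing metadata can hide a skew.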

With growing awareness of embedded bias come ideas for improvement. The future doesn’t have to resemble Westworld or a dystopian Will Smith film festival. I’m hopeful that AI’s “black box” of algorithms will gain transparency and become more fairly designed over time. But until regulatory standards are widely adopted by those who build AI tech, education and vigilance are the responsibility of everyone.

A robust online presence and digitization of collections (mission-centric imperatives in museums for years) have benefits beyond expanding access to museum collections and research. They also provide foundational data for AI. As our culture evolves its digital literacy and channels for engagement multiply, our ability to engage and reach new audiences will be essential to mission fulfillment. More than ever, today’s strategic planning and data stewardship affect the future of museums in ways we’re just beginning to imagine.

In the meantime, let’s take the garbage out.

This was think piece #2 in the “On artificial intelligence, museums, and hot dogs” series. Thanks for reading! Thoughts? I’m at @CuriousThirst.

*The term “AI” is a little overplayed. …OK, a lot overplayed. “Machine learning” strikes me as a more accurate term to describe much of what I talk about here. Please forgive the shorthand.