Examples of cultural preservation efforts enabled by multimodal AI (from left to right): Lidar scans of ancient trees (source), high-fidelity archive of monuments (source), deciphering ancient scrolls from the Herculaneum Papyri as part of the Vesuvius Challenge (https://scrollprize.org/)

Multimodal AI: A Powerful Tool for Social Good

Stefania
3 min read · Apr 22, 2024

Summary of my presentation at the Innovation Exchange, April 19, 2024, MOHAI

My cup is full after attending and presenting at last week’s Innovation Exchange at the Museum of History & Industry (MOHAI)! The Pecha Kucha format’s rapid pace sparked lively discussions on the challenges and potential of AI.

I focused on multimodal AI for social good, sharing my journey of working on AI education since 2016 and highlighting specific novel applications in areas like education, climate tech, biomedicine, and cultural heritage preservation.

Excerpt from GeekWire article about the event https://www.geekwire.com/2024/10-thought-provoking-questions-about-the-promise-and-pitfalls-of-ai/

Multimodal AI, which combines text, image, and audio analysis, has the potential to drive significant social impact across many fields. During my talk, I highlighted some of the ways it is being used for good:

  • Multi-language support and preservation: Multimodal AI is being used to create speech recognition and translation tools for indigenous languages like Māori and Ainu, as seen in projects like Te Hiku Media and AI Pirinka. Meta’s Massively Multilingual Speech models further support linguistic diversity by identifying over 4,000 spoken languages and providing speech recognition for more than 1,100 of them. Additionally, multimodal AI facilitates content creation in low-resource languages, ensuring wider access to cultural heritage. However, ethical considerations are crucial, requiring close collaboration with communities to respect their agency and knowledge. Overcoming challenges like limited datasets for oral languages remains essential, and initiatives like masakhane.io, a grassroots NLP community for Africa, are working on this.
Examples of multimodal AI research projects focused on multilingual support: Meta’s Massively Multilingual Speech model and the Interleaved Spoken and Written Language Model (https://speechbot.github.io/spiritlm/)
  • Cultural Heritage Preservation: AI can also be used to analyze and interpret historical texts, images, and artifacts, providing new insights into the past. One example is the Vesuvius Challenge, a machine learning competition aimed at deciphering ancient scrolls buried by the eruption of Mount Vesuvius in 79 AD. In 2023, the challenge successfully decoded portions of the Herculaneum Papyri, awarding over $1 million in prizes. The 2024 challenge focuses on reading 90% of four scrolls, with new prizes totaling $500,000+. The ultimate goal is to resurrect the entire ancient library, revealing its hidden knowledge.
  • Medicine: Multimodal AI is revolutionizing medicine by integrating diverse data such as medical images, clinical notes, and genomics. It can improve diagnosis, as seen in models that predict hypertension with over 75% accuracy (source). It also enables personalized "omics" for precision health and aids clinical trial design. Remote patient monitoring and virtual health assistants further demonstrate its potential for transforming healthcare delivery. As research progresses, we can anticipate even more groundbreaking applications of multimodal AI in medicine.
Overview of opportunities in personalized medicine, digital clinical trials, remote monitoring and care, pandemic surveillance, digital twin technology, and virtual health assistants (source)

These are just a few examples of how multimodal AI can be used for social good. As AI technology continues to advance, we can expect to see even more innovative and impactful applications in the future.

Find more examples and references in my slides from the presentation: https://tinyurl.com/multimodalai

Stefania

Ph.D. Residency in AI / ML: Coding & Program Synthesis @Theteamatx dissertating @UW , alumn @mit @msft https://stefania11.github.io/