KaoKore: New Facial Dataset from Japanese Scrolls for ML

Synced
Mar 2 · 4 min read

Clash of Clans wasn’t a video game but rather a way of life in 10th century Japan. And of course the locals did not scroll through content options on their smartphones as we do now; they read actual scrolls. Japanese Emakimono (絵巻物) illustrated handscrolls and Ehon (絵本) picture books were popular storytelling media during the arts-and-culture focused Heian period. Scrolling through the text and images brings dynamic characters and vivid scenes to life in a calligraphy-captioned experience that is as close to cinematic as the tech of the time allowed.

With declining numbers of art historians who can understand traditional Japanese scrolls, preserving the media and messages is a challenge. Working on the premise that facial expressions offer especially rich information not only about the scroll’s content but also about how these artworks were created, a team of researchers from the ROIS-DS Center for Open Data in the Humanities (CODH), University of Cambridge, Google Brain and MILA has introduced a dataset of faces extracted from such pre-modern Japanese artwork.

The KaoKore dataset includes 5552 RGB image files drawn from the 2018 Collection of Facial Expressions dataset of cropped face images from Japanese artworks. For use in supervised learning scenarios two sets of labels have been applied to the faces: male and female for gender, and social status classes noble, warrior, incarnation, and commoner.

The researchers have ensured the KaoKore dataset will work under different machine learning setups. They standardized image size and aspect ratio to 256 x 256 pixels. The images are formatted like those in ImageNet, enabling KaoKore to serve as an alternative dataset under existing unsupervised learning setups.
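As a rough illustration of what an ImageNet-style layout makes possible, the sketch below indexes a folder tree of the form `root/<class>/<image>` into (path, label) pairs. The folder layout, the function name `index_image_folder`, and the use of the four status classes as folder names are assumptions made for this example; the actual KaoKore release may organize its files and labels differently.

```python
from pathlib import Path

# Status classes as listed in the article; using them as folder names is an assumption.
STATUS_CLASSES = ["noble", "warrior", "incarnation", "commoner"]
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_image_folder(root, classes=STATUS_CLASSES):
    """Return (path, label_index) pairs for a root/<class>/<image> layout.

    Hypothetical helper: label_index is the position of the class in `classes`.
    """
    root = Path(root)
    samples = []
    for label, name in enumerate(classes):
        class_dir = root / name
        if not class_dir.is_dir():
            continue  # tolerate classes that are missing from a partial copy
        for path in sorted(class_dir.iterdir()):
            # Keep only files whose extension looks like an image.
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples
```

A list built this way can be handed to whatever image-loading and batching code an existing setup already uses. Loaders that expect one subfolder per class, such as torchvision's `ImageFolder`, read the same kind of directory tree directly.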

The researchers applied various generative models to KaoKore, with results suggesting the dataset’s suitability for creative tasks. The state-of-the-art GAN model StyleGAN, for example, generated characters that aptly reflected the class variety in the dataset. Neural painting models, meanwhile, can create painting sequences from a single KaoKore image, offering insights into artistic technique and style by illustrating the painting steps.

Understanding an Emakimono handscroll also requires reading the cursive texts that tell the stories. These are presented in the kuzushiji writing style that was used in Japan from the 8th through 19th centuries. Today, however, only trained experts can read them, so the researchers have used machine learning to automatically recognize and transcribe kuzushiji into modern Japanese characters.

Early work bridging machine learning and Japanese kanji characters drew on the digitization of some 300,000 old Japanese books, a project that the National Institute of Japanese Literature (NIJL) and other institutes began in 2014. Bounding boxes were created for each character during the transcription of some of the books. The CODH researchers who curated the dataset suggested that creating a separate dataset for bounding boxes could help machine learning techniques improve automated transcription performance.

In 2018, CODH and researchers from the Royal Grammar School, National Institute of Japanese Literature, MILA, and Google Brain created the Kuzushiji-MNIST, Kuzushiji-49, and Kuzushiji-Kanji datasets. In 2019, a largely overlapping group of CODH and MILA researchers proposed KuroNet, a new end-to-end model for kuzushiji recognition.

Datasets and techniques tailored to machine learning can be expected to broaden research into, and preservation of, the pre-modern scrolls that are a major part of Japanese art history.

The paper KaoKore: A Pre-Modern Japanese Art Facial Expression Dataset is on arXiv. The KaoKore and Kuzushiji-MNIST datasets are available on GitHub.


Journalist: Fangyu Cai | Editor: Michael Sarazen


