Harnessing machine learning and algorithms to make genealogical research accessible to all

City directories digitization, image by the author

Our team at MyHeritage built a powerful genealogy platform and maintains a database of 12.5 billion historical records that lets people learn about their ancestors. Users can search this database to discover new information about their families and find photographs of their relatives. MyHeritage’s Record Matching technology automatically notifies users when historical records match information in their family trees, sparing them the need to search the archives themselves.

One of the biggest collections in this historical record database is a structured index of the data extracted from U.S. City Directories. …


Using AI and computer vision in genealogy research

A yearbook is a type of book published annually to record, highlight, and commemorate the past year of a school.

Our team at MyHeritage took on a complex project: extracting individual pictures, names, and ages from hundreds of thousands of yearbooks, structuring the data, and creating a searchable index that covers the majority of US schools from 1890 to 1979, more than 290 million individuals in total. In this article I’ll describe the problems we encountered during this project and how we solved them.

Maksym Chernopolsky

Backend Technical Lead at MyHeritage
