Our team at MyHeritage built a powerful genealogy platform and maintains a huge database numbering 12.5 billion historical records that allow people to learn about their ancestors. Users can search this database to discover new information about their families and find photographs featuring their relatives. MyHeritage’s Record Matching technology automatically notifies users when historical records match information in their family trees, saving them the need to actively search the archives.
One of the biggest collections in this historical record database is a structured index of the data extracted from U.S. City Directories. …
A yearbook is a type of book published annually to record, highlight, and commemorate the past year of a school.
Our team at MyHeritage took on a complex project: extracting individual pictures, names, and ages from hundreds of thousands of yearbooks, structuring the data, and creating a searchable index of more than 290 million individuals that covers the majority of U.S. schools from 1890 to 1979. In this article, I’ll describe the problems we encountered during this project and how we solved them.
Backend Technical Lead at MyHeritage