Visualizing my ancestry on the map

Yannick Brouwer

Last year I decided to do a little experiment and order a DNA test on 23andme.com. I’m actually quite skeptical about these services, but I wanted to try one out to better understand the implications of using them. After receiving a test tube, I filled it with saliva and sent it to a local lab.

After several weeks my ancestry results came in and, as could be expected, they were quite boring: 99.8 percent European and 0.2 percent broadly East Asian.

40 percent of my DNA was of “French & German” descent, a category that also includes the Netherlands, where I was born. More interestingly, 23andMe mentioned that they expected me to have direct ancestors from Great Britain and Denmark between 3 and 5 generations ago.

This made me think it would be pretty interesting to trace how my ancestors migrated through the ages, and so how I ended up on this planet. I was excited to start researching my ancestry and see if I could visualize that data.

MyHeritage

I compared different sites and decided to go with MyHeritage because it has good integrations with birth and marriage registers in the Netherlands. In addition, it’s widely used, which means that there are already a lot of existing family trees to build on.

Fun fact: the Mormon church is one of the initiators of digital ancestry research. It created the GEDCOM file format back in 1984, and its ‘FamilySearch’ database is used by a lot of commercial ancestry websites like MyHeritage.

The Church of Jesus Christ of Latter-day Saints is the primary benefactor for FamilySearch services. Our commitment to helping people connect with their ancestors is rooted in our beliefs — that families are meant to be central to our lives and that family relationships are intended to continue beyond this life.

For privacy reasons, MyHeritage doesn’t let you find people who are still alive. To successfully start digging, it’s important to begin by manually adding a generation that is no longer alive. In my case these were my great-grandparents.

Over a few evenings I was able to collect all my direct ancestors up to 7 generations ago (between 1750 and 1800). In a few branches I was able to find up to 16 generations. I found out that the 5th to 7th generations are the easiest to find because they are ancestors shared with a lot of other people, while still being well documented. Older generations are often harder to find because documentation is lacking or incomplete and last names were not really a thing back then.

This is what all the branches look like plotted over time, made by importing the GEDCOM file into https://learnforeverlearn.com/ancestors/

Plotting on the Map

Static

After finalizing my research I exported a CSV file with my direct ancestors from the MyHeritage ‘Family Tree Builder’ desktop software. I imported the file into Google Sheets and used the ‘GeoCode by Awesome Table’ add-on to convert place names to coordinates.

Processing is my favorite piece of software for quick visualizations. I had previously worked with the Unfolding Maps library by Till Nagel, so I used it to create a quick plot of all known locations on the map at the same time. I didn’t find any ancestors from Denmark or the UK as 23andMe predicted, but I did find ancestors from Belgium, Germany and Spain.

You can clearly see the divide between my mom’s side of the family from the north and my father’s side of the family from the west.

Dynamic

Now I wanted these little dots to pop up at the right moment in time and move across the map, each dot representing a direct ancestor moving through life. For many of them I would have three known locations and timestamps: birth, marriage and death.

Because many ancestors stayed in the same town for years, this is good enough for now. In the future, the accuracy of the data could be improved by using the location and time of birth of children as additional data points for their parents.
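That future improvement could be sketched roughly as follows in Python. The tuple layout, the dictionary keys and the sample coordinates are all hypothetical and only illustrate merging a parent's known events with the births of their children. It also assumes the parent was actually present at each child's birth, which will not always hold.

```python
def with_child_births(parent_events, children):
    """Merge a parent's events with their children's births, sorted by year.

    parent_events: list of (year, lat, lon, label) tuples (hypothetical layout).
    children: list of dicts with hypothetical keys birth_year, birth_lat, birth_lon.
    """
    extra = [
        (c["birth_year"], c["birth_lat"], c["birth_lon"], "child born")
        for c in children
    ]
    # Sort on the year only, so labels never influence the ordering.
    return sorted(parent_events + extra, key=lambda event: event[0])


# Made-up sample data, not taken from the real family tree.
parent = [
    (1800, 52.37, 4.90, "birth"),
    (1825, 52.09, 5.12, "marriage"),
    (1870, 52.09, 5.12, "death"),
]
kids = [{"birth_year": 1830, "birth_lat": 51.92, "birth_lon": 4.48}]

timeline = with_child_births(parent, kids)
```

Each extra event would then simply become one more stop between marriage and death.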

Rather than reinventing the wheel, I started from Will Geary’s excellent TransitFlow visualization. It uses Python to retrieve and wrangle transit schedule data from Transitland and a Processing sketch to visualize it on a map.

After generating a Processing sketch with his Python script, I started to adjust the sketch to make it work with my data. TransitFlow was meant to visualize 24 hours of data, while I was interested in several centuries. Rather than letting the dots follow a complex curved route, Will’s data is chopped into little pieces, each with a start point and an end point. These little pieces are all straight lines between point A and B.
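To make the idea of straight-line pieces concrete, here is a minimal Python sketch of how a dot's position on one piece could be computed for a given year. This is an illustration only, not code from TransitFlow or from my sketch: the field names and the sample segment are hypothetical, and it interpolates latitude and longitude linearly, which is a simplification that ignores the map projection.

```python
def position_at(segment, year):
    """Return (lat, lon) of a dot on a straight segment, or None outside its time span."""
    start = segment["start_year"]
    duration = segment["duration"]
    if year < start or year > start + duration:
        return None
    # A zero-length segment has no movement, so the dot stays at the start point.
    progress = 0.0 if duration == 0 else (year - start) / duration
    lat = segment["start_lat"] + progress * (segment["end_lat"] - segment["start_lat"])
    lon = segment["start_lon"] + progress * (segment["end_lon"] - segment["start_lon"])
    return (lat, lon)


# Made-up segment: 50 years between two arbitrary points.
example = {
    "start_year": 1800, "duration": 50,
    "start_lat": 52.0, "start_lon": 4.0,
    "end_lat": 40.0, "end_lon": -74.0,
}

halfway = position_at(example, 1825)   # (46.0, -35.0)
too_early = position_at(example, 1799)  # None
```

Stretching the time axis from 24 hours to several centuries then only changes the unit of `year` and `duration`, not the interpolation itself.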

This is an example of an ancestor that moved to the USA after his marriage.

I decided to reformat my CSV file in a similar way, where the first line would start with birth and end with marriage, and the second would start with marriage and end with death. For the Processing sketch to work, I also needed the duration of each section, which I could easily calculate in Google Sheets by subtracting the start time from the end time.
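I did this reshaping in Google Sheets, but the same step could be sketched in Python roughly like this. The column names and the sample row are made up for illustration, and a real MyHeritage export will look different.

```python
import csv
import io

# Hypothetical layout: one row per ancestor with year and coordinates per event.
SAMPLE = """name,birth_year,birth_lat,birth_lon,marriage_year,marriage_lat,marriage_lon,death_year,death_lat,death_lon
Example Ancestor,1800,52.37,4.90,1825,52.09,5.12,1870,40.71,-74.01
"""


def to_segments(row):
    """Split one ancestor into straight segments between consecutive known events."""
    events = []
    for event in ("birth", "marriage", "death"):
        year = row.get(f"{event}_year")
        lat = row.get(f"{event}_lat")
        lon = row.get(f"{event}_lon")
        if year and lat and lon:  # skip events with missing data
            events.append((int(year), float(lat), float(lon)))

    segments = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(events, events[1:]):
        segments.append({
            "name": row["name"],
            "start_year": t0, "start_lat": lat0, "start_lon": lon0,
            "end_year": t1, "end_lat": lat1, "end_lon": lon1,
            "duration": t1 - t0,  # end time minus start time
        })
    return segments


def load_segments(text):
    """Read the CSV text and return the segments of all ancestors in one list."""
    reader = csv.DictReader(io.StringIO(text))
    return [segment for row in reader for segment in to_segments(row)]


segments = load_segments(SAMPLE)
```

With this layout an ancestor without a known marriage falls back to a single birth-to-death segment, and one with only a single known event produces no segment at all.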

After adjusting the dataset and tweaking the code I made this first visualization:

This example is pretty basic and lacks a year indicator, but I liked how the dots move over the screen like little ants.

Interactive

After these initial results I wanted to be able to play around with different parts of the animation more easily. I therefore decided to create a timeline that would allow me to easily pause, zoom in or scrub through the animation.

I noticed that a single static viewpoint doesn’t do justice to the different patterns in the animation. For example, one of my early ancestors moved from Spain to the Netherlands, which is quite a large movement. Later on, the most interesting movements often happened within the same country or even the same province. I therefore created a way to set keyframes that change the zoom level and framing at specific moments on the timeline.
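The keyframe idea can be illustrated with a small Python sketch that blends the map center and zoom level between the two keyframes surrounding the current year. The keyframe values below are invented, and plain linear blending is an assumption made for the example rather than a description of the actual Processing code.

```python
from bisect import bisect_right

# Hypothetical keyframes as (year, center_lat, center_lon, zoom).
# They must be sorted by year, and every year must be unique.
KEYFRAMES = [
    (1600, 46.0, 2.0, 5.0),
    (1700, 52.0, 5.0, 7.0),
    (1900, 52.0, 5.0, 9.0),
]


def camera_at(year, keyframes=KEYFRAMES):
    """Return (center_lat, center_lon, zoom) for a year by blending keyframes."""
    years = [frame[0] for frame in keyframes]
    # Before the first or after the last keyframe the camera stays put.
    if year <= years[0]:
        return keyframes[0][1:]
    if year >= years[-1]:
        return keyframes[-1][1:]
    i = bisect_right(years, year)  # index of the first keyframe after `year`
    prev, nxt = keyframes[i - 1], keyframes[i]
    t = (year - prev[0]) / (nxt[0] - prev[0])
    return tuple(a + t * (b - a) for a, b in zip(prev[1:], nxt[1:]))


wide_to_close = camera_at(1650)  # (49.0, 3.5, 6.0)
zooming_in = camera_at(1800)     # (52.0, 5.0, 8.0)
```

Scrubbing through the timeline then comes down to calling a function like `camera_at` with whatever year the cursor points at.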

This is the final animation that I made:

Download

You can find the code for this project on my GitHub. Let me know if you did something interesting with it by sending me an email or commenting below.