Data Analysis for Ultramarathon Training

Peter Oliver Caya
Nov 6, 2017
Dehydrated and tired. Not my best picture, but I’m looking forward to doing the 50K version of this race in March!

Ultramarathon training is an area where I could find only a limited amount of solid information on what works and what doesn’t.

About two weeks ago I decided that in March I would run my first ultramarathon. A few things motivated this. First, I’ve been doing some light marathon running for about two years now, and a 50K (about 31.1 miles) seems like a nice step up. In part, it’s also an act of contrition: I missed the Rochester marathon this year because of a move, and wanted to pick up something to make up for it.

About the same time that it occurred to me to run this 50K, it also occurred to me to do a bit extra: curate and analyze a dataset about my training.

Why Bother?

The number one heuristic for amateur runners is to just run more miles. However, it’s easy to go out of control with this simple strategy. I’ve fallen into the habit of over-training in the past, which caused the usual collection of symptoms: sleep disruptions, overuse injuries, and fatigue. Keeping better track of my methods and progress will also give me a better idea of what actually improves my performance versus what just sounds like it would be effective, which often seems to be the strategy I see people use.

What Am I Doing Differently?

Keeping track of training is nothing new: runners are always told to keep running journals, and when I was into lifting we were always told how important it was to track our progress. I plan on doing a few things differently.

First, I’m going to attempt to keep more expansive and accurate records for what I am doing. This includes the normal information on training frequency, speed, and distance, as well as metrics related to body composition, heart rate during runs, and resting heart rate.
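As a rough illustration of what one row of such a record could look like, here is a minimal Python sketch (Python rather than the Excel and R mentioned later, purely for illustration). The `RunEntry` class, its field names, and the sample values are all hypothetical, not the actual log format; the point is only that each run becomes one row in a CSV file that Excel or R can open.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date
from typing import Optional


@dataclass
class RunEntry:
    """One row of a hypothetical training log; all field names are illustrative."""
    day: date
    distance_miles: float
    duration_min: float
    avg_heart_rate_bpm: Optional[int] = None      # None until a monitor is available
    resting_heart_rate_bpm: Optional[int] = None
    body_weight_lb: Optional[float] = None

    @property
    def pace_min_per_mile(self) -> float:
        # Derived value, so it is computed rather than stored in the CSV.
        return self.duration_min / self.distance_miles


def write_log(entries, handle):
    """Write entries as CSV; missing measurements become empty cells."""
    names = [f.name for f in fields(RunEntry)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# Made-up example values, not real training data.
sample = [
    RunEntry(date(2017, 11, 6), 6.0, 54.0, resting_heart_rate_bpm=52),
    RunEntry(date(2017, 11, 8), 10.0, 95.0),
]
buffer = io.StringIO()
write_log(sample, buffer)
print(buffer.getvalue())
```

Leaving the heart-rate and body-composition fields optional means the log can be started now and filled in more completely once the measuring equipment is sorted out.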

Second, I plan to conduct statistical analysis of my training in order to come to a better conclusion on what does and does not work. In the near term, this will be nothing more than inferential and summary statistics. In the longer term, I would like to try more sophisticated methods like regression analysis and PCA to discern relationships between things like average speed on a distance run and caloric intake.
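To make that concrete, here is a small Python sketch of the two simplest steps: summary statistics for one metric, and a one-predictor least-squares fit of average speed against caloric intake. The numbers are invented for illustration and the helper names `summarize` and `fit_line` are hypothetical; the real analysis is planned for R and may look quite different.

```python
from statistics import mean, stdev

# Made-up illustrative numbers, not real training data:
# daily caloric intake (kcal) and average speed (mph) on a distance run.
calories = [2200, 2400, 2600, 2800, 3000]
speed_mph = [6.0, 6.2, 6.3, 6.6, 6.7]


def summarize(values):
    """Return basic summary statistics for one logged metric."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),   # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(calories, speed_mph)
print(summarize(speed_mph))
print(f"speed ~ {intercept:.2f} + {slope:.5f} * calories")
```

A fitted slope like this only describes an association in the logged data; with one runner and a handful of runs it is a starting point for questions, not proof that eating more makes anyone faster.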

Data Collection

I have a few different tools to help me work on my goals:

Garmin Forerunner

This has actually been a tool I’ve used for a few years now. Before that, I just plotted my route on Google Maps, but this gadget made my life so much easier.

Heart-Rate Monitor

This is still a work in progress. I’ve been shopping for devices that record my heart rate during runs and give me direct access to the data so that I can save it as a CSV file. Unfortunately, some of the choices are around $300, so depending on how things work out, I may end up making this a DIY project.

Software Accoutrements:

Data collection and basic summary statistics will be in Excel. Any heavy-duty analysis will be performed in R.

Long-Term Goals:

The Lt. JC Stone 50K Road Race is in about sixteen weeks, so I am following this 50K training schedule. I’m actually starting around the third week of the training plan, but I plan on drawing it out a little bit to add some extra recovery time. After this race, I plan on running the Pittsburgh Marathon in May, and then picking out a trail run in June.

Keeping records on my training during the next 7–8 months will undoubtedly yield some very useful information for the future and I’m looking forward to this new challenge!
