Apple Health with Python

Find your most active day on Apple Health

Oliver Lövström
Internet of Technology
6 min read · Jan 28, 2024

In this tutorial, we will use Apple Health data to find (a) the most active day, (b) the longest activity streak, (c) trends in step counts, (d) the distribution of daily steps, and (e) how your steps compare with recommended health guidelines.

Photo by Louis Hansel on Unsplash

Export Apple Health Data

Before beginning the analysis, let’s export the Apple Health data:

TLDR:

  1. Access Your Profile: In the Health app on your iPhone, tap your picture or initials at the top right. If you can’t see your profile icon, go to Summary or Browse at the bottom, then scroll up.
  2. Export Data: Select the Export All Health Data option followed by Export.
  3. Transfer and Extract: Once exported, transfer the file to your computer. Unzip it to find export.xml and export_cda.xml.
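If you prefer to unzip the archive from Python, here is a minimal sketch using the standard library. The function name `extract_health_export` and the folder name in the usage note are hypothetical, not part of the original workflow.

```python
import zipfile
from pathlib import Path


def extract_health_export(zip_path, dest_dir):
    """Extract the archive and return the paths of the XML files inside it."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively, because the XML files may sit in a subfolder.
    return sorted(dest.rglob("*.xml"))
```

Calling `extract_health_export("export.zip", "health_data")` (both names are assumptions, adjust them to your own files) should return the paths to export.xml and export_cda.xml.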

Python Imports

Import the Python tools:

import xml.etree.ElementTree as ET
import matplotlib.pyplot as plt
from datetime import datetime
from collections import defaultdict
import numpy as np
import scipy.stats as stats

(Optional): Getting the Data Types

The following script lists the data types available in your export:

# Load and parse Apple Health export.xml file.
xml_path = "path/to/your/apple_health_export/export.xml"
tree = ET.parse(xml_path)
root = tree.getroot()

# Create a set to store unique @type values.
types_set = set()

# Iterate through the XML elements and extract @type attribute.
for record in root.findall('.//Record'):
    type_attribute = record.get('type')
    if type_attribute:
        types_set.add(type_attribute)

# Print all unique @type values.
for type_value in types_set:
    print(type_value)

In this article, we will look at one of these types: HKQuantityTypeIdentifierStepCount.
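Export files can be large, so it may help to stream through the XML rather than load it all at once, and to see how many records each type has. The sketch below does this with `ET.iterparse` and a `Counter`. The function name `count_record_types` and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_record_types(xml_source):
    """Count Record elements per type while streaming through the XML."""
    counts = Counter()
    for _, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "Record" and elem.get("type"):
            counts[elem.get("type")] += 1
            # Drop the element's contents once it has been counted.
            elem.clear()
    return counts


# Tiny made-up sample; a real export.xml has many more records and attributes.
sample = io.BytesIO(
    b"<HealthData>"
    b'<Record type="HKQuantityTypeIdentifierStepCount" value="12"/>'
    b'<Record type="HKQuantityTypeIdentifierStepCount" value="30"/>'
    b'<Record type="HKQuantityTypeIdentifierHeartRate" value="61"/>'
    b"</HealthData>"
)
for record_type, count in count_record_types(sample).most_common():
    print(record_type, count)
```

The function also accepts a file path, so you can pass the same `xml_path` used in the script above.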

Extracting Step Count Records

Next, extract the step count records from the Apple Health data:

# Load and parse Apple Health export.xml file.
xml_path = "path/to/your/apple_health_export/export.xml"
tree = ET.parse(xml_path)
root = tree.getroot()

# Initialize a dictionary to hold daily steps.
daily_steps = defaultdict(int)

# Parse step count records.
records = root.findall(".//Record[@type='HKQuantityTypeIdentifierStepCount']")
for record in records:
    start_date = record.get("startDate")
    value = int(record.get("value"))
    date_obj = datetime.strptime(start_date, "%Y-%m-%d %H:%M:%S %z").date()
    daily_steps[date_obj] += value
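One caveat: if more than one device records steps (for example, an iPhone and an Apple Watch), the export can contain overlapping records from each, so summing every record may count some steps twice. A rough way to check is to total the steps per source using the `sourceName` attribute. This is a sketch only. The function name `daily_steps_by_source` and the sample records are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime


def daily_steps_by_source(records):
    """Return {source name: {date: steps}} for an iterable of Record elements."""
    totals = defaultdict(lambda: defaultdict(int))
    for record in records:
        source = record.get("sourceName", "unknown")
        day = datetime.strptime(
            record.get("startDate"), "%Y-%m-%d %H:%M:%S %z"
        ).date()
        totals[source][day] += int(record.get("value"))
    return totals


# Made-up sample with two sources recording on the same day.
sample_root = ET.fromstring(
    "<HealthData>"
    '<Record type="HKQuantityTypeIdentifierStepCount" sourceName="Phone" '
    'startDate="2024-01-01 08:00:00 +0100" value="100"/>'
    '<Record type="HKQuantityTypeIdentifierStepCount" sourceName="Watch" '
    'startDate="2024-01-01 08:00:00 +0100" value="120"/>'
    "</HealthData>"
)
sample_records = sample_root.findall(
    ".//Record[@type='HKQuantityTypeIdentifierStepCount']"
)
for source, days in daily_steps_by_source(sample_records).items():
    for day, count in days.items():
        print(source, day, count)
```

If the per-source totals for the same day are similar, you may want to keep only one source before running the rest of the analysis.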

Analysis

In this section, we will analyze the step count data.

Most Steps in a Day

Are you curious about your most active day? Here’s how to find out:

# Find the date whose step total is the largest.
max_daily_steps_date = max(daily_steps, key=daily_steps.get)
max_daily_steps = daily_steps[max_daily_steps_date]

print(f"The day with the most steps is {max_daily_steps_date.strftime('%Y-%m-%d')} with {max_daily_steps} steps.")

Running this code will reveal which day you took the most steps:

The day with the most steps is 2019-05-30 with 33387 steps.
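If you want more than the single best day, a small variation returns the top few days. This is a sketch using `heapq.nlargest`. The helper name `top_days` and the sample numbers are made up.

```python
import heapq
from datetime import date


def top_days(daily_steps, n=3):
    """Return the n (date, steps) pairs with the highest step counts."""
    return heapq.nlargest(n, daily_steps.items(), key=lambda item: item[1])


# Made-up sample data for illustration.
sample_steps = {
    date(2024, 1, 1): 4200,
    date(2024, 1, 2): 11050,
    date(2024, 1, 3): 7600,
    date(2024, 1, 4): 9800,
}
for day, count in top_days(sample_steps, n=2):
    print(day, count)
```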

Activity Streak

Have you ever wondered about your longest streak of active days? Let’s calculate it:

sorted_dates = sorted(daily_steps.keys())

# Define the target number of steps to consider a day active.
steps = 3000

longest_streak = 0
current_streak = 0
longest_streak_start_date = None
current_streak_start_date = None

for idx, date in enumerate(sorted_dates):
    # Check if the steps on the current date meet or exceed the target.
    if daily_steps[date] >= steps:
        # Increment the current streak.
        current_streak += 1
        # Set the start date of the current streak.
        if current_streak_start_date is None:
            current_streak_start_date = date
    else:
        # If the current day does not meet the step target, check whether
        # the current streak is the longest one and update the longest streak.
        if current_streak > longest_streak:
            longest_streak = current_streak
            longest_streak_start_date = current_streak_start_date
        # Reset the current streak variables for the next possible streak.
        current_streak = 0
        current_streak_start_date = None

    # Check if the last streak was the longest.
    if idx == len(sorted_dates) - 1 and current_streak > longest_streak:
        longest_streak = current_streak
        longest_streak_start_date = current_streak_start_date

print(f"The longest streak of at least {steps} steps is {longest_streak} days, starting on {longest_streak_start_date}.")

When you run this code, you’ll get information about your longest streak of active days:

The longest streak of at least 3000 steps is 30 days, starting on 2019-07-04.
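Note that the loop above walks through the dates that have records, so a day with no data at all does not break a streak. If you would rather count only consecutive calendar days, one possible variant is sketched below. The function name `longest_calendar_streak` and the sample data are made up.

```python
from datetime import date, timedelta


def longest_calendar_streak(daily_steps, target):
    """Return (length, start_date) of the longest run of consecutive
    calendar days with at least `target` steps."""
    longest, longest_start = 0, None
    current, current_start = 0, None
    previous = None
    for day in sorted(daily_steps):
        is_active = daily_steps[day] >= target
        is_next_day = previous is not None and day - previous == timedelta(days=1)
        if is_active and current > 0 and is_next_day:
            # The streak continues on the following calendar day.
            current += 1
        elif is_active:
            # A new streak starts (first active day, or after a gap).
            current, current_start = 1, day
        else:
            current, current_start = 0, None
        if current > longest:
            longest, longest_start = current, current_start
        previous = day
    return longest, longest_start


# Made-up sample: January 3 has no data, which splits the streak.
sample_steps = {
    date(2024, 1, 1): 5000,
    date(2024, 1, 2): 4000,
    date(2024, 1, 4): 6000,
    date(2024, 1, 5): 3500,
    date(2024, 1, 6): 3000,
    date(2024, 1, 7): 100,
}
print(longest_calendar_streak(sample_steps, 3000))
```

With this sample, the streak found is the three days starting on January 4, whereas the loop in the article would treat January 1 through January 6 as one five-day streak.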

Daily Step Trends

Have you ever wondered which day of the week you’re most active? Let’s find out with a simple visualization:

weekday_steps = defaultdict(list)

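# Group the daily totals by weekday (Monday is 0, Sunday is 6).
# The loop variable is named step_count so it does not overwrite the
# steps target defined for the streak calculation.
for date, step_count in daily_steps.items():
    weekday = date.weekday()
    weekday_steps[weekday].append(step_count)

average_steps_per_weekday = {
    weekday: np.mean(counts) for weekday, counts in weekday_steps.items()
}

weekdays = [
    "Monday",
    "Tuesday",
    "Wednesday",
    "Thursday",
    "Friday",
    "Saturday",
    "Sunday",
]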
colors = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]

average_steps = [average_steps_per_weekday[i] for i in range(7)]

plt.figure(figsize=(10, 6))
plt.bar(weekdays, average_steps, color=colors)
plt.xlabel("Day of the Week")
plt.ylabel("Average Steps")
plt.title("Average Steps per Weekday")
plt.tight_layout()
# (Optional): Add recommended guideline.
plt.show()

Result:

Image by Author
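The list comprehension over `range(7)` assumes every weekday appears at least once in the data. For short exports that may not hold. The sketch below computes the same averages with the standard library and falls back to 0 for missing weekdays. The fallback value and the function name `average_steps_by_weekday` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def average_steps_by_weekday(daily_steps):
    """Return {weekday name: average steps}, using 0 for weekdays without data."""
    by_weekday = defaultdict(list)
    for day, count in daily_steps.items():
        by_weekday[day.weekday()].append(count)
    return {
        WEEKDAY_NAMES[i]: mean(by_weekday[i]) if by_weekday[i] else 0
        for i in range(7)
    }


# Made-up sample: two Mondays and one Tuesday.
sample_steps = {
    date(2024, 1, 1): 1000,
    date(2024, 1, 8): 3000,
    date(2024, 1, 2): 500,
}
for name, average in average_steps_by_weekday(sample_steps).items():
    print(name, average)
```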

Step Count Distribution

daily_step_totals = list(daily_steps.values())

mean_daily_steps = np.mean(daily_step_totals)
std_dev_daily_steps = np.std(daily_step_totals)

min_daily_steps = min(daily_step_totals)
max_daily_steps = max(daily_step_totals)

x = np.linspace(min_daily_steps, max_daily_steps, 100)
y = stats.norm.pdf(x, mean_daily_steps, std_dev_daily_steps)

plt.figure(figsize=(10, 6))
plt.plot(x, y, "b-", linewidth=2)
plt.title("Daily Step Count Distribution")
plt.xlabel("Daily Step Count")
plt.ylabel("Probability Density")
plt.grid(True)
plt.xlim(min_daily_steps, max_daily_steps)
# (Optional): Add recommended guideline.
plt.show()

Running this code plots a normal distribution curve fitted to the mean and standard deviation of your daily step counts.
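Because the curve is a normal distribution built from the mean and standard deviation, it can hide skew in the actual data. Comparing the mean with the median is a quick check. The sketch below uses the standard library, where `pstdev` is the population standard deviation (the same quantity `np.std` computes by default). The function name `summarize_steps` and the sample values are made up.

```python
from statistics import mean, median, pstdev


def summarize_steps(values):
    """Return basic summary statistics for a collection of daily step totals."""
    ordered = sorted(values)
    return {
        "mean": mean(ordered),
        "std_dev": pstdev(ordered),  # Population standard deviation.
        "median": median(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }


# Made-up sample with one unusually active day.
sample_totals = [2500, 3100, 3300, 3600, 4000, 4200, 15000]
for name, value in summarize_steps(sample_totals).items():
    print(name, value)
```

A mean well above the median suggests that a few very active days are pulling the curve to the right.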

Comparison to Recommended Guidelines

How do your step counts compare to the recommended guidelines? You can add these comparisons to our previous plots:

For average steps per weekday:

# Average Steps per Weekday.
# ...

# Define the guideline.
guideline = 3000

# Add a guideline to the plot.
plt.axhline(
    y=guideline,
    color="black",
    linestyle="--",
    label=f"Guideline: {guideline} steps",
)
# Show the legend so the guideline label is visible.
plt.legend()
plt.show()

For daily step count distribution:

# Daily Step Count Distribution.
# ...

# Define the guideline.
guideline = 3000

# Add a guideline to the plot.
plt.axvline(
    x=guideline,
    color="black",
    linestyle="--",
    label=f"Guideline: {guideline} steps",
)
# Show the legend so the guideline label is visible.
plt.legend()
plt.show()

Running the updated code draws the guideline as a dashed black line on each plot.
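Besides drawing the line, you can put a number on the comparison. The sketch below computes the share of recorded days that reach the guideline. The function name `share_of_days_meeting` and the sample values are made up.

```python
def share_of_days_meeting(daily_steps, guideline):
    """Return the fraction of recorded days with at least `guideline` steps."""
    if not daily_steps:
        return 0.0
    met = sum(1 for count in daily_steps.values() if count >= guideline)
    return met / len(daily_steps)


# Made-up sample keyed by date strings for brevity.
sample_steps = {
    "2024-01-01": 2000,
    "2024-01-02": 3000,
    "2024-01-03": 4500,
    "2024-01-04": 100,
}
share = share_of_days_meeting(sample_steps, 3000)
print(f"{share:.0%} of days met the guideline.")
```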

Analyzing Your Health Data

I've made it simple to explore your health data. You can download the code from this GitHub repository.

To see the available options, run the script with --help:

$ python health/step_count.py --help

usage: step_count.py [-h] [-m METRICS [METRICS ...]] [-s STEPS] [-g GUIDELINE] file_path

Analyze health data for step count metrics.

positional arguments:
  file_path             The file path to the export.xml file

options:
  -h, --help            show this help message and exit
  -m METRICS [METRICS ...], --metrics METRICS [METRICS ...]
                        Specify one or more health metrics: cumulative, weekday, distribution, streak
  -s STEPS, --steps STEPS
                        Specify the minimum number of steps for calculating the longest streak
  -g GUIDELINE, --guideline GUIDELINE
                        Specify the guideline for the number of steps per day
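For readers curious how such an interface can be put together, here is a sketch of an `argparse` parser that accepts the same arguments. It is reconstructed from the help text above and is not the repository's actual code. The `int` types, the `choices` restriction, and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options shown in the help text."""
    parser = argparse.ArgumentParser(
        prog="step_count.py",
        description="Analyze health data for step count metrics.",
    )
    parser.add_argument("file_path", help="The file path to the export.xml file")
    parser.add_argument(
        "-m",
        "--metrics",
        nargs="+",
        metavar="METRICS",
        # Assumption: only the four metrics named in the help text are valid.
        choices=["cumulative", "weekday", "distribution", "streak"],
        help="Specify one or more health metrics",
    )
    parser.add_argument(
        "-s",
        "--steps",
        type=int,  # Assumption: step thresholds are whole numbers.
        help="Specify the minimum number of steps for the longest streak",
    )
    parser.add_argument(
        "-g",
        "--guideline",
        type=int,  # Assumption: the guideline is a whole number of steps.
        help="Specify the guideline for the number of steps per day",
    )
    return parser


args = build_parser().parse_args(["export.xml", "-m", "weekday", "streak", "-s", "5000"])
print(args.file_path, args.metrics, args.steps, args.guideline)
```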

This script allows you to run all metrics on your Apple Health data, including:

  • Identifying your most active day
  • Calculating your longest activity streaks
  • Exploring trends in your step counts
  • Plotting the distribution of daily steps
  • Visualizing your cumulative step count
  • Comparing your data to the recommended guidelines
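The cumulative step count is the one metric not covered by the snippets in this article. One way to compute it is a running total over the sorted dates, sketched below with `itertools.accumulate`. This is not taken from the repository. The function name `cumulative_steps` and the sample data are made up.

```python
from datetime import date
from itertools import accumulate


def cumulative_steps(daily_steps):
    """Return a list of (date, running total) pairs in chronological order."""
    days = sorted(daily_steps)
    totals = accumulate(daily_steps[day] for day in days)
    return list(zip(days, totals))


# Made-up sample, deliberately out of order.
sample_steps = {
    date(2024, 1, 2): 3000,
    date(2024, 1, 1): 1000,
    date(2024, 1, 3): 500,
}
for day, total in cumulative_steps(sample_steps):
    print(day, total)
```

The resulting pairs can be passed to `plt.plot` as separate lists of dates and totals to draw the cumulative curve.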

Conclusion

I hope this article has provided you with insights into your health habits. If there is other health data you’re curious about, or if you have any questions, please feel free to reach out or leave a comment.
