Introduction
The amount of data an individual or company generates every day is growing rapidly, and with it the complexity of analysing that data. This article demonstrates one of many use cases of Splunk as a data analysis tool. I will explain how to export health data from Apple’s Health App and then analyse and visualise it in Splunk.
What is Splunk?
Splunk is a software platform used to search, analyse and visualise the machine-generated data gathered from websites, applications, sensors and devices.
What is the Apple Health App?
Health App is the iOS health informatics mobile app used to track the user’s health and physical metrics such as steps taken, body weight, heart rate and mood. The app works by collecting data from smart wearables, phones and other third-party data sources authorised by the user. For the purposes of this article I will be using my iPhone, which collects data from itself and my Apple Watch.
Although the Health App already displays some visualisations of the data collected, they are fairly generic and may not answer specific questions such as:
- How many steps in total did I take?
- What is my most active period during the week?
- What is my overall trend?
- What was my most active day?
- When was my first step with my iPhone, and possibly where?
These are all interesting questions which I hope to answer using my data and Splunk, so let’s get started.
Overview of process
- Export Apple Health Data
- Convert data to CSV
- Input data into Splunk
- Analyse data
- Visualise data
Exporting Health Data
To export the Health data, open the Health App on your phone:
- Select your profile in the top right corner
- Scroll down to Export All Health Data and tap “Export” when prompted
- Wait for exporting to complete
- Transfer the data to your laptop (email, AirDrop, iCloud Drive etc.)
- You should see a file called export.zip
- Unzip this and locate the export.xml file
- Export complete. We will now convert this file to CSV
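If you prefer to script the unzip step, the sketch below locates export.xml inside export.zip with Python's standard zipfile module. The function name extract_export_xml is hypothetical, and the sketch assumes export.xml may sit inside a subfolder of the archive.

```python
import zipfile
from pathlib import Path


def extract_export_xml(zip_path, dest_dir="."):
    """Extract the first archive member named export.xml and return its path."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Assumption: the file may be nested in a subfolder of the archive.
            if Path(name).name == "export.xml":
                return Path(archive.extract(name, path=dest_dir))
    raise FileNotFoundError(f"export.xml not found in {zip_path}")
```

Calling extract_export_xml("export.zip") from the folder that holds the archive should return the path of the extracted file.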
Converting .XML file to .CSV
To convert this file we will use a Python script called applehealthdata.py, which can be found on GitHub and was created by Test-Driven Data Analysis.
- Copy the code for applehealthdata.py
- Paste it into any code editor and save it as applehealthdata.py in the same location as the .XML file
- In the terminal, cd to the directory containing both files
- Run the following command:
python applehealthdata.py export.xml
You should see the following output:
Reading data from export.xml ... done
The script has now completed the conversion, and you should see many .CSV files in the same directory. You can have a look at any of these, but we will be focusing on StepCount.csv.
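To give a feel for what the conversion involves, here is a minimal sketch of the idea. It is not the applehealthdata.py implementation. It assumes export.xml stores each measurement as a Record element with type, sourceName, unit, startDate, endDate and value attributes, and that step counts use the type HKQuantityTypeIdentifierStepCount. The function name is hypothetical.

```python
import csv
import xml.etree.ElementTree as ET

# Assumed record type and attribute names in export.xml.
STEP_TYPE = "HKQuantityTypeIdentifierStepCount"
FIELDS = ["sourceName", "unit", "startDate", "endDate", "value"]


def step_records_to_csv(xml_path, csv_path):
    """Write one CSV row per step-count Record; return the number of rows."""
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        # iterparse streams the file instead of loading it all at once.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            if elem.tag == "Record" and elem.get("type") == STEP_TYPE:
                writer.writerow({f: elem.get(f, "") for f in FIELDS})
                rows += 1
            elem.clear()  # release memory held by processed elements
    return rows
```

Streaming matters here because a multi-year export can be large; the real script also writes one file per record type rather than step counts alone.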
Input Data Into Splunk
We can now input our converted data into Splunk for analysis. I will not be going over the installation of Splunk in this article; Splunk’s own installation documentation covers it.
New Index: Select Settings > Indexes > New Index
- Index name (apple)
- Save
New App: Select Apps > Manage Apps > Create App
- Name (Apple Health)
- Folder Name (apple_health)
- Save
Input data
Splunk is time-based and needs a timestamp for each event when it is indexed. “endDate” is one of the column headers of StepCount.csv. We need to set up a new sourcetype “csv:apple:health” with “endDate” as the TIMESTAMP_FIELDS value.
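Because indexing depends on endDate, it can be worth checking that every row carries a parseable timestamp before uploading. The sketch below is one way to do that. It assumes timestamps look like "2020-01-31 08:15:00 +0000"; adjust TIME_FORMAT if your file differs. The function name is hypothetical.

```python
import csv
from datetime import datetime

# Assumption: timestamps look like "2020-01-31 08:15:00 +0000".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def unparseable_end_dates(csv_path, field="endDate"):
    """Return (line_number, raw_value) pairs whose timestamp fails to parse."""
    bad = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # Data rows start on line 2 because line 1 is the header.
        for line_no, row in enumerate(csv.DictReader(handle), start=2):
            raw = row.get(field) or ""
            try:
                datetime.strptime(raw, TIME_FORMAT)
            except ValueError:
                bad.append((line_no, raw))
    return bad
```

An empty list means every endDate value matched the assumed format.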
Select Settings > Add Data (icon) > Upload > Select File
Next > Change timestamp > Advanced >
- Timestamp field (endDate)
- Save as (csv:apple:health)
- Next
- Host field value (iPhone, Apple Watch etc.)
- Index (apple)
- Review
- Submit
The process is now complete. We can now use Splunk’s Search Processing Language (SPL) to run some searches.
Analysing The Data
First Step taken with iPhone
index="apple" sourcetype="csv:apple:health" source="stepcount.csv" host="Alqanit’s Apple Watch"
| stats earliest(_time) AS _time
SPL used for Splunk Search.
Total number of steps
index="apple" sourcetype="csv:apple:health" source="stepcount.csv" host="Alqanit’s Apple Watch"
| stats sum(value) AS TotalSteps
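As a cross-check outside Splunk, the same two answers can be computed directly from StepCount.csv. This is a sketch under assumptions: the endDate and value columns come from the article, while the sourceName column and the timestamp layout are assumed, and the function name is hypothetical.

```python
import csv
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout


def first_step_and_total(csv_path, source_name=None):
    """Return (earliest endDate, total steps), optionally for one sourceName."""
    earliest, total = None, 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Optional filter, similar in spirit to the host= term in the SPL.
            if source_name is not None and row.get("sourceName") != source_name:
                continue
            when = datetime.strptime(row["endDate"], TIME_FORMAT)
            if earliest is None or when < earliest:
                earliest = when
            total += int(float(row["value"]))
    return earliest, total
```

Filtering on one device is useful because a phone and a watch may both record the same walk, so summing every row can overcount.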
Highest steps in one day
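One way to answer this outside Splunk is to total the steps per calendar day and take the maximum. The sketch below assumes the endDate and value columns of StepCount.csv and the timestamp layout shown in TIME_FORMAT. It buckets each row by the date in the timestamp's own UTC offset, which may differ from how Splunk buckets days for your user timezone. The function name is hypothetical.

```python
import csv
from collections import Counter
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout


def busiest_day(csv_path):
    """Return (date, steps) for the day with the highest step total."""
    per_day = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            day = datetime.strptime(row["endDate"], TIME_FORMAT).date()
            per_day[day] += int(float(row["value"]))
    # most_common(1) yields the single (day, total) pair with the largest total.
    return per_day.most_common(1)[0] if per_day else None
```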
Most active hour/days in the week
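The same CSV can be grouped by weekday and by hour of day to find the most active periods. This is a sketch under the same assumptions as before (endDate and value columns, timestamp layout as in TIME_FORMAT), with a hypothetical function name.

```python
import csv
from collections import Counter
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def steps_by_weekday_and_hour(csv_path):
    """Return two Counters: steps per weekday name and steps per hour (0-23)."""
    by_weekday, by_hour = Counter(), Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            when = datetime.strptime(row["endDate"], TIME_FORMAT)
            steps = int(float(row["value"]))
            by_weekday[WEEKDAYS[when.weekday()]] += steps  # Monday is index 0
            by_hour[when.hour] += steps
    return by_weekday, by_hour
```

Calling most_common(1) on either Counter gives the busiest weekday or hour.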
Conclusion
There are many ways to improve this project. Below are some of the improvements that can be made:
- Have a clear goal for why you want to analyse your data. For example, find the correlation between your weight gain/loss and your daily step count
- Automate the exporting and converting of data by using methods such as “HTTP Get input”, iPhone shortcut automation or iPhone triggers to send the data after a certain time or any other metric
- Use the iOS app “Health Auto Export” to automatically collect Apple Health data and send it to Splunk
- Use a combination of third-party apps such as “MyFitnessPal” and “Strava” to get a wider picture of your physical and health habits
- Use Splunk to build a heat map of your locations and find the areas where you take the most steps
- Find accompanying photos of specific days, such as a picture from the day you took your first step with your iPhone
- Incorporate medical health data from your medical provider to see the correlation between illness and daily activity
- Find the times where your heart rate was abnormally high and find pictures of these moments to figure out what caused such spikes
When it comes to analysing your data, the sky is the limit. You can answer specific questions about your daily habits and make improvements in your life, and along the way you can track your goals and keep yourself accountable.
Thank you for reading and now Go Splunk Yourself!