Reflecting Through My Own Data
On the final day of 2020, I will look back through the data that is most plentiful and accessible to me…my personal data. In many ways, visualizing my own data is a more efficient and effective way of introducing myself.
I have two main takeaways from the days I have spent sifting through this data:
- When you have a deep familiarity with a topic (and I happen to be an expert on my own activity), the number of stories you can tell with the relevant data is nearly endless.
- The big tech companies definitely have the ability to know more about their users than the users know about themselves.
In my case, these points especially apply to my Google data. And while data privacy issues become more apparent as each month passes by, I do appreciate access to my records, especially when I am trying to reflect, revisit or improve upon something in my past.
I will explore data that dates back up to six years, but I will focus on 2020 since it is very apparent how this year was unique.
I’ve tracked my location history for the last 6.25 years:
I have definitely made my way around Manhattan but apparently I have no use for the Lower East Side (or most of the other boroughs).
In terms of where I’ve lived, my move to Durham has been my first departure from the northeast.
This data can also be used to tell the story of a single trip. For instance, here is my birthright trip to Israel from 2018:
My other personal datasets that extend back multiple years are mostly related to my “health.” Well, at least my weight and my movement. I’ve started to try tracking my diet at various points…but it’s much easier to just commit to stepping on a scale every day.
2015 and 2016 demonstrate what can happen when you spend much of your time at baseball stadiums. 2017 shows what can happen when you start strength training. 2020 depicts what can happen when you begin to bulk for two months and then a pandemic causes quarantines and gym closures. (I’m told there is not much historical data on this last situation.)
My weight has not correlated with my step counts, which have generally been all over the map:
You can see the height of the shutdown in New York in April 2020, but I find it interesting that from May through the summer, my median step count was as high as it had been in four years, even if my average step count was nothing special. This emphasizes how I had to focus on a consistent cardio routine with a lack of other options (fitness-related or otherwise).
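To see how a median can be high while the average is nothing special, here is a minimal Python sketch. The daily step counts are made up for illustration and are not my real data; the names `steady_week` and `spiky_week` are hypothetical.

```python
from statistics import mean, median

# Hypothetical daily step counts (invented for illustration only).
# A steady routine: nearly every day lands close to 9,000 steps.
steady_week = [9000, 9200, 8800, 9100, 8900, 9000, 9300]
# A spiky week: mostly low days plus two very long outings.
spiky_week = [4000, 4500, 5000, 4200, 4800, 20000, 20800]


def summarize(steps):
    """Return (mean, median) for a list of daily step counts."""
    return mean(steps), median(steps)


for label, steps in [("steady", steady_week), ("spiky", spiky_week)]:
    avg, mid = summarize(steps)
    print(f"{label}: mean={avg:.0f}, median={mid:.0f}")
```

Both weeks add up to the same total, so their averages match, but the median of the steady week is far higher. A consistent routine shows up in the median long before it shows up in the mean.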
Four of the five months I was able to average over 10,000 steps per day came in the summer of 2017. And just so it’s clear that I have not manipulated any of this data, in April 2019, I fell short of this goal by averaging 9,999 steps per day.
It’s also interesting to note that my step counts did not have a strong correlation with my “flight count” (where one flight is counted for every 10 feet of elevation climbed).
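For readers who want to check a claim like this on their own exports, one common measure is the Pearson correlation coefficient. The sketch below is an assumption about how one might compute it, not the method behind my charts, and the monthly numbers in it are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly averages (invented): steps per day and flights per day.
monthly_steps = [7200, 8100, 9900, 10400, 6800, 7500]
monthly_flights = [14, 9, 8, 11, 13, 10]

print(f"r = {pearson(monthly_steps, monthly_flights):.2f}")
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.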
I prefer to spend my summers walking (and running) on flat surfaces. I like to spend my Novembers sitting and lying down on comfortable surfaces.
However, being in North Carolina this year, I went on more runs than usual this November.
Many of those shorter runs in November were part of the “Covid-Test Mile.” This is a great biweekly tradition at Duke where you run to campus to get your covid test (and the distance just happens to be one mile). Who needs Fuqua Fridays?!? I am told my personal record (7:51) is actually the all-time record for this race! Although, I’m not sure I trust Strava on that one…all my other times were at least 20 seconds longer. At least it is pretty evident that my runs have improved (both in terms of time and speed) as the weather has grown cooler.
Before I wrap up my “health” data, I would like to take this opportunity to complain about how there is no way that my scale accurately keeps track of the body composition measures outside of my weight.
Something tells me weight and lean body mass are not always going to be complete mirror images of each other, especially when weight is gained (and lost) in different manners. There needs to be a place on the Internet that reviews scales (or other tools that track body composition) where users just drop their data. Seeing a chart like this in advance would be much more valuable to me than some one-star review by a disgruntled Amazon reviewer.
Music Streaming Data
I guess this will be my own personal Spotify Wrapped. The data I was able to obtain spans 13 months from November 25, 2019 to December 27, 2020. In some instances, I will use just the 2020 data, but the extra month is important in further conveying how this was not a typical year of streaming. In fact, by combining my data with the statistics from my Spotify Wrapped stories (which only include data from the first 10 months of each year), I estimate that I spent 45% less time listening to music this year than in 2019 (and more than 50% less than in 2018 or 2017)!
It is not difficult to figure out when covid restrictions were put into place. That sharp drop took place between the second and third weeks of March. In many ways this goes hand-in-hand with my movement data. I generally listen to music when I am active (either walking or exercising). In fact, look at how this graph lines up with my step count chart:
They sync up to each other pretty well. And keep in mind that the data comes from two completely different sources.
Sunday nights are prime time to either scurry to get work done or mindlessly relax before the week, so they are an example of a very off-peak listening time for me. Then again, some would say that days of the week were just constructs in 2020.
The main trend here is that I’m most active in the late afternoon.
Alright, now I will briefly touch on the actual content of the music. But it is very important that I give a disclaimer.
I do not have “good” taste in music. In fact, I may have sent you here to read this if you asked me the dreaded question about what kind of music I like.
I sample a bunch of songs on Spotify, pick the ones that get me going and make them into a playlist. I pay very little attention to the Artist (or genre, but it’s mostly pop). (Unless it’s the great Kelly Clarkson…but unfortunately she was nowhere to be seen on my 2020 playlists.)
After I start to grow tired of my current personal playlist, I start using Spotify’s trending playlists and its algorithm-generated recommended playlists to save new songs that I like. After I feel I have a sufficient list of new songs, I go back through every single song on my last playlist, removing the songs that I no longer enjoy. Then I create my new playlist. This whole ordeal (rather, process) takes around a month.
In fact, you can see my process of creating playlists from this graph:
The dips in average minutes combined with the spikes in average count represent the weeks in which I am skipping through songs to curate a playlist. You can see this has taken place on three occasions in 2020. I realize this is very inefficient, so if Spotify or anyone else can offer an algorithm that can determine when I get tired of a song, it would be greatly appreciated.
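The pattern described here (many streams, few minutes) can also be flagged programmatically. The following is only a sketch under my own assumptions: the weekly totals are invented, and the two threshold factors are arbitrary choices rather than anything Spotify provides.

```python
from statistics import median


def curation_weeks(weeks, count_factor=1.5, minutes_factor=0.75):
    """Return labels of weeks with unusually many streams but few minutes.

    `weeks` is a list of (label, stream_count, minutes_listened) tuples.
    Both factors are arbitrary thresholds relative to the median week.
    """
    typical_count = median(count for _, count, _ in weeks)
    typical_minutes = median(minutes for _, _, minutes in weeks)
    return [
        label
        for label, count, minutes in weeks
        if count > count_factor * typical_count
        and minutes < minutes_factor * typical_minutes
    ]


# Hypothetical weekly totals (invented for illustration).
sample_weeks = [
    ("week 1", 120, 400),
    ("week 2", 110, 380),
    ("week 3", 300, 200),  # lots of skipping: many streams, little listening
    ("week 4", 130, 420),
    ("week 5", 125, 410),
]

print(curation_weeks(sample_weeks))
```

Comparing each week to the median keeps a single extreme week from shifting the baseline it is measured against.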
Anyway, back to looking at the content…
It was difficult to determine the metric to use to rank my most-listened-to Artists of the last year. Total minutes is slightly biased by length of song, and number of streams (plays) may be biased by songs that I skip through or by the playlist-creation process. I decided to use a combination of both measures to rank my top Artists of the last 13 months.
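There are many ways to combine two measures like these. One possibility, shown here purely as an illustration and not necessarily the formula behind my ranking, is to scale each measure by its maximum and take a weighted blend. The artist names, numbers, and the `minutes_weight` parameter are all hypothetical.

```python
def rank_artists(stats, minutes_weight=0.5):
    """Rank artists by a weighted blend of normalized minutes and plays.

    `stats` maps artist name -> (total_minutes, play_count) and is assumed
    to be non-empty with at least one positive value for each measure.
    """
    max_minutes = max(minutes for minutes, _ in stats.values())
    max_plays = max(plays for _, plays in stats.values())

    def score(item):
        minutes, plays = item[1]
        # Each measure is scaled to the 0-1 range before blending.
        return (minutes_weight * minutes / max_minutes
                + (1 - minutes_weight) * plays / max_plays)

    return [name for name, _ in sorted(stats.items(), key=score, reverse=True)]


# Hypothetical totals (invented for illustration).
sample_stats = {
    "Artist A": (300, 60),   # long songs, fewer plays
    "Artist B": (200, 100),  # many plays, some of them quick skips
    "Artist C": (280, 95),   # strong on both measures
}

print(rank_artists(sample_stats))
```

With equal weights, an artist who is strong on both measures ranks above one who leads on only minutes or only plays.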
Since I almost never choose particular songs to listen to (I just hit shuffle on my playlist), differences over time are largely just due to randomness and playlist changes.
My most-listened-to songs are essentially those songs that remained on my playlist the whole year. In other words, none of them actually could have debuted in 2020.
And if you are interested in the most frequent words in the titles of the songs I’ve listened to this year…
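A word count like this takes only a few lines of Python. This is a minimal sketch: the song titles are invented, and the stopword list is a small hypothetical one that you would want to extend for real data.

```python
import re
from collections import Counter

# A small, hypothetical stopword list; extend it for real data.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "feat", "with"}


def top_title_words(titles, n=3):
    """Return the n most common non-stopword words across song titles."""
    words = []
    for title in titles:
        # Lowercase the title and keep runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", title.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words).most_common(n)


# Hypothetical titles (invented for illustration).
sample_titles = ["Love Me Now", "The Love Song", "Dance With Me", "Love to Dance"]

print(top_title_words(sample_titles))
```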
Now you can officially call me “basic.”
The Twitter Diet
Lastly, I need to point out a truly novel finding from my data. You can drop weight by tweeting less! Here is my irrefutable evidence:
The correlation certainly appears to be there. Just make sure to consult your doctor before starting this diet. (Although I can’t imagine any doctor would say no to spending less time actively using social media.)
There may be another less useful diet: the No-Competent-U.S.-Leadership Diet. However, I actually think the causation falls the other way with this. I believe my weight (or potentially my tweet count) determines the proficiency of the White House.
Looks like I’m going to have to do a lot of eating (or tweeting) in the coming days to ensure a successful transition of power.
And as 2021 moves along, let’s hope for increased travel, streams and steps. I truly look forward to measuring the improvement from 2020.
Personal data collected from Google, Apple, Strava, Spotify and Twitter.
Visualizations created with Tableau.
Check out my GitHub for the code behind each post.