Automating Mundane Web-Based Tasks With Selenium and Heroku
Automating away those unceasing repetitive tasks the easy way
The Task At Hand
Being the avid runner and data lover that I am, naturally, I like to get the best out of logging my training. The solution I settled with a few years back was to log my training on both Strava, and Attackpoint. While both platforms provide a service for tracking exercise using GPS data, Strava has an emphasis on being a social network and it allows you to look into each activity in an in-depth way, something that Attackpoint lacks; Attackpoint on the other hand is more barebone and I personally prefer it for looking at my training as a whole (in timescales of weeks/months/years).
The one problem I have, however, is that I always struggle to keep both of these logs up to date. Garmin (the maker of my running watch) syncs up with both websites, via an API, so my runs magically appear on both websites, so the problem isn't that my runs are only appearing on one website. The problem I have is that keeping both websites up to date with run notes/descriptions is a tedious affair. I tend to always keep Strava up to date and dip in and out of posting on Attackpoint whenever I can be bothered. So why not just automate away my commitment issues to these websites?
The task: every time I run and write about it on Strava, take that description and add it to the corresponding run post on Attackpoint.
The Game Plan
So I had a why, all that was left was to figure out how. Choosing to go from Strava to Attackpoint, and not the other way around, was an easy one. Firstly, Strava is way nicer to use, and secondly, the website is more modern, slick, and fancy so writing code to post and interact with it, although doable, would be more of a challenge. So how did I actually do this?
Enter Selenium. Selenium is a Python package that launches and controls a web browser. It is able to fill in forms and simulate mouse clicks in this browser. An excellent tutorial on Selenium, which I used a lot, can be found here: https://towardsdatascience.com/controlling-the-web-with-python-6fceb22c5f08.
With these kinds of things, I find that it's always good to write down a plan of action; whether that be on paper or on your computer, it's totally down to personal preference, I tend to go paper. More often than not I end up deviating substantially from the initial plan when things inevitably go wrong but either way, I find writing down initial ideas/thoughts provides clarity and gives me a sense of direction.
So here is a rough outline of how I chose to automate this task:
- Make a call to the Strava API to get an ordered list of my recent run's unique IDs. This wasn't originally in the plan but obtaining my activities + the corresponding information in time order from Strava proved way harder than I initially thought due to added complications (for example, a lot of my runs group runs, and the html gets tricky here). Having these IDs made extracting the information was easier.
- Use Selenium to log in to Strava, navigate to the ‘Your Activities’ tab, retrieve the information for runs corresponding to each unique activity ID. As you can see in the code below, Selenium couldn't be easier to use as it's so intuitive. The code below is a rough skeleton, the full code + comments can be found on my Github page.
- Now we have all my most recent activities (in order of completion!) along with the activities details: time, title, description, distance, and pace. All that is left to do now is add the title and descriptions to my Attackpoint page, although there are a few minor subtleties. First of all, if any activities have descriptions already I want to leave them alone. Secondly, if any of the Strava titles are the default i.e. haven't been changed, I also want to leave the post alone as I haven't gotten around to name the Strava run yet. As with before, some code details have been left out for clarity.
And we are done, if I run attackpointSelenium.py in the terminal it will add any missing descriptions to my Attackpoint!
Making The Automatic Fully Automatic
Obviously running a line of code once or twice a day is WAY too much effort so the last thing that was left to do was deploy the code online somewhere so that I was completely out of the loop. Following this excellent tutorial I was able to deploy my code to Heroku and schedule it so that my code checks to see if my Attackpoint needs updating every 10 minutes! And best of all, it's completely free.
TL;DR wrote code to automatically post information from one website onto the corresponding section of another.
Thanks for reading and I hope you enjoyed it. Here are some of my other stories you may be interested in…