One of my friends was interested in trying to map out the number of buses needed in order to get to Facebook in Seattle. He was mentioning that he had a problem trying to figure out what was water and what was land using the Google Maps API. After some discussion of possible solutions, searching online, and viewing his prototype, I figured this would be an interesting problem to try to solve. Besides, how hard would it be to 1) determine water and land on Google Maps using their API, 2) determine the number of buses needed to reach Facebook across the map, and 3) display the data in a pleasing way?
Make a visually pleasing map that shows the number of buses needed to travel to the Facebook office from any location in the greater Seattle area.
Problem 1a: What is water?
I tackled the first problem: given a random GPS point in the Seattle area, can I determine whether it’s over water or land? You would think the provided Maps API would give enough information, but this turns out to be a hard problem using only their endpoints. The first solution would be to use their reverse geocoding to determine the type at that coordinate, as it returns an array that says land, sea, ocean, lake, etc. However, it turns out this type is unreliable, and many times, when over water, it returns an incorrect type. Trying to determine it based on elevation also didn’t work, as the elevation varied too much across both water and land.
There was another suggestion of doing image processing, where you take a static map and go through the pixels to determine if it’s water based on the pixel color. At first glance, this seems like a difficult way of doing it, as now you need to figure out which shades of blue are water, but Google Maps has a style API which allows you to customize the colors on the map for different features. This seemed like the way to go.
Doing some more searching, I found the exact style needed to get the result I wanted, which generated the map below (same as the link above).
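As a rough illustration of the styling idea, here is a minimal sketch that builds a Static Maps URL with water painted black and labels hidden. The exact style the post used is not shown, so the two style rules below are assumptions, and `build_water_mask_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

STATIC_MAP_ENDPOINT = "https://maps.googleapis.com/maps/api/staticmap"


def build_water_mask_url(lat, lng, zoom, api_key, size=640):
    """Build a Static Maps URL whose water features are drawn in pure black.

    The style rules are illustrative guesses, not the post's actual style.
    """
    params = [
        ("center", f"{lat},{lng}"),
        ("zoom", str(zoom)),
        ("size", f"{size}x{size}"),
        # Hide labels so text does not show up as stray pixels.
        ("style", "feature:all|element:labels|visibility:off"),
        # Paint every water feature black.
        ("style", "feature:water|element:geometry|color:0x000000"),
        ("key", api_key),
    ]
    # A list of tuples lets the "style" parameter repeat in the query string.
    return f"{STATIC_MAP_ENDPOINT}?{urlencode(params)}"


url = build_water_mask_url(47.6062, -122.3321, 11, "YOUR_API_KEY")
```

The center, zoom, and key here are placeholders; only the repeated `style` parameter matters for the technique.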
Opening this map in Python and checking a few pixels, this indexed PNG gave a value of 0 for the black areas (water) and some value between 1 and 255 for the rest, which was perfect. Now I know exactly what is water and what is land. But now I have another problem.
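The pixel check can be sketched as follows, assuming the palette indices have already been read into a list of rows. With Pillow, an indexed PNG should open in mode "P" and `getpixel((x, y))` should return the palette index, but the sketch sticks to plain lists so it runs without any dependency. The function names are hypothetical.

```python
WATER_INDEX = 0  # palette index the post observed for the black (water) areas


def is_water(index_rows, x, y):
    """Return True if the pixel at column x, row y has the water index."""
    return index_rows[y][x] == WATER_INDEX


def land_points(index_rows, step=1):
    """Yield (x, y) for every sampled pixel that is not water."""
    for y in range(0, len(index_rows), step):
        for x in range(0, len(index_rows[y]), step):
            if not is_water(index_rows, x, y):
                yield (x, y)


# Tiny made-up 3x3 "image": 0 is water, anything else is land.
sample = [
    [0, 0, 7],
    [0, 3, 7],
    [5, 5, 7],
]
land = list(land_points(sample))
```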
Problem 1b: What is the GPS coordinate of a pixel?
Luckily for me, some other people doing research on this figured out the math to it. In this StackOverflow question, there’s a Python function that did exactly what I needed for the static image I had. The max resolution of a static Google Map is 640px x 640px (for a free account), and knowing the zoom level and the center coordinate of the image, I just had to plug in the pixel coordinate to get a GPS coordinate.
Would the distance between adjacent pixels be granular enough for what I was trying to do? Since I know the GPS coordinates of any pixel, I did some math to determine that the distance between two adjacent pixels is about 169 ft (51.5 m), which is more detail than I expected. In the end, I chose to use a 5 pixel distance (845 ft/258 m) between each location to check.
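The StackOverflow function itself is not reproduced here, so the following is a sketch of the standard Web Mercator conversion that such functions are usually based on, assuming 256-pixel tiles. It may differ in detail from the one the post used, and the function names are hypothetical.

```python
import math

TILE_SIZE = 256  # world size in pixels at zoom 0 (Web Mercator convention)


def latlng_to_world(lat, lng):
    """Project a latitude/longitude onto zoom-0 world coordinates."""
    siny = math.sin(math.radians(lat))
    x = TILE_SIZE * (0.5 + lng / 360.0)
    y = TILE_SIZE * (0.5 - math.log((1 + siny) / (1 - siny)) / (4 * math.pi))
    return x, y


def world_to_latlng(x, y):
    """Invert latlng_to_world."""
    lng = (x / TILE_SIZE - 0.5) * 360.0
    merc = math.pi - 2.0 * math.pi * y / TILE_SIZE
    lat = math.degrees(math.atan(math.sinh(merc)))
    return lat, lng


def pixel_to_latlng(px, py, center_lat, center_lng, zoom, width=640, height=640):
    """Convert a pixel in a static map image to a GPS coordinate.

    (px, py) is measured from the top-left corner of the image.
    """
    scale = 2 ** zoom
    cx, cy = latlng_to_world(center_lat, center_lng)
    wx = cx + (px - width / 2) / scale
    wy = cy + (py - height / 2) / scale
    return world_to_latlng(wx, wy)
```

Pixel rows grow downward in the image, so a larger `py` maps to a smaller latitude.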
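The per-pixel distance can be sanity-checked with the usual Web Mercator ground-resolution formula. The post does not state the zoom level or the exact latitude, so zoom 11 and latitude 47.6 below are assumptions, chosen because they come out close to the quoted 51.5 m.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius used by Web Mercator
TILE_SIZE = 256
FEET_PER_METER = 1 / 0.3048


def meters_per_pixel(lat, zoom):
    """Approximate ground distance covered by one map pixel at this latitude."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    return math.cos(math.radians(lat)) * circumference / (TILE_SIZE * 2 ** zoom)


# Assumed values: latitude of roughly central Seattle, zoom level 11.
one_pixel_m = meters_per_pixel(47.6, 11)
five_pixels_m = 5 * one_pixel_m
one_pixel_ft = one_pixel_m * FEET_PER_METER
```

Under these assumptions, one pixel is roughly 51.5 m (about 169 ft) and five pixels roughly 258 m, in line with the figures above.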
Problem 2: How many buses does it take to get to the FB office?
I chose a 400px x 400px area on the map to start mapping out the number of buses. Using the Directions API, I just plopped down the starting GPS coordinate as the origin and the FB office as the destination, chose transit as the option, and got a JSON result that contained a list of steps with the number of TRANSIT types and the total duration to travel there.
However, because I’m using 5px increments, I would need to make at most 6,400 API calls to get the number of buses needed at each of those locations. Using a free API key, the quota is 2,500 API calls per day. As I wasn’t in a rush to get this done, I wrote the program to do as much as it could, recording the results per column of pixels in a file, and did something else while I waited for the quota to reset.
During my testing, I found out I needed to give a specific time for the Directions API, as the time of day influences the number of buses available. Running my script at 1 AM led to several results saying it was faster to walk 2.5 miles because there were no buses running at the time (though thinking about it, walking/biking 2.5 miles wouldn’t be too bad). I set the time to December 6 at 8 AM to get a rough idea of transit to the office during work time.
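Pulling those two numbers out of a parsed Directions response might look like the sketch below. It assumes the response has already been decoded into a dict and uses only the first route and first leg; `summarize_route` is a hypothetical name, and the sample response is made up for illustration.

```python
def summarize_route(response):
    """Return (transit_step_count, duration_seconds), or None if no route exists."""
    routes = response.get("routes", [])
    if not routes:
        return None
    leg = routes[0]["legs"][0]
    # Each step carries a travel_mode such as "WALKING" or "TRANSIT".
    transit_steps = [s for s in leg["steps"] if s.get("travel_mode") == "TRANSIT"]
    return len(transit_steps), leg["duration"]["value"]


# Made-up, heavily trimmed response with one walk and two transit rides.
sample_response = {
    "routes": [{
        "legs": [{
            "duration": {"value": 1800},
            "steps": [
                {"travel_mode": "WALKING"},
                {"travel_mode": "TRANSIT"},
                {"travel_mode": "WALKING"},
                {"travel_mode": "TRANSIT"},
            ],
        }],
    }],
}
summary = summarize_route(sample_response)
```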
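One way to structure a run that stops at the daily quota and resumes later is sketched below. The post's actual program is not shown, so this is only an assumed shape: `fetch` stands in for whatever makes the API call, and `done_columns` for the set of columns already recorded in the file.

```python
def run_until_quota(columns, done_columns, fetch, quota):
    """Process unfinished pixel columns until the next one would exceed the quota.

    columns: dict mapping a column's x value to its list of (x, y) points.
    done_columns: set of x values finished on earlier days.
    fetch: callable that takes a point and returns its result (one API call).
    Returns (results_by_column, calls_made).
    """
    calls = 0
    results = {}
    for col_x in sorted(columns):
        if col_x in done_columns:
            continue  # already recorded on a previous run
        points = columns[col_x]
        if calls + len(points) > quota:
            break  # stop on a column boundary so a column is never half done
        results[col_x] = [fetch(point) for point in points]
        calls += len(points)
    return results, calls
```

Stopping on a column boundary keeps each recorded column complete, which makes resuming the next day a matter of skipping the columns already on disk.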
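A request with a fixed departure time could be built as in the sketch below, using the Directions API's `mode` and `departure_time` parameters (the latter is given in seconds since the Unix epoch). The year, the destination string, and the origin are placeholders, since the post does not give them.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

DIRECTIONS_ENDPOINT = "https://maps.googleapis.com/maps/api/directions/json"
PST = timezone(timedelta(hours=-8))  # Seattle's UTC offset in December


def build_transit_url(origin, destination, departure, api_key):
    """Build a Directions API URL for a transit trip leaving at `departure`."""
    params = {
        "origin": f"{origin[0]},{origin[1]}",
        "destination": destination,
        "mode": "transit",
        # The API expects an integer number of seconds since the Unix epoch.
        "departure_time": int(departure.timestamp()),
        "key": api_key,
    }
    return f"{DIRECTIONS_ENDPOINT}?{urlencode(params)}"


# December 6 at 8 AM local time; the year 2017 is only a placeholder.
departure = datetime(2017, 12, 6, 8, 0, tzinfo=PST)
url = build_transit_url((47.65, -122.35), "Facebook Seattle", departure, "YOUR_API_KEY")
```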
Problem 3: How do I display this data in a pleasing way?
After I tweaked the heatmap radius until it looked good visually, it was a pretty good draft and gave a great overview of travel times at a glance. There was a problem with what was shown though.
If you take a look at the small section in Queen Anne, there appears to be data saying it takes 3 buses to get to Facebook, except the office is really close to there. Why is this the case? At first, I thought it was because the dots were merging together to show an incorrect result, and doing a quick spot check in Google Maps showed that it’s supposed to be 2 buses. However, if you click on the transit option to give more details, you see this:
I didn’t know there were bus lines that transform into another number when they reach one of the stations. This explains why it showed 3 buses, as it considered this line as two buses. The API unfortunately doesn’t have any indication that it’s the same line and doesn’t contain the text “continue on the same vehicle,” so with a bit of brainstorming, I made a small modification. If there are two consecutive transit steps and the arrival time of the first matches the departure time of the next, then it’s the same line. Generally, if you have to transfer to another bus/train, there’s a small walk step in between or the arrival/departure times will differ slightly. I reran the script and the results looked much better.
There were also a few spots on the map where the Directions API didn’t give any results, so scattered throughout are blank spots. To make the data look cleaner, I wrote a second part that looped through the map again and compared it against the JSON output. If there was a part that was land, but didn’t appear in the JSON, I copied the previous value. A median filter would probably work better, but this was good enough. With a small change to the color gradient, I now have the final heatmap.
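The same-vehicle rule can be sketched like this, using the `transit_details` arrival and departure times that the Directions API attaches to each transit step. Treating "previous" as the step immediately before is my reading of the rule; `count_vehicles` is a hypothetical name and the sample steps are made up.

```python
def count_vehicles(steps):
    """Count transit rides, merging back-to-back rides on the same vehicle."""
    count = 0
    previous = None  # the step immediately before, whatever its travel mode
    for step in steps:
        if step.get("travel_mode") == "TRANSIT":
            same_vehicle = (
                previous is not None
                and previous.get("travel_mode") == "TRANSIT"
                and previous["transit_details"]["arrival_time"]["value"]
                == step["transit_details"]["departure_time"]["value"]
            )
            if not same_vehicle:
                count += 1
        previous = step
    return count


def ride(depart, arrive):
    """Build a made-up transit step with Unix-second departure/arrival times."""
    return {
        "travel_mode": "TRANSIT",
        "transit_details": {
            "departure_time": {"value": depart},
            "arrival_time": {"value": arrive},
        },
    }


walk = {"travel_mode": "WALKING"}
renumbered_line = [walk, ride(100, 200), ride(200, 300), walk, ride(400, 500)]
real_transfer = [ride(100, 200), ride(260, 300)]
```

In `renumbered_line`, the first two rides share a timestamp and count once, so the trip uses two vehicles; in `real_transfer`, the times differ and both rides count.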
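The gap-filling pass might look like the following sketch, assuming the land points are visited in the same order as the original scan and the results are keyed by pixel coordinate. Both assumptions and the name `fill_missing` are mine, not taken from the post's code.

```python
def fill_missing(land_points, results):
    """Copy the most recent known value into land points that have no result.

    land_points: ordered list of (x, y) pixels known to be land.
    results: dict mapping (x, y) to the value returned for that point.
    Returns a new dict; points before the first known value stay blank.
    """
    filled = dict(results)
    previous = None
    for point in land_points:
        if point in filled:
            previous = filled[point]
        elif previous is not None:
            filled[point] = previous
    return filled


land = [(0, 0), (0, 5), (0, 10), (5, 0)]
known = {(0, 5): 2, (5, 0): 1}
patched = fill_missing(land, known)
```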
I also made a choropleth map version that allows you to switch between the number of buses needed and the travel time. It’s not as smooth since it uses square blocks for each point, but the colors are more distinct.
In total, I made 11,846 API calls to the Directions API, which required 4 days since I waited a day for the free 2,500 limit to reset (because I’m cheap). It was a fun project to learn about the Google Maps API and how to display different data on it.