Reducing MBTA’s monthly Google API bill from $15,000 to $1,500

Lev Boyarskiy
Mar 26, 2021


Photo by Aaron Doucett on Unsplash

It’s no surprise that working in transit technology often involves presenting riders with accurate real-time and schedule information alongside interactive maps. MBTA.com gets up to 400,000 visits per day, primarily for schedule, route, and station pages which all prominently feature a map — which means we’re serving millions of maps each month!

Up until the summer of 2018, Google Maps was pretty much the standard for high-quality interactive map experiences and the ubiquitous choice throughout the industry. But things changed when Google raised its prices by more than 1400%. Paying 14X more for a map without any apparent improvement in functionality seemed wrong. Thanks to Google, we now had a mandate to evaluate alternatives and hopefully save a few hundred thousand dollars along the way.

Our primary goal in MBTA’s Customer Technology Department is to serve the riders. One of the ways we do this is by eliminating unnecessary technology spending.

After some brainstorming, the solution was born. Here is how it works:

  1. We download the map data from OpenStreetMap for the regions our agency serves: Massachusetts, Rhode Island, and New Hampshire.
  2. We import this data into a PostgreSQL database and merge it with our GTFS data to display transit features on the maps like stops and (potentially) routes.
  3. We apply the visual styles selected by our award-winning design team.
  4. We export the result as a set of PNG files and store them in an AWS S3 bucket.
  5. The mbta.com website uses Leaflet, a JavaScript library, to display the generated map images to visitors in an interactive format.
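To make the idea of map "tiles" concrete, here is a minimal Python sketch of how a latitude/longitude pair maps to a tile index in the standard Web Mercator XYZ scheme, which is the scheme Leaflet's tile layers use by default. That our generated tiles follow exactly this scheme is an assumption made for illustration, and the function name and sample coordinates are hypothetical.

```python
import math


def lat_lon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) index of the XYZ tile that contains the given point.

    Assumes the standard Web Mercator tiling: at zoom level z the world is a
    2**z by 2**z grid, x grows eastward and y grows southward.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


# Illustrative point near downtown Boston at zoom level 10.
print(lat_lon_to_tile(42.3601, -71.0589, 10))
```

Each zoom level quadruples the number of tiles, which is how a three-state service area can add up to millions of small images.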

At the time of writing, we have around 4 million PNG map “tile” images (3,780,455 to be precise), which require around 5.2GB of storage. This is not a huge amount of data by modern standards, but generating it takes significant computing power. Although the tiles are stored and served as static files, they still need to be regenerated periodically: the regions and our transit data change, and these changes must be reflected in our maps.
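As a quick sanity check on those numbers, the average tile comes out to roughly 1.4 KB. The short Python calculation below assumes "5.2GB" means decimal gigabytes (5.2 × 10^9 bytes); with binary gigabytes the figure would be slightly higher.

```python
TILE_COUNT = 3_780_455   # number of PNG tiles, from the figure above
TOTAL_BYTES = 5.2e9      # assumption: 5.2 GB interpreted as decimal gigabytes

# Average size of a single tile in bytes.
avg_bytes = TOTAL_BYTES / TILE_COUNT
print(f"average tile size: {avg_bytes:.0f} bytes")
```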

That is why we decided to leverage AWS Batch to simultaneously launch eight m5d.large spot machines, each of which independently generates the map tiles for its part of the map. We divide our service area into “horizontal” (East-West) stripes, and each of the eight machines generates the map tiles for its assigned stripe. This way, the process takes around 4 hours to complete. We run it twice per month because maps don’t update that often. With the average spot price for an m5d.large machine being around $0.035 per hour, this process costs us around $2.25 monthly, less than a single subway fare.
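The stripe split and the cost estimate can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from our repository: the `BoundingBox` type, the function names, and the bounding-box coordinates are hypothetical, and the stripes here have equal latitude height, which does not guarantee an equal number of tiles per machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float


def split_into_stripes(bbox: BoundingBox, count: int) -> list[BoundingBox]:
    """Split a bounding box into `count` east-west stripes of equal latitude height."""
    if count < 1:
        raise ValueError("count must be at least 1")
    height = (bbox.north - bbox.south) / count
    # Compute the shared edges once so neighbouring stripes touch exactly.
    edges = [bbox.south + i * height for i in range(count)] + [bbox.north]
    return [
        BoundingBox(south=edges[i], west=bbox.west, north=edges[i + 1], east=bbox.east)
        for i in range(count)
    ]


def monthly_cost(machines: int, hours_per_run: float,
                 runs_per_month: int, price_per_hour: float) -> float:
    """Total spot cost per month: machine-hours times the hourly price."""
    return machines * hours_per_run * runs_per_month * price_per_hour


# Hypothetical bounding box that roughly covers the service area.
service_area = BoundingBox(south=41.0, west=-73.6, north=43.0, east=-69.8)
for stripe in split_into_stripes(service_area, 8):
    print(stripe)

# Figures from the text: 8 machines, about 4 hours, twice a month, about $0.035/hour.
print(f"${monthly_cost(8, 4, 2, 0.035):.2f} per month")
```

Because each stripe is independent, one machine's job can be retried (for example after a spot interruption) without touching the others.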

Switching to the new approach reduced our monthly Google API costs from around $15,000 to $1,500 (we still use their Places API). In addition to reducing costs, we also reduced our dependency on proprietary software. We still use some of it (yes, we’ve switched from Google to Amazon), but there is no longer any vendor lock-in. Thanks to Docker, we can run the same tile generation process on any other cloud provider or even locally, as we did during development. Although the PNG files are stored in AWS S3, they are accessed over HTTPS, so they could just as well be served from any other storage or CDN system. Such a migration would be transparent to the frontend as long as the folder structure is preserved.
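The reason such a move is transparent can be shown with a small Python sketch: if tiles are addressed as `{z}/{x}/{y}.png` under some base URL, then changing hosts only changes the base. The folder layout, the function name, and both host names below are assumptions for illustration, not our actual configuration.

```python
from urllib.parse import urlsplit


def tile_url(base_url: str, zoom: int, x: int, y: int) -> str:
    """Build the URL of one tile, assuming a {z}/{x}/{y}.png folder layout."""
    return f"{base_url.rstrip('/')}/{zoom}/{x}/{y}.png"


# Hypothetical hosts: the same tile before and after a storage migration.
before = tile_url("https://tiles-old.example.com", 10, 309, 378)
after = tile_url("https://tiles-new.example.org", 10, 309, 378)

# Only the host differs; the path the frontend asks for is unchanged.
print(urlsplit(before).path == urlsplit(after).path)
```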

Last but not least, as is customary in CTD, the results of this work are freely available to the public and to anyone who wants to reduce their Google API bill: https://github.com/mbta/tile-server
