Visualising on-street parking in Melbourne with open data

Recently, the Eliiza team decided we wanted to wrangle some large, local, geospatial data to build something insightful, useful & pretty.

You can play around with what we ended up building here: https://demos.eliiza.ai/melb-parking/

We saw this as a good opportunity to leverage:

We were keen to build something we could dogfood.

Oh, and did I mention… we wanted to do all of this in a week.

The Eliiza team eating up the end product

Getting started

We decided to loosely follow Andrew Ng’s iterative machine learning life cycle of Idea-Code-Experiment.

We are here

Idea

We explored the open data available on the City of Melbourne’s portal and settled on the parking sensor dataset since it was comprehensive and met our requirements.

After briefing our loved ones, who confirmed there was an appetite for a parking aide, it was full steam ahead! This was a pleasant surprise, since most of us expected a glazed look or a yawn (see below).

Parking data is SAH boring

The value we chose to predict and visualise was the probability of a parking bay being occupied given the day-of-week and hour-of-day.

Hang on a second, isn’t that just a convoluted way of saying the average?

Yep, that’s right. As per the machine learning life cycle, we wanted to build and train a basic system quickly, then rapidly experiment and iterate from there.
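To make the "it's just the average" point concrete, here is a minimal Python sketch of the idea. It assumes each record is a hypothetical (bay_id, timestamp, occupied) observation; the real sensor dataset is laid out differently, and our actual aggregation was done in BigQuery rather than in Python.

```python
from collections import defaultdict
from datetime import datetime


def occupancy_rates(observations):
    """Estimate P(occupied) per (bay, day-of-week, hour-of-day).

    `observations` is an iterable of (bay_id, timestamp, occupied) tuples.
    This record shape is an assumption made for illustration only.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [occupied count, total count]
    for bay_id, ts, occupied in observations:
        key = (bay_id, ts.weekday(), ts.hour)  # weekday(): Monday=0 ... Sunday=6
        totals[key][0] += int(occupied)
        totals[key][1] += 1
    # The "model" is simply the fraction of observations that were occupied.
    return {key: occ / n for key, (occ, n) in totals.items()}


# Made-up observations for a single bay on three Saturdays.
sample = [
    ("bay-1", datetime(2016, 3, 5, 14, 0), True),
    ("bay-1", datetime(2016, 3, 12, 14, 30), True),
    ("bay-1", datetime(2016, 3, 19, 14, 15), False),
    ("bay-1", datetime(2016, 3, 5, 4, 0), False),
]
rates = occupancy_rates(sample)
print(rates[("bay-1", 5, 14)])  # Saturday, 2pm
```

Each value in `rates` is both the historical average and the predicted probability for that bay, day and hour, which is exactly why the first model is so quick to build.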

Code

34GB of data was recorded from almost 20,000 sensors between 2011 and 2016. We streamed this data directly from the data portal to a Google Cloud Storage bucket.

From there, we wrangled the data using BigQuery, pumping out some hefty SQL queries, quickly and cheaply. This is not something we would have been able to achieve on our laptops.

(Watch this space for a more in-depth blog on the work we did in BigQuery)

After spending some time understanding and verifying the data, it was time to build and train our first basic model, ensuring we had accounted for any anomalies that had surfaced during the discovery process. The anomalies consisted of the usual culprits — date & time formatting and missing data.
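As a rough illustration of handling those two culprits, the Python sketch below tries a couple of timestamp formats and drops rows with missing fields. The formats and the field names (`bay_id`, `arrival_time`) are assumptions for the example, not the dataset's actual schema, and our real cleaning happened in SQL.

```python
from datetime import datetime

# Hypothetical formats: the formats found in the real data are not listed here.
CANDIDATE_FORMATS = ("%d/%m/%Y %I:%M:%S %p", "%Y-%m-%d %H:%M:%S")


def parse_timestamp(raw):
    """Return a datetime, or None if the value is missing or unparseable."""
    if raw is None or not raw.strip():
        return None
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # try the next candidate format
    return None


def clean_rows(rows):
    """Keep only rows that have a bay id and a parseable arrival time."""
    cleaned = []
    for row in rows:
        ts = parse_timestamp(row.get("arrival_time"))
        if ts is None or not row.get("bay_id"):
            continue  # skip rows with missing or malformed data
        cleaned.append({"bay_id": row["bay_id"], "arrival_time": ts})
    return cleaned


# Made-up rows showing both kinds of anomaly.
raw_rows = [
    {"bay_id": "1", "arrival_time": "05/03/2016 02:15:00 PM"},
    {"bay_id": "2", "arrival_time": ""},
    {"bay_id": "", "arrival_time": "2016-03-05 14:15:00"},
    {"bay_id": "3", "arrival_time": "2016-03-05 14:15:00"},
]
cleaned = clean_rows(raw_rows)
print(len(cleaned))
```

Dropping bad rows is the simplest policy; whether to drop, repair or impute depends on how much data is affected.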

The results of our basic model were then measured against randomly chosen unaggregated data using the Brier score and…

The model performed better than chance!

Our model’s Brier score was 0.184 (lower is better), successfully beating the chance Brier score of 0.25, which is what you get by always predicting a 50% probability of occupancy. Great result, but clearly more work to do.
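For anyone unfamiliar with it, the Brier score is the mean squared difference between the predicted probability and what actually happened (1 for occupied, 0 for free). The Python sketch below uses made-up numbers, not our data, to show where the 0.25 chance baseline comes from.

```python
def brier_score(predicted, observed):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Made-up outcomes: 1 = bay occupied, 0 = bay free.
outcomes = [1, 0, 1, 1, 0]

# Always predicting 50% gives (0.5)^2 = 0.25 for every observation,
# whatever the outcome, so the baseline is 0.25 regardless of the data.
chance = brier_score([0.5] * len(outcomes), outcomes)

# Made-up predictions that lean the right way score lower (better).
example = brier_score([0.8, 0.2, 0.7, 0.9, 0.3], outcomes)

print(chance, example)
```

A perfect forecaster scores 0 and a confidently wrong one scores 1, so anything below 0.25 means the predictions carry some real signal.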

Experiment

Using Mapbox’s data-driven styling we assigned a colour to each parking bay throughout Melbourne based on the occupancy rate from the model. The colours range from red — “probably occupied” through to green — “probably available”. Select a different day/time and the map updates the occupancy colours accordingly.

Check it out here.
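The colouring described above can be expressed as a Mapbox style expression that interpolates a colour from a feature property. Here is a minimal sketch that builds one in Python as plain JSON. The property name `occupancy` and the use of `fill-color` are assumptions for illustration, not necessarily what our map uses.

```python
import json


def occupancy_fill_colour(property_name="occupancy"):
    """Build a Mapbox 'interpolate' expression mapping 0..1 to green..red.

    `property_name` is a hypothetical feature property holding the
    predicted occupancy probability for the selected day and hour.
    """
    return [
        "interpolate", ["linear"], ["get", property_name],
        0.0, "green",  # probably available
        1.0, "red",    # probably occupied
    ]


# Paint properties for a fill layer of parking bay polygons (assumed geometry).
paint = {"fill-color": occupancy_fill_colour()}
print(json.dumps(paint))
```

Because the colour is computed from the data by the map renderer, switching day or time only requires updating the property value on each bay, not restyling the layer.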

Queen Vic Markets at 4am, 2pm and 7pm on Saturdays

Not surprisingly, the map suggests staying away from the CBD during business hours and the Queen Victoria Markets on the weekend, assuming your goal is finding a park. Although not super insightful, this gave us confidence that our model was behaving correctly.

(Watch this space for a more in-depth blog on the work we did in Mapbox)

Next steps

  • add more features… holidays, events and weather
  • use the live data stream from the sensors to improve the model
  • feed the data into more sophisticated models… linear regression and XGBoost
  • get feedback from the users
  • roll out to different cities

Takeaways

  • lots of valuable open data waiting to be explored
  • the tools required are available at a reasonable cost if not for free
  • there’s an appetite for a parking aide
  • it’s possible to ship an idea within a week