Satellite-based Ground Truth for Parking Availability
Predicting on-street parking space occupancy is very much like weather forecasting. Except that you do not know whether in the end it actually rained or not. So how could you tell if your forecast is any good? Well, that’s what I will explain further.
The parking “situation” in a city has two dimensions, spatial and temporal: it changes over time, and it differs between neighborhoods and streets. Obvious! But that is the core problem. Imagine we build a service that tells a driver how tough it is to find a parking spot on a specific street at a given time of day, and we want to make sure that it gives them a reasonable estimate. To do that, we need a way of knowing the actual truth for many streets spread across the whole city, with repeated measurements over time. Performing such quality assurance in 100 cities (which would barely cover the basics in the US and Europe) is economically infeasible. So, here is what we do.
At Parkbob we use satellite and aerial imagery to validate the quality of a parking availability service. While this does not give us 24/7 coverage, we can observe the parking situation across the whole city instead of covering just a few streets. And this solution easily scales to 100, 1000 or even more cities. As an example, here is a map of Hamburg colored by on-street parking availability (the ratio of free parking spots to total spots), in percent. The availability is extracted from multiple satellite images captured in the fall of 2017.
Colored streets have availability extracted from at least one satellite image, grey streets are streets where parking is not allowed, and the rest is, well, it’s complicated.
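To make the availability figure concrete, here is a minimal Python sketch of the ratio described above. How the values from several images are combined is not stated here, so the plain average in `mean_availability` is only an assumption for illustration, and the spot counts are made up.

```python
def availability(free_spots: int, total_spots: int) -> float:
    """Share of free parking spots on a street, in percent."""
    if total_spots <= 0:
        raise ValueError("street has no legal parking spots")
    return 100.0 * free_spots / total_spots


def mean_availability(observations):
    """Average availability over several images of the same street.

    `observations` is a list of (free_spots, total_spots) pairs, one per
    image. Averaging is an assumption made for this sketch.
    """
    values = [availability(free, total) for free, total in observations]
    return sum(values) / len(values)


# Made-up counts for one street seen in three images.
observations = [(2, 10), (1, 10), (3, 10)]
print(mean_availability(observations))
```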
There are quite a few reasons behind the uncolored parts on the map above. Let me guide you through the process of getting the spots out of the raw satellite imagery and explain what exactly is so tricky here.
Remember I said we can observe the parking situation in the whole city? Well, not quite. There are two major problems that cannot be fixed: trees and satellite positioning.
This is a perfectly valid segment of a street on a perfectly valid satellite image. According to our parking regulations database the right side of the street is mostly parkable, but can you spot any cars?
Here we can objectively judge the occupancy on only part of the street. This is a common problem with satellite imagery: not every street lies directly underneath the satellite’s trajectory, so some are captured from an angle.
Maps, Lanes, and Free Spots
Once we have filtered out all unsuitable streets and run the imagery through the car detection algorithm, there is still a lot to do. Let’s look at an example. In the image below you can see a street segment in Hamburg with blue rectangles representing recognized cars.
Now, count the number of free parking spots you see in the image. How many did you find? The correct answer is one. How so? Here is how we got this result.
You probably noticed that in the image above the cars are not exactly parallel to the street. In fact, when we add the centerline to the picture, it looks quite wrong.
The above example is exaggerated to better illustrate the problem. Interestingly, though, most of the map lines are slightly tilted. And the tilts are just the tip of the iceberg. The street can actually be a bit curved or widen before a turn, with or without an extra lane replacing the parking space. The cars might not be perfectly aligned at the exact moment the satellite image was captured (someone was just pulling out of a spot, changing lanes, etc.). In such cases the solution becomes much less obvious than just fixing a tilt. However, we can fix the above example rather easily. We will assume that there are three lanes and that they are perfectly straight. The rest should be trivial, and you can see the result in the next picture.
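One way to picture the "three straight lanes" assumption is the Python sketch below. It is not Parkbob's actual algorithm: it estimates the street direction from the detected car centers themselves (the principal axis of the points), measures how far each car sits to the side of that axis, and cuts the sorted offsets at the two largest gaps. The function names and the sample coordinates are hypothetical.

```python
import math


def street_direction(points):
    """Return (angle in radians, centroid) of the principal axis of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Orientation of the dominant axis of the 2x2 scatter matrix.
    return 0.5 * math.atan2(2 * sxy, sxx - syy), (mx, my)


def assign_lanes(points, n_lanes=3):
    """Label each car center with a lane index 0..n_lanes-1.

    Assumes straight, parallel lanes and at least one car per lane.
    """
    theta, (mx, my) = street_direction(points)
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal to the street
    offsets = [(x - mx) * nx + (y - my) * ny for x, y in points]
    order = sorted(range(len(points)), key=lambda i: offsets[i])
    # Rank positions where a new lane starts: the largest lateral gaps.
    by_gap = sorted(
        range(1, len(order)),
        key=lambda k: offsets[order[k]] - offsets[order[k - 1]],
        reverse=True,
    )
    cuts = sorted(by_gap[: n_lanes - 1])
    lanes = [0] * len(points)
    lane = 0
    for rank, idx in enumerate(order):
        if lane < len(cuts) and rank == cuts[lane]:
            lane += 1
        lanes[idx] = lane
    return lanes


# Hypothetical car centers (meters) on a street tilted by 0.2 rad:
# (distance along the street, sideways offset) before rotation.
tilt = 0.2
layout = [(0, 3), (6, 3), (12, 3), (18, 3), (3, 0), (15, 0), (0, -3), (6, -3), (18, -3)]
cars = [
    (u * math.cos(tilt) - v * math.sin(tilt), u * math.sin(tilt) + v * math.cos(tilt))
    for u, v in layout
]
print(assign_lanes(cars))
```

Cutting at the largest gaps only works while the lanes are clearly separated; on a curved or widening street, as described above, this simple approach would break down.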
Parking vs Driving
When you were counting spots before, you subconsciously separated the cars and their respective lanes into parking and driving. In our example it is quite simple: if we are only interested in parking, we need to focus on the top and bottom lanes and can safely remove the middle one. However, if the top lane were not visible because of a tall building combined with imperfect satellite positioning, and the driving lane were packed with cars queuing at a red light, the separation would become quite tricky.
Rules and Spots
Now that we know we are only interested in the top and bottom lanes and can position cars on them, we can finally count free spots. Let’s look at the picture again:
Still don’t see why there is only one free parking spot? That is where you need our secret sauce: parking rules & restrictions. Let me color the lanes by parking/no-parking for you.
The no-parking part next to the intersection on the right is slightly larger than the one on the left because of a pedestrian crossing. In fact, one car is parked illegally, which is another interesting indicator of the parking situation. Both gaps between parked cars on the top lane are entryways and are therefore kept free. So, the only free parking spot is the one in the middle of the bottom lane.
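The counting logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production pipeline: each lane is treated as a one-dimensional stretch of kerb measured in meters, the parking rules are given as legal intervals, every car is an interval along the lane, and a spot is assumed to need 5.5 m. All coordinates are invented to mirror the example (driveways on the top lane, a longer no-parking zone on the right, one illegally parked car).

```python
def free_gaps(parkable, cars):
    """Yield (start, end) stretches of legal kerb not covered by any car."""
    for p_start, p_end in parkable:
        cursor = p_start
        for c_start, c_end in sorted(cars):
            if c_end <= cursor or c_start >= p_end:
                continue  # car lies entirely outside the remaining stretch
            if c_start > cursor:
                yield cursor, c_start
            cursor = max(cursor, c_end)
        if cursor < p_end:
            yield cursor, p_end


def count_free_spots(parkable, cars, spot_length=5.5):
    """Number of whole spots that fit into the free legal gaps.

    `spot_length` (meters) is an assumed value, not taken from the article.
    """
    return sum(int((end - start) // spot_length) for start, end in free_gaps(parkable, cars))


def illegally_parked(parkable, cars):
    """Cars whose midpoint lies outside every legal interval."""
    return [
        car for car in cars
        if not any(start <= (car[0] + car[1]) / 2 <= end for start, end in parkable)
    ]


# Top lane: legal stretches are interrupted by two driveways.
top_parkable = [(5, 15), (19, 29), (33, 38)]
top_cars = [(5.2, 9.8), (10.2, 14.8), (19.2, 23.8), (24.2, 28.8), (33.2, 37.8)]

# Bottom lane: one legal stretch, a 6 m gap in the middle, one car in the
# no-parking zone near the intersection on the right.
bottom_parkable = [(5, 32)]
bottom_cars = [(5.5, 10), (10.5, 15), (21, 25.5), (26, 30.5), (33, 37.5)]

total_free = count_free_spots(top_parkable, top_cars) + count_free_spots(bottom_parkable, bottom_cars)
print(total_free)                                       # 1
print(illegally_parked(bottom_parkable, bottom_cars))   # [(33, 37.5)]
```

Without the rule intervals, the driveway gaps on the top lane would be counted as free spots, which is exactly the mistake the colored lanes are meant to prevent.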
How to Evaluate a Model
Now that we have the actual count of free and occupied parking spots on the street (the ground truth), we can get back to the original question: how good is our parking availability service? A baseline for such a service would be a statistical model that, given a proxy data source, predicts the percentage of free and occupied parking spots on a street segment. Knowing the model’s prediction and the actual availability of a street, we can compare them and tell how well our model performs.
Let’s look at an example. The table below shows the model’s predictions for a few streets and the corresponding ground truth values. The measure here is availability: the percentage of free parking spots on a street.
It seems that our model performs fairly well: its prediction is close to the actual value. But that’s not all we are after. We want to build a useful service, and let’s be honest, you don’t need a parking availability service if on average a third or more of the parking spots are free. So, the only part we are actually interested in is this:
Here the model is still quite close to reality; however, there is one tiny but very important detail. Let’s imagine this model is used to choose between two streets on which to look for a parking spot. Consider two scenarios: medium availability (green) and low availability (yellow). In the medium scenario the model fails to distinguish between the two streets, even though one of them is actually less busy. In fact, for a short street, 15% could be just one or two spots that may get occupied by the time the driver arrives there. In the low availability scenario, only one street still has some free parking spots, but according to the model the driver should look for a spot on the other street, where there are none. So, using this model is counterproductive in both cases, even though the model is generally rather correct.
When evaluating such a model, how can we account for both correctness and usefulness? With the above considerations in mind, we might want to focus the evaluation on cases where the availability is low, and put an extra penalty on cases where the predicted availability is higher than the actual availability. Using such a customized evaluation approach in combination with multiple satellite images of the same city, we can choose the best modeling strategy and/or parameterization in terms of end-user experience.
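A minimal Python sketch of such a customized metric could look like this. The weights, the penalty factor and the threshold of one third are assumptions chosen for illustration, not the values we actually use, and the predictions are made up.

```python
def weighted_error(predicted, actual, low_threshold=33.0,
                   low_weight=1.0, other_weight=0.2, over_penalty=2.0):
    """Weighted mean absolute error on availability (percent).

    Streets whose actual availability is below `low_threshold` count more,
    and over-optimistic predictions (predicted > actual) are penalized by
    `over_penalty`. All parameter values are illustrative assumptions.
    """
    total, weight_sum = 0.0, 0.0
    for p, a in zip(predicted, actual):
        err = abs(p - a)
        if p > a:
            err *= over_penalty  # promising spots that are not there hurts most
        w = low_weight if a < low_threshold else other_weight
        total += w * err
        weight_sum += w
    if weight_sum == 0:
        raise ValueError("no observations to evaluate")
    return total / weight_sum


def mean_absolute_error(predicted, actual):
    """Plain mean absolute error, for comparison."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up ground truth and two hypothetical models.
actual = [10, 0, 60]
cautious = [5, 0, 40]     # under-predicts, badly so on the easy street
optimistic = [15, 5, 60]  # small errors, but over-predicts where it is tight

print(mean_absolute_error(cautious, actual), mean_absolute_error(optimistic, actual))
print(weighted_error(cautious, actual), weighted_error(optimistic, actual))
```

On these numbers the plain error prefers the optimistic model, while the weighted error prefers the cautious one, which is the behavior argued for above: a model that sends drivers to a full street should score worse than one that is merely pessimistic.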
Parking availability ground truth is a tricky beast. But there is no way around it if we want to build a useful parking availability service. Satellite imagery seems to be an obvious choice; however, it poses many challenges in the extraction process that go beyond image quality and car detection. Nonetheless, for us it is the main source of ground truth, as it allows us to validate modeling approaches across multiple cities all over the globe regardless of their size, language spoken, or local privacy regulations.
Disclaimer: Part of the data does not reflect the reality and was generated only for demonstration purposes. Algorithm results are stylized for the same reason.