At Unit8, we always strive to find the shortest way to validate the ideas and feasibility of our projects. Hackathons, thanks to their very short time frame, are a great way to practice these skills. That is why a team of four at Unit8 (Adam, Krzysiek, Michal and me) decided to go and compete among more than 1300 coders in the biggest hackathon in Europe: Hackjunction.
Which problem are you solving?
Once there, we decided to start working on the challenge proposed by the Finnish city of Tampere. The goal was to develop and exploit real-time insights about city traffic in order to provide better transportation options to city inhabitants and help them reduce their CO2 footprint.
After discussing with city officials, we realized there existed a network of CCTV cameras installed across the city. Using modern computer-vision techniques, could we exploit that network of cameras to provide automated real-time insights?
Our first conviction was that navigation apps already provide really good itineraries and real-time redirection in case of traffic jams or construction work. However, we’ve all experienced the frustration of spending more time looking for a parking spot than actually driving to the destination. That is still an open problem, one Google has even acknowledged in one of their blog posts. Existing sensor-based solutions (e.g. using in-ground detectors) are expensive to install and operate. In addition, they don’t work well when covered with snow (as often happens in Finland!). Could we do better and provide a cheap and flexible solution to the parking guidance problem for Tampere inhabitants?
Having settled on that problem, it was time to tackle the hard part: design and build a system that would reduce parking time, and come up with solid evidence that such a system would work in the real world. To demonstrate the feasibility of our idea, we needed to focus on answering two questions:
- Can video cameras reliably replace sensors?
- How can Tampere inhabitants benefit from that new system?
How to validate your assumptions most effectively?
With the omnipresence of LTE networks in city centers, combined with recent breakthroughs in computer vision, we were convinced that a solution using connected cameras could prove much more competitive in terms of cost and accuracy than traditional alternatives. How could we demonstrate that?
Test existing Object Detection Algorithms
Our first task was to put state-of-the-art object detection algorithms to the test. Would they provide a sufficient basis to develop a parking detection algorithm satisfying the following constraints?
- Need to work with multiple vehicle types (cars, pickups, trucks, …)
- Need to work in challenging lighting/visibility conditions (being in Finland, we really wanted a solution working under snowy conditions)
The simplest first step we could take was to select a few representative pictures and check the algorithm’s output. Was the state of the art sufficiently robust to handle all these cases?
In addition to pure detection accuracy, we also had to design a solution that would scale to many concurrent camera feeds. For these reasons, we quickly settled on an algorithm called YOLO. YOLO is a state-of-the-art deep learning algorithm for object detection. It not only identifies multiple objects in a given image, but also determines the size and location of these objects. Compared to other existing solutions, it is both very fast and accurate. Having to work in real time with potentially dozens of video feeds, performance was a key criterion.
Following the website instructions, all we had to do was install the dependency:
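At the time, the instructions on the YOLO website amounted to cloning and building darknet (the reference YOLO implementation) and downloading the pre-trained weights; the commands below are reproduced from those instructions, so check the current site for up-to-date versions:

```shell
# Clone and compile darknet (the reference YOLO implementation)
git clone https://github.com/pjreddie/darknet
cd darknet
make
# Download the pre-trained YOLOv3 weights
wget https://pjreddie.com/media/files/yolov3.weights
```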
Then we made the algorithm run on our image test set with a simple launch script:
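A minimal version of such a script, with illustrative directory names rather than the originals, could look like this (darknet writes its annotated output to `predictions.jpg` after each run):

```shell
# Run YOLO detection on every image in the test set and collect
# the annotated outputs into a results directory.
mkdir -p results
for img in test_images/*.jpg; do
  ./darknet detect cfg/yolov3.cfg yolov3.weights "$img"
  mv predictions.jpg "results/$(basename "$img")"
done
```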
And there we had our first results:
As you can see, the results were pretty impressive and gave us the conviction that the algorithm would be good enough to serve as a base for parking detection. We had already validated a crucial step of our project without writing any code of our own!
Develop the Tracking and Parking Detection algorithm
Now that we were convinced by the robustness of the object detection algorithm, we needed to find a way to demonstrate we could develop an algorithm to detect parked cars. In order to achieve that, we would need two elements:
- The ability to run YOLO on video streams
- A way to distinguish parked cars from the other cars
After some googling, we found that a few projects combining OpenCV (a computer vision library) and YOLO already existed. They would allow us to solve problem #1. Once again, trying to re-use existing material as much as possible, we decided to start from the solution described here. We could now annotate each video frame and visualize the result.
The next step was tackling problem #2. From our test images, we noticed that we were only interested in a subset of all objects, namely cars, trucks, and buses. We would leave motorbikes out, as they have special parking spots. Transcribing that into code was easy using the class labels provided by YOLO:
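A minimal sketch of that filter, assuming each detection comes back as a `(label, confidence, box)` tuple (adapt to whatever structure your YOLO wrapper actually returns):

```python
# Classes we consider relevant for parking detection; motorbikes are
# deliberately excluded since they use dedicated parking spots.
VEHICLE_CLASSES = {"car", "truck", "bus"}

def filter_vehicles(detections):
    """Keep only detections whose YOLO class label is a parkable vehicle.

    Each detection is assumed to be a (label, confidence, box) tuple.
    """
    return [d for d in detections if d[0] in VEHICLE_CLASSES]
```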
After running that code on several videos, we started to get a better feeling for how we could design our parking detection. There were three major cases the algorithm would need to take into account:
- ✅ Parked cars: position = static and timeframe > several minutes
- ❌ Car moving on a road: position = moving
- ❌ Car stopped at a traffic light: position = static and timeframe typically < 1 minute
It seemed like the simplest version of the algorithm that would work was simply to select static cars over long enough time periods!
So we needed a way to match bounding boxes between multiple frames and detect the ones repeating over a given period of time. Once again, we settled on starting with the most basic approach possible. Looking at all the previous bounding boxes, we would compare our current bounding box against them and count how many matched it. If we observed the same bounding box over a set period of time (say 1 minute), then we would classify the object at that location as “parked”. That approach assumes that cars do not occlude each other which, in our case, seemed like a reasonable constraint as most cameras would be positioned a few meters above the street.
In order to compare bounding boxes, we only compared their centers, selecting all the ones within a threshold that grows with the size of the box:
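A sketch of that comparison, assuming boxes are `(x, y, w, h)` tuples; the 0.25 scaling factor is an illustrative value, not the original tuning:

```python
def boxes_match(box_a, box_b, rel_threshold=0.25):
    """Return True when two (x, y, w, h) boxes likely show the same car.

    Only the box centers are compared; the allowed distance grows with
    the box size, so large nearby boxes still match despite detection
    jitter. rel_threshold is a hypothetical tuning constant.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Center points of both boxes
    ca = (ax + aw / 2, ay + ah / 2)
    cb = (bx + bw / 2, by + bh / 2)
    dist = ((ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2) ** 0.5
    # The distance threshold scales with the average box diagonal
    avg_diag = ((aw ** 2 + ah ** 2) ** 0.5 + (bw ** 2 + bh ** 2) ** 0.5) / 2
    return dist <= rel_threshold * avg_diag
```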
Then all we had to do was count in how many frames we saw a car in a given position (accounting for some imprecision in YOLO using the coefficient `detection_ratio`):
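In sketch form, assuming a per-frame history of detected boxes and any box-matching predicate such as the center-distance comparison described above (the default ratio of 0.8 is illustrative, not the original value):

```python
def is_parked(history, box, match_fn, detection_ratio=0.8):
    """Classify `box` as a parked car based on recent frames.

    history: list of per-frame box lists covering the observation
             window (e.g. the last minute of frames).
    match_fn: predicate deciding whether two boxes show the same object.
    detection_ratio absorbs frames where YOLO occasionally misses the
    car; the car counts as parked if it was matched in at least that
    fraction of the frames in the window.
    """
    if not history:
        return False
    hits = sum(
        any(match_fn(box, past) for past in frame_boxes)
        for frame_boxes in history
    )
    return hits >= detection_ratio * len(history)
```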
And there we had our algorithm for parking detection. As you can see, even though very basic, the algorithm already proved quite robust to night conditions, snow, and low-resolution images!
Prepare a convincing demo of the algorithm
One last point to address was the need to convince others that our algorithm was working. We could have generated a video and shown that. However, our demo would look much more convincing running on a live video stream where we did not control the events taking place (for example, a live video showing the parking lot of a restaurant in Espoo).
The VideoCapture() method in OpenCV is already compatible with remote video streams. However, in this particular case, the video comes from YouTube, and the YouTube link above is not the direct URL of the video file, so OpenCV cannot open it as-is. Using the library Pafy, we were able to automatically retrieve the correct URL. With that in mind, we could write a function returning the proper OpenCV output (cap):
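A sketch of that helper, assuming `pafy` (with a `youtube-dl` backend) and OpenCV are installed; the imports live inside the function since they are only needed for this demo path:

```python
def open_stream(youtube_url):
    """Return a cv2.VideoCapture reading from a YouTube stream.

    Pafy resolves the YouTube page URL into the direct media URL
    that OpenCV's VideoCapture can actually open.
    """
    import cv2   # OpenCV; lazy import so the sketch stays self-contained
    import pafy  # resolves YouTube page URLs to raw stream URLs
    best = pafy.new(youtube_url).getbest(preftype="mp4")
    return cv2.VideoCapture(best.url)
```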
And that’s it: we had our parking detection algorithm. The complete code for the parking detection demo can be found at:
Digma/hackjunction-free-parking-spot-detector on GitHub
Demonstrate you are solving a real-world problem!
After convincing ourselves of the performance of our parking detection, the second priority was to show how that new information could improve the life of Tampere inhabitants (at least a bit 🙂). If we could integrate that new information into a guidance app (like Google Maps), Tampere inhabitants would be able to park in record time! Since Tampere already provides a mobile application to its population, the functionality could later be added there.
We would just need to add the results derived from the outdoor CCTV camera feeds to the existing indoor parking API. In our case, using Firebase, we developed a basic API that would stream the results of the video-based parking spot detection to the mobile app.
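As a hypothetical sketch using the Firebase Admin SDK and the Realtime Database (the actual hackathon schema, project URL, and credentials are not reproduced here):

```python
def publish_parking_state(camera_id, free_spots):
    """Push the latest detection result for one camera so that
    subscribed mobile clients receive it in real time.

    Hypothetical sketch: database URL and path layout are placeholders.
    """
    import firebase_admin
    from firebase_admin import db

    # Initialize the SDK once per process; "<project>" is a placeholder.
    if not firebase_admin._apps:
        firebase_admin.initialize_app(
            options={"databaseURL": "https://<project>.firebaseio.com"}
        )
    db.reference(f"parking/{camera_id}").set({"free_spots": free_spots})
```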
Querying both APIs (the existing ”Tampere indoor parking” one and ours), our app was now capable of updating the final destination and redirecting the user to the nearest free parking spot automatically!
I won’t go into details regarding the mobile app (as everyone has their own favorite backend and mobile framework), but if you are curious, the complete code from the hackathon is available here:
unit8co/hackjunction-parkmate on GitHub: a better way to manage parking spaces & navigation
Prepare for the questions
And there we had it! After 2 days of hard work, we had a convincing demonstration of the complete solution. Of course, there were still quite a few open questions with our solution:
- How to infer the total number of parking places available? (we are only counting parked cars)
- How to optimize the guidance to free parking places in scenarios with multiple cars?
- How many cameras would we need to cover a given area?
But having already tackled what we believed to be the most challenging problems, we were confident that such questions could be solved.
After pitching our project, we ended up being selected as one of the two winners of the Tampere challenge. A great ending to these two intense and fun days!
Special thanks to Michal, Krzysiek, and Adam!