How to use vehicle sensors to make cities more sustainable
A case study on making cities greener by monitoring vegetation and detecting traffic density, differentiating between heavy vehicles, buses and private transport.
We have thousands of sensors driving through cities. How can we use them ‘for Good’?
At the birth of the AI for Greener Cities Challenge, there was a fleet of delivery vehicles driving around European cities. DPD, its owner, intended to gather sensor data to plan better delivery routes that avoid possible traffic congestion. The initial goal was to save time and reduce carbon dioxide emissions.
But maybe sensor-equipped cars could be put to more use. For example, why not create helpful open source solutions for cities to become more sustainable?
DPD Netherlands teamed up with Jheronimus Academy of Data Science and FruitPunch AI to assemble a team of data scientists and AI enthusiasts to come up with useful applications, and to have some fun with machine learning while doing it.
We, the AI for Greener Cities engineers, accepted the Challenge. Our team was split into 2 groups focusing on the most promising avenues: traffic detection and vegetation monitoring.
Detecting traffic density with YOLO
Team BusAround set out to detect the traffic density caused by buses and other heavy vehicles in order to:
- Help create a route planning system for municipal vehicles that avoids likely traffic congestion, saving time and fuel and reducing emissions
- Visualize the ratio of buses to private transport, giving public services insight into where traffic congestion can be decreased by expanding public transport routes.
“With a background in business strategy I’ve learnt the importance of focus, focus, focus. During this 10-week data science project I discovered its equivalent: scoping, scoping, scoping. We had high ambitions, but realized early on that the given time and resources were limited. The discussion turned from ‘what is interesting to do?’ to ‘what can we do?’” Antoine Miltenburg, AI for Greener Cities Engineer
The aim was to define a model that indicates the ratio between buses and private vehicles, the ratio between heavy and light vehicles, and the traffic density in any given street. This model should work with video recordings from the dashcams or other cameras installed on the DPD delivery vans and on other vehicles (for example a garbage collection truck) with a similar sensor set-up. This is schematically shown here:
The intended result of our proof of concept was a data visualization pipeline. It should show how the output of a deep learning model can be used to understand the video data and to generate insights from it.
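As a rough illustration of the figures such a pipeline would plot, the sketch below aggregates per-frame detections for one street into the two ratios and a simple density measure. It is not the team's code: the category names ("bus", "heavy", "private"), the function name and the choice of "vehicles per frame" as the density measure are all assumptions made for this example.

```python
from collections import Counter


def traffic_summary(frames):
    """Summarise the vehicle categories seen in one street.

    `frames` is a list of lists. Each inner list holds the (hypothetical)
    category names detected in one video frame, e.g. ["bus", "private"].
    """
    totals = Counter()
    for detections in frames:
        totals.update(detections)

    buses = totals["bus"]
    private = totals["private"]
    heavy = totals["bus"] + totals["heavy"]  # assumption: buses count as heavy
    light = totals["private"]
    n_vehicles = sum(totals.values())

    def ratio(numerator, denominator):
        # None signals "undefined" when nothing was seen in the denominator.
        return numerator / denominator if denominator else None

    return {
        "bus_to_private": ratio(buses, private),
        "heavy_to_light": ratio(heavy, light),
        "vehicles_per_frame": ratio(n_vehicles, len(frames)),
    }


if __name__ == "__main__":
    example = [["bus", "private", "private"], ["private", "heavy"], []]
    print(traffic_summary(example))
```

Returning None for an undefined ratio keeps empty streets distinguishable from streets where the ratio is genuinely zero.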
Open-source training dataset for reusable outcomes
The team used the Berkeley DeepDrive dataset, BDD100K, to develop a first model. Training with this dataset offers a good first approximation of real-life scenarios for the future model for several reasons:
- This dataset contains 100K videos collected from more than 50K rides in New York, the San Francisco Bay Area and other urban areas. Each video is 40 seconds long, and the labels include about 16.5K, 10K, 6K, 4K and 179 instances of bus, bike, rider, motor and train respectively, alongside other objects.
- Furthermore these videos contain a variety of scenes, like downtown streets, residential areas and highways.
- The videos were recorded in various weather conditions, and different times of the day.
- Last but not least, training with open source data allows for repeatable use of our results in the future.
For the scope of this Challenge, the videos recorded in downtown New York have been selected.
The next step was model selection. The choice of a model, already pre-trained on the BDD100K dataset, was based on testing three alternatives:
- Faster R-CNN: did not show good results at detecting buses
- ConvNeXt: showed decent results at detecting buses, but was still not accurate enough
- YOLOv5: had good results at detecting buses
We decided to go with YOLOv5 and to look beyond BDD100K by adding the COCO dataset. YOLOv5 trained on COCO gave good results at detecting buses.
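A detector trained on COCO reports class names such as "bus", "car", "truck" and "motorcycle", which still have to be mapped onto the groups this project cares about. The sketch below shows one way to do that, assuming the detections are already available as (label, confidence) pairs. The grouping, the 0.5 threshold and the function name are illustrative assumptions, not the team's settings.

```python
# Assumed grouping of COCO class names into the project's categories.
# The split is illustrative only; other groupings are equally defensible.
COCO_TO_GROUP = {
    "bus": "bus",
    "truck": "heavy",
    "car": "private",
    "motorcycle": "private",
}


def group_detections(detections, min_confidence=0.5):
    """Map (label, confidence) pairs to coarse vehicle groups.

    Detections below `min_confidence` and labels that are not vehicles
    (for example "person") are dropped.
    """
    groups = []
    for label, confidence in detections:
        if confidence < min_confidence:
            continue
        group = COCO_TO_GROUP.get(label)
        if group is not None:
            groups.append(group)
    return groups


if __name__ == "__main__":
    frame = [("bus", 0.9), ("car", 0.4), ("person", 0.95), ("truck", 0.6)]
    print(group_detections(frame))
```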
We also decided to pit the YOLOv5 model against YOLOv7. Version 5 performed better.
Due to time constraints, we made use of available models without fine-tuning. The outcome of this project is therefore not yet ready for real-world implementation. What it does is give direction to where data science could help improve the efficiency of driving routes.
Our team believes that using observations from video recordings will contribute to better data about traffic congestion and about the availability of public transport (and shared mobility) in certain streets at certain times of the day.
Automated urban re-greening suggestion system
Team Green Vegetation went on to develop a computer vision based automated urban re-greening suggestion system. The system analyzes video footage from cameras mounted on trucks and gives insights into places that need more plants and trees, using three metrics to score the city on its greenness.
We looked at data and models that had previously been created to get a better understanding of what needed to be done. Our workflow was planned around achieving one task per week, with weekly meetings for everyone to troubleshoot any issues and set up the tasks for the following week.
Our goal was to create and compare several models that can be used to estimate two things:
- The amount of greenery in an image
- The areas that could potentially become green spaces
Metrics for measuring greenery
Since the driving data mostly consisted of images of urban areas, potential green areas for cityscapes were identified using three metrics:
- NDVI: Normalized Difference Vegetation Index
- GVI: Green Vegetation Index
- VSD: Vegetation Structural Density
NDVI is a metric typically used on satellite imagery to measure how much vegetation an area contains. It does so by comparing reflectance in the near-infrared and red bands: healthy vegetation reflects near-infrared light strongly and absorbs red light. NDVI is typically used in ecology to monitor greenery over large areas. Although NDVI is widely used and useful, it is mostly applied to satellite images, so we needed another metric for street view images.
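For reference, the standard NDVI definition is (NIR - Red) / (NIR + Red), where NIR and Red are the reflectance values in the near-infrared and red bands. A minimal per-pixel sketch follows. The function name and the choice to return 0.0 for a pixel with no signal are assumptions of this example.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance values."""
    total = nir + red
    if total == 0:
        # Convention chosen for this sketch: no signal means no vegetation.
        return 0.0
    return (nir - red) / total


if __name__ == "__main__":
    # Strong near-infrared reflectance and low red reflectance: vegetation.
    print(ndvi(0.5, 0.1))
    # Equal reflectance in both bands gives an NDVI of zero.
    print(ndvi(0.2, 0.2))
```

Values range from -1 to 1, with higher values indicating denser, healthier vegetation.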
GVI is used to compute the amount of greenery in each image, and it can be applied to the images in driving datasets. Although less well known, it suited our project better than NDVI. GVI computes the ratio of green to non-green pixels in an image. It was one of the main metrics used in the project, and NDVI was used to verify the GVI results.
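A pixel-counting metric like this can be sketched on top of a segmentation mask. The example below assumes the mask is a 2D grid of class names and takes GVI as the share of pixels labeled as vegetation out of all pixels in the image, since the text leaves the exact denominator open. The label names and the function name are hypothetical.

```python
def gvi(mask, green_labels=("vegetation",)):
    """Share of pixels in `mask` whose label counts as green.

    `mask` is a 2D list of class names, one per pixel. Which labels count
    as green is an assumption that depends on the segmentation model.
    """
    total = 0
    green = 0
    for row in mask:
        for label in row:
            total += 1
            if label in green_labels:
                green += 1
    return green / total if total else 0.0


if __name__ == "__main__":
    tiny_mask = [
        ["road", "vegetation"],
        ["sky", "vegetation"],
        ["road", "building"],
    ]
    print(gvi(tiny_mask))
```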
The last metric we used was VSD. It shows the proportion of each vegetation type relative to other vegetation (grasses to shrubs, etc.). VSD is useful because some plants can absorb more CO2 than others, such as C3 trees compared to C4 shrubs and grasses. This means VSD can help maximize the amount of CO2 absorbed by green spaces by showing areas where more carbon-efficient vegetation can be planted.
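One possible reading of "proportion of any vegetation type relative to other vegetation" is the share each vegetation class holds among all vegetation pixels. The sketch below computes that from a segmentation mask. The class names ("tree", "shrub", "grass") and the function name are assumptions, and the team's VSD index, which is reported as a single number, may be defined differently.

```python
from collections import Counter


def vegetation_shares(mask, vegetation_labels=("tree", "shrub", "grass")):
    """Share of each vegetation class among all vegetation pixels.

    `mask` is a 2D list of class names, one per pixel. Non-vegetation
    pixels are ignored. Returns an empty dict when no vegetation is found.
    """
    counts = Counter(
        label for row in mask for label in row if label in vegetation_labels
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in vegetation_labels}


if __name__ == "__main__":
    tiny_mask = [
        ["tree", "tree", "road"],
        ["grass", "shrub", "tree"],
    ]
    print(vegetation_shares(tiny_mask))
```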
Modeling & Results
“One of our goals was to identify spots where greenery could be increased. We needed to understand which objects in an image are essential, which are already plants, and which can be removed to make green spaces.” Qiulin Li, AI for Greener Cities engineer
The first step was to use a segmentation model to identify objects such as roads and trees and to sort the objects in an image into categories. The models were trained on the BDD100K and Cityscapes datasets, both of which are open source.
We considered three segmentation models:
- The first model was Meta AI’s Detectron2. We successfully trained the model using sample scripts and computed the VSD and GVI of the BDD100K dataset. The GVI of the BDD100K test set is 0.14232, which indicates a low level of greenery. We also calculated the amount of potential green space relative to the amount that is currently green. Only 0.37612 of all potential green space is currently green.
- The second model was MobileNet + Mask R-CNN, a low-latency region-based convolutional neural network. This model was pretrained on the Cityscapes dataset. We applied the model to data specific to Munich and then computed the VSD and GVI of the results. The net GVI of the dataset is 0.174 and the total VSD index is 0.033. Metrics like these can be used to assess the level of greenery within a city and to guide improvements.
- Segformers were also considered but set aside due to their high complexity and slow inference time.
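The "share of potential green space that is currently green" figure can be illustrated with per-class pixel counts from a segmentation model. The sketch below assumes that potential green space is the sum of pixels that are already green and pixels in classes that could be converted. Which classes count as convertible ("terrain" and "sidewalk" here) is purely an assumption of this example, as is the function name.

```python
def greened_fraction(
    pixel_counts,
    green=("vegetation",),
    convertible=("terrain", "sidewalk"),
):
    """Fraction of potential green space that is already green.

    `pixel_counts` maps class names to pixel counts. Potential green space
    is taken as green pixels plus pixels in the assumed convertible classes.
    Returns None when there is no potential green space at all.
    """
    green_px = sum(pixel_counts.get(label, 0) for label in green)
    convertible_px = sum(pixel_counts.get(label, 0) for label in convertible)
    potential_px = green_px + convertible_px
    return green_px / potential_px if potential_px else None


if __name__ == "__main__":
    counts = {"vegetation": 30, "terrain": 50, "sidewalk": 20, "road": 100}
    print(greened_fraction(counts))
```

Classes that are essential to the street, such as the road itself, stay out of both sets and therefore do not affect the result.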
The most important suggestion we have for future users of these models is to measure urban greenery using all three metrics: GVI, VSD and NDVI. Combined in a calculation with different weights, the indicators can give a comprehensive greenery quality score. It is essential for municipalities to know the actual balance of the city’s vegetation in order to take action.
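A weighted combination of this kind could look like the sketch below, which takes a normalized weighted average of the three indicators. The weights, the metric values and the function name are placeholders chosen for illustration. The project does not prescribe specific weights, and the indicators may need rescaling to a common range before they are combined.

```python
def greenery_score(metrics, weights):
    """Weighted average of greenery indicators.

    `metrics` and `weights` are dicts keyed by indicator name. Every
    indicator named in `weights` must be present in `metrics`.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(metrics[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Placeholder values and weights, not results from the project.
    metrics = {"gvi": 0.2, "vsd": 0.4, "ndvi": 0.6}
    weights = {"gvi": 2, "vsd": 1, "ndvi": 1}
    print(greenery_score(metrics, weights))
```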
The AI for Greener Cities Challenge has been an educational and impactful experience for all of us. With the literature and practical activities, we deepened our AI knowledge and improved our teamwork skills since we worked online in teams from four different continents.
Antoine Miltenburg, Jari Gabriëls, Qiulin Li
AI for Greener Cities Engineers
Team Bus Around Sahil Chachra, Resham Sundar, Shubham Baid, Giuseppina Schiavone, Luca Simonetti, Antoine Miltenburg
Team Green Vegetation Jari Gabriëls, Qiulin Li, Ha Trinh, Claudia Flores-Saviaga, Bruhanth Mallik, Animesh Maheshwari, Alexandre Capt
Head to our website to learn more about current challenges and learn to apply AI in the real world.