Intelligent Pothole Detection

We held our breath as we made the left turn onto Neville Street, just minutes away from Carnegie Mellon University. A collective silence filled the car, and the only thing you could hear was the high-pitched hum of the Prius engine. Four months’ work would be tested in the next few seconds, as we drove over a brutal stretch of potholes. Our eyes were fixed on the iPhone mounted on the windshield of the car, which would (hopefully) alert us if it detected a pothole. As we drove over the line of potholes, the car shook violently.

And sure enough, within a second, the iPhone app sounded its pothole alarm. The three of us exploded in celebration. We had built an intelligent system for road condition assessment.

From Frustration to Inspiration

Umang’s tire after the incident

On a daily basis, average Americans don’t think much about potholes. We have grown accustomed to avoiding them and are slightly annoyed when we mistakenly run over them. That changed for us when one of our team members, Umang, was driving home for summer break from Pittsburgh to New Jersey. It was a scorching 90 degrees. The PA Turnpike’s tar was even hotter. With cars whizzing down the highway at 80+ miles per hour, any road deformity was sure to be dangerous. When Umang ran over a cavernous pothole in the middle lane of the freeway, the car’s tire burst due to impact, speed, and road surface temperature. Thankfully, everyone inside the car was okay.

Inspired by this incident, we got together and looked further into the matter. We realized that poor road conditions are more than just a public nuisance; they cause discomfort to passengers, as well as damage to vehicles and tragic accidents. In the U.S., road-related conditions account for 22,000 of the 42,000 traffic fatalities each year. Besides this tragic cost to human life, damage to vehicles from potholes costs Americans $3 billion a year to fix.

We endure and complain about bad roads all the time, but have no way to detect or report them at scale. Meanwhile, civic authorities are not always aware of present road conditions, and road repairs happen infrequently. As a result, we, as citizens, are left helpless.

Rather than complaining any further, we took matters into our own hands. As a data science research project advised by Professor Zico Kolter, we decided to build a solution: a scalable system to detect potholes and assess road conditions in real-time.

Scoping the Problem

Road assessment and repair is a complicated business involving interactions between city residents, public works departments (PWDs), and private contractors. We couldn’t possibly address all the problems in this space, so we decided to focus solely on road quality assessment. If our system could accurately assess road quality and relay that information, PWDs could devote their time and resources to fixing bad roads instead of identifying them.

Specifically, we scoped the assessment problem down into two sub-problems. First, we would perform road condition classification, differentiating good roads from bad roads. Second, we would do pothole detection, identifying severe or sudden pothole events in otherwise normal roads.

The two problems we wanted to solve

The smartphones in our pockets already have sensors that can help us understand and classify road quality. So, we set out to build a system that leverages smartphone sensors to crowdsource data on public road experience and inform civic authorities about real-time road conditions.

Training the System: A Troubled Start

To classify potholes and road conditions, we needed to build a machine learning system that could ingest smartphone sensor data and produce classifications instantaneously. As with any supervised machine learning system, we needed to train our classification model with labeled training data. Lots of it.

We built two iOS applications for collecting training data. One would run on an iPhone affixed to the windshield of the car and collect sensor data including accelerometer, gyroscope, and speed. The other would run on a phone held by a passenger, who would manually annotate the potholes.

We could then stitch together the data from these apps to create a training data set. This data set could train our model to learn the relationship between sensor measurements and potholes/road conditions. After learning these relationships, the model would be able to classify road conditions given new accelerometer and gyroscope readings.
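As a rough illustration of the stitching step, here is a minimal Python sketch. It assumes that both phones record timestamps on a shared clock, that sensor readings are grouped into fixed windows, and that a window counts as a pothole if the passenger annotated one during it. The function name, the tuple layout, and the 5-second window length are all hypothetical choices for this example, not details of the actual apps.

```python
from bisect import bisect_left


def label_windows(sensor_rows, pothole_times, window_s=5.0):
    """Group time-sorted sensor rows into fixed windows and label each one.

    sensor_rows: list of tuples whose first element is a timestamp in seconds
                 (hypothetical layout; remaining elements are sensor values).
    pothole_times: timestamps (seconds) at which the passenger marked a pothole.
    Returns a list of dicts with the window start, its rows, and a pothole flag.
    """
    if not sensor_rows:
        return []

    start = sensor_rows[0][0]
    windows = {}
    for row in sensor_rows:
        index = int((row[0] - start) // window_s)
        windows.setdefault(index, []).append(row)

    pothole_times = sorted(pothole_times)
    labeled = []
    for index in sorted(windows):
        lo = start + index * window_s
        hi = lo + window_s
        # Find the first annotation at or after the window start and
        # check whether it falls before the window end.
        pos = bisect_left(pothole_times, lo)
        has_pothole = pos < len(pothole_times) and pothole_times[pos] < hi
        labeled.append({"start": lo, "rows": windows[index], "pothole": has_pothole})
    return labeled


# Toy data, made up for illustration: one reading per second for 15 seconds,
# with a single pothole annotation at t = 6.2 s.
rows = [(float(t), 0.0) for t in range(15)]
labeled = label_windows(rows, [6.2])
print([w["pothole"] for w in labeled])
```

In practice the two phones' clocks would need to be synchronized, and a human annotator reacts with some delay, so a real pipeline would probably need a tolerance around each annotation.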

We chose to use the iPhone’s accelerometer and gyroscope because, combined, the sensors offer a rich array of information to quantify the movement of a device. An accelerometer measures linear acceleration while a gyroscope measures angular velocity and captures information about the rotational orientation of a device.

Accelerometer vs. gyroscope. Source:

But training our system was an arduous process, and we faced multiple setbacks along the way. None of us were expert iOS developers, so our apps often failed. Sometimes, our app would crash after finishing a data collection trip, and we would lose precious time and data. After several iterations (and countless bone-jarring potholes), we finally got our training process to work and managed to collect useful data.

Results and Impact

Here are two plots visualizing points from the first three principal components of the gyroscope, accelerometer, and speed attributes, colored by their labels (good road/bad road and pothole/non-pothole).

In both classification problems, the classes are largely linearly separable, although they overlap somewhat. This indicates that our sensors are useful in classifying potholes and road conditions.
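For readers who want to try this kind of visualization, the projection onto the first three principal components can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch: the feature matrix below is random stand-in data, the seven-column layout (three accelerometer axes, three gyroscope axes, speed) is an assumption, and standardizing each column before the projection is our choice for the example rather than something stated above.

```python
import numpy as np


def project_to_principal_components(X, n_components=3):
    """Project the rows of X onto their first principal components.

    Returns the projected points and the fraction of variance explained
    by each retained component.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    scaled = centered / std

    # Rows of vt are the principal directions, ordered by decreasing
    # singular value (and therefore by decreasing explained variance).
    _, s, vt = np.linalg.svd(scaled, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return scaled @ vt[:n_components].T, explained[:n_components]


# Random stand-in data: 200 samples with 7 hypothetical sensor features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))
projected, explained = project_to_principal_components(X)
print(projected.shape)
```

The three columns of `projected` can then be drawn as a 3D scatter plot, with each point colored by its label.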

After training a support vector machine (SVM) for each problem, we evaluated each classifier’s performance on a held-out test set. On the test set, we were able to classify potholes with 93% accuracy (compared to a 90% base rate). We classified road conditions with 94% accuracy (compared to an 82% base rate). We tuned our pothole classification model to achieve a recall of 0.42, meaning that the model could detect 42% of all actual potholes.
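To make the relationship between accuracy, base rate, and recall concrete, here is a small Python sketch. The labels are made up for illustration and are not our test data. The base rate is taken here to mean the accuracy of always predicting the majority class, which is the usual reading of the term.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def base_rate(y_true):
    """Accuracy of a trivial model that always predicts the majority class."""
    positives = sum(y_true)
    return max(positives, len(y_true) - positives) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives that the model caught."""
    actual_pos = sum(y_true)
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_pos / actual_pos if actual_pos else 0.0


# Toy labels, made up for illustration: 1 = pothole, 0 = no pothole.
y_true = [1, 1, 1, 1] + [0] * 16
y_pred = [1, 1, 0, 0] + [0] * 15 + [1]

print(accuracy(y_true, y_pred))   # 0.85
print(base_rate(y_true))          # 0.8
print(recall(y_true, y_pred))     # 0.5
```

Because potholes are rare, a model that never predicts a pothole would match the base rate while having a recall of zero. That is why accuracy is only meaningful next to the base rate, and why recall is reported separately.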

Once we had built robust models for the classification tasks, we developed a third iPhone app that does real-time classification. This app collects data from the phone’s sensors and uses our pre-trained classification models to detect potholes and assess road conditions in the real world. The app displays classification results (good road/bad road, pothole/non-pothole) in 5-second intervals as you drive.

Real-time classification app
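The 5-second classification loop can be sketched roughly as follows. This is a simplified Python illustration rather than the app’s code (the app itself runs on iOS). The `WindowClassifier` class, the summary features, and the threshold rule standing in for the trained model are all hypothetical.

```python
from statistics import mean, pstdev


class WindowClassifier:
    """Buffers sensor readings and classifies each completed time window."""

    def __init__(self, classify, window_s=5.0):
        self.classify = classify      # callable: feature dict -> label
        self.window_s = window_s
        self.buffer = []
        self.window_start = None

    def add_reading(self, t, accel_z, speed):
        """Add one reading; return a label when a window closes, else None."""
        if self.window_start is None:
            self.window_start = t
        result = None
        if t - self.window_start >= self.window_s:
            # The previous window is complete: classify it and start a new one.
            result = self.classify(self._features())
            self.buffer = []
            self.window_start = t
        self.buffer.append((accel_z, speed))
        return result

    def _features(self):
        accel = [a for a, _ in self.buffer]
        speeds = [s for _, s in self.buffer]
        return {
            "accel_mean": mean(accel),
            "accel_std": pstdev(accel),
            "accel_range": max(accel) - min(accel),
            "speed_mean": mean(speeds),
        }


def toy_model(features):
    # Placeholder rule standing in for a trained classifier (an assumption).
    return "pothole" if features["accel_range"] > 1.5 else "no pothole"


# Made-up stream: one reading per second, with a jolt at t = 7 s.
clf = WindowClassifier(toy_model)
labels = []
for t in range(11):
    label = clf.add_reading(float(t), 3.0 if t == 7 else 0.0, 10.0)
    if label is not None:
        labels.append(label)
print(labels)
```

A real deployment would sample far faster than once per second and would pass the window features to the pre-trained model instead of a threshold rule.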

Using classification results from devices running the application, we can produce data-rich maps of the city colored with potholes and road conditions. Here is a map we produced of Pittsburgh road conditions from our trial runs with the app.

Road conditions map of Pittsburgh, PA produced by our system
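One simple way to turn many individual classifications into a map layer is to bucket them into grid cells and compute the share of bad-road reports per cell. The Python sketch below illustrates that idea under our own assumptions: the function name, the report format, and the 0.001-degree cell size (roughly 100 meters of latitude) are hypothetical, and the coordinates are made up.

```python
from collections import defaultdict


def aggregate_by_cell(reports, cell_deg=0.001):
    """Compute the fraction of bad-road reports in each grid cell.

    reports: iterable of (latitude, longitude, is_bad) tuples.
    Returns a dict mapping (lat_index, lon_index) to a fraction in [0, 1].
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [bad reports, total reports]
    for lat, lon, is_bad in reports:
        cell = (round(lat / cell_deg), round(lon / cell_deg))
        counts[cell][0] += int(is_bad)
        counts[cell][1] += 1
    return {cell: bad / total for cell, (bad, total) in counts.items()}


# Made-up reports: three from one stretch of road, one from another.
reports = [
    (40.4441, -79.9461, True),
    (40.4442, -79.9462, True),
    (40.4443, -79.9461, False),
    (40.4500, -79.9500, False),
]
cell_scores = aggregate_by_cell(reports)
for cell, score in sorted(cell_scores.items()):
    print(cell, round(score, 2))
```

Each cell’s score could then drive the color of the corresponding road segment, and averaging over many vehicles would make the map more robust to any single misclassification.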

Imagine these maps and data in the hands of public officials who can put these insights to good use. They can use our system to understand real-time road conditions and direct repairs to areas in need. The data itself can be crowdsourced from vehicles all over the city (think garbage trucks, Lyfts, Ubers, etc.). This collaboration between the consumers and caretakers of road infrastructure could improve the delivery of this key public service.

Looking Ahead: Bringing our Research to the World

What started out as a pet project has evolved into something much larger. Our research paper has been accepted for publication at the Bloomberg Data for Good Exchange in New York City. We have also been invited to present our work at the UChicago Data Science for Social Good conference. We will attend both events later this month. You can read our research paper here.

We want our technology to truly improve road conditions and enhance people’s lives; we cannot just keep this innovation and potential for social good to ourselves. So, we are spinning off our project into a startup, Percepsense. At Percepsense, we will develop scalable solutions that use crowdsourced data to help people and organizations make informed decisions. We have received an initial grant of $2,500 from the NSF I-Corps program to start our venture.

The journey has just begun. We can’t wait for the bumps along the way.

Our Team

We are a team of three Carnegie Mellon University students passionate about building data science products to solve human problems.

Our team with the car we used for training and testing the system.

Team members (left to right): Shouvik Mani, Umang Bhatt, Edgar Xi.

Have questions or ideas? We would love to hear from you! Reach out to us at