Mapping the visual reality of our urban homes

By Nikhil Naik, Camera Culture

What makes a neighborhood look safe, lively, or depressing? And how does a city’s appearance affect the health and behavior of its residents? We can’t answer these questions with census data alone; we need quantitative tools that can measure the visual appearance of a city. StreetScore, a project by the Camera Culture and Macro Connections groups at the Media Lab, is a tool that uses a computer to estimate how safe a street view looks to a human.

By using StreetScore to measure the visual perception of safety of hundreds of thousands of street views from Google Maps, we can generate high-resolution maps which show how safe different neighborhoods of a city appear. These data can help us explore important connections between the appearance of urban environments, crime, and health. Dig a little deeper, and these visualizations can give us valuable information about the history of a city’s architecture, planning, and policies.

How StreetScore Works

StreetScore is a machine learning algorithm. It learned to predict how safe an image looks by using many example images labeled by people as safe or unsafe. These data consisted of 3,000 street views from New York and Boston, obtained from another Media Lab project called Place Pulse, launched by the Macro Connections group in 2011.

From Place Pulse to StreetScore

Place Pulse can be thought of as a “hot or not” game for street views: people are presented with a pair of images and asked to click on the one they think looks safer, or more lively, or more beautiful. The throughput of Place Pulse, however, is limited. Even though online participation can give us hundreds of thousands of clicks, this throughput does not allow us to scale to millions of images. To scale, we need to train a computer to predict how a human will perceive a picture of a street. So how do we train a computer to do this?

To begin the training, we first needed to convert the pairwise comparisons from 3,000 images scored by Place Pulse to image rankings, in much the same way golf players are arranged on a leaderboard as first, second, and so on. We created this leaderboard using the Microsoft TrueSkill ranking algorithm, and scored each image on a scale from 0 to 10. Next, we used “image features” to describe these images. An image feature is an abstract representation of an image that encodes information on the colors, textures, and shapes present in the image. In other words, image features try to quantify things like the greenness of the grass, roughness of the bricks, and pointiness of the roof.
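To make the leaderboard idea concrete, here is a minimal sketch of turning pairwise clicks into per-image scores on a 0 to 10 scale. It does not reproduce TrueSkill; it swaps in a simpler Elo-style update as a stand-in. The function name, the starting rating of 1000, the step size, and the toy comparisons are all assumptions made for illustration.

```python
import random

def rate_from_comparisons(comparisons, n_passes=20, k=16.0, seed=0):
    """Return {image_id: score in [0, 10]} from (winner, loser) pairs.

    Elo-style stand-in for a ranking algorithm such as TrueSkill;
    all parameter values here are illustrative assumptions.
    """
    ratings = {}
    for winner, loser in comparisons:
        ratings.setdefault(winner, 1000.0)
        ratings.setdefault(loser, 1000.0)

    rng = random.Random(seed)
    pairs = list(comparisons)
    for _ in range(n_passes):
        rng.shuffle(pairs)  # avoid always replaying clicks in the same order
        for winner, loser in pairs:
            # Probability the Elo model assigned to the observed outcome.
            expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
            delta = k * (1.0 - expected)
            ratings[winner] += delta
            ratings[loser] -= delta

    # Rescale linearly so the lowest-rated image gets 0 and the highest gets 10.
    lo, hi = min(ratings.values()), max(ratings.values())
    span = (hi - lo) or 1.0
    return {img: 10.0 * ((r - lo) / span) for img, r in ratings.items()}

# Hypothetical clicks: each tuple is (image judged safer, the other image).
clicks = [("img_a", "img_b"), ("img_b", "img_c"), ("img_a", "img_c")]
scores = rate_from_comparisons(clicks)
leaderboard = sorted(scores, key=scores.get, reverse=True)
```

In this toy example `img_a` wins every comparison and `img_c` loses every one, so they end up at the top and bottom of the leaderboard, with `img_b` in between.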

Using the 3,000 examples of street views from Place Pulse, we developed a predictor, or a set of rules, that quantifies the contribution of different image features to the perception of safety. We called this trained predictor “StreetScore.” The training enables StreetScore to evaluate new images. It extracts image features from an image and assigns the complete image a score based on the associations between features and scores learned from the training dataset. In our paper, we show that StreetScore does a good job of predicting the perceived safety of the areas shown in the new images, meaning that we can use it to generate relatively accurate maps of the appearance of different cities.
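The train-then-predict loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the features are simple mean color intensities rather than the richer color, texture, and shape features StreetScore uses, the model is a plain linear fit chosen for brevity, and the tiny "images" and their scores are invented.

```python
def color_features(pixels):
    """Toy image features: mean red, green, and blue intensity, each in [0, 1]."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / (255.0 * n) for c in range(3)]

def train_linear_predictor(features, scores, lr=0.1, epochs=5000):
    """Fit weights and a bias by gradient descent on mean squared error."""
    dim = len(features[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, scores):
            err = sum(w * xi for w, xi in zip(weights, x)) + bias - y
            for i in range(dim):
                grad_w[i] += err * x[i] / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

def predict(model, x):
    """Score a feature vector with a trained (weights, bias) model."""
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, x)) + bias

# Hypothetical training set: three tiny "images" given as lists of RGB pixels,
# with made-up perceived-safety scores on the 0 to 10 scale.
leafy = [(30, 200, 40)] * 4
plain = [(120, 120, 120)] * 4
gloomy = [(40, 30, 30)] * 4
train_x = [color_features(img) for img in (leafy, plain, gloomy)]
train_y = [8.0, 5.0, 2.0]

model = train_linear_predictor(train_x, train_y)
new_image = [(60, 180, 70)] * 4  # an unseen, fairly green image
new_score = predict(model, color_features(new_image))
```

Once trained, the model scores any new image from its features alone, which is what lets the approach scale beyond the images people have actually compared.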

From Algorithm to Visualizations

Once we had a working algorithm, the next step was to create interactive map visualizations based on the thousands of street views evaluated by StreetScore. Jade Philipoom, an MIT undergrad studying electrical engineering and computer science, joined the project as an undergraduate researcher (UROP) to work on the visualizations. She did a great job setting up the pipeline for generating the maps—work that allowed the team to look at different visualizations and make important design decisions. She was then able to incorporate these changes and create the beautiful maps you now see on the website.
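One simple way to turn per-image scores into a map layer is to average them over small latitude/longitude cells. The sketch below is not the pipeline used for the StreetScore maps; the function name, the cell size, and the sample coordinates and scores are assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def grid_average(points, cell_deg=0.01):
    """Average the scores of (lat, lon, score) points within square lat/lon cells."""
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for lat, lon, score in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[cell][0] += score
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}

# Hypothetical scored street views: (latitude, longitude, predicted score).
scored_views = [
    (42.361, -71.092, 6.0),
    (42.362, -71.093, 4.0),
    (40.755, -73.985, 8.0),
]
cells = grid_average(scored_views)
```

Each resulting cell can then be colored by its average score. A smaller `cell_deg` gives a finer map but leaves fewer images per cell.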

Key Insights and Next Steps

This project is at the intersection of a few disparate research areas, namely computer vision, urban science, and data visualization, and we gained key insights in these different fields while working on StreetScore.

On the urban planning side, we want to follow up on a few other ideas that came up. First, we are planning to use StreetScore to study how neighborhoods evolve over time using street views captured at different times. Second, we want to use StreetScore to study the segregation of urban environments due to different urban planning decisions. We’ll post examples of our findings on the StreetScore website. Our data are also being used by different urban planning and economics researchers studying similar questions.

From a computer-vision perspective, it was great to see that there is potential to aggregate information obtained by an algorithm from millions of images to generate meaningful insights about this image collection. As big visual data becomes common in social media, urban planning, public health, entertainment, and many other fields, computer-vision algorithms that can synthesize information from troves of visual data will become important. We are excited to work on algorithms and applications for this fast-developing field. We’re also very interested to see new applications of StreetScore and follow-up projects by researchers, companies, and other users.