Sonic Cities: Listening to Parks During Lockdown

Published in Senseable City Lab · Jul 17, 2020

Written by: Arianna Salazar Miranda, Dylan Halpern, and Fábio Duarte

“Wherever we are, what we hear is mostly noise. When we ignore it, it disturbs us. When we listen to it, we find it fascinating.” (John Cage, 1938)

Cities have changed profoundly as a result of the COVID-19 pandemic. Stay-at-home (SAH) policies suddenly shut down schools, workplaces, and leisure spaces and drastically reduced activity in supermarkets, bars, and restaurants. With most people at home, pedestrian movement has fallen to unprecedented lows in cities across the world. Google’s COVID-19 Community Mobility Reports compare changes in activity across the world to baselines measured from January 3rd to February 6th, 2020. Retail activity in Greater London dropped by as much as 89% below baseline in mid-April, NYC transit ridership declined by an average of 79% in April, and time spent at home increased by over 30% in San Francisco County.¹ As a result, cities are experiencing an abrupt decline in their social, recreational, and economic life.

Figure 1: Changes in Activity across the World from Google’s COVID-19 Community Mobility Reports

The decline in urban movement and commerce is also changing the experience of public space and culture. Media outlets have reported a drop in noise pollution of more than 30% (Bergan, 2020) and a rise in complaints about “noisy neighbors” (Krauth, 2020); the changes in noise levels are so dramatic that they have even been detected in submarine environments (Thomson and Barclay, 2020). Recent work on urban sound has also picked up noises previously masked by the idling engines, the jackhammers, and the honking of the city (Bui and Badger, 2020). The near-universal absence of festivals, performances, and street fairs celebrating spring and summer gives way to quiet, nature, and unexpected movement; human activity is down, but other parts of our cities, and their often-lost sounds, are thriving. In Sonic Cities, we explore how sounds have changed in city parks around the world. Parks bring together the sounds of nature and the city, offering a refuge from the constant roar of urban life.

Sound Walks Before and After

For our analysis, we selected five key urban parks: Central Park in New York City; Golden Gate Park in San Francisco; Marina Esplanade in Singapore; Hyde Park in London; and the Public Garden in Milan. To analyze how sounds changed in these parks, we used audio collected before and during the pandemic. For the pre-pandemic audio, we extracted audio files from a sub-genre of YouTube videos featuring long, un-narrated walks in parks, filmed before January 2020. The recordings ranged from approximately 12 minutes (Milan and Singapore) to more than an hour (New York) and covered significant portions of the parks in our sample. These walks typically pass key landmarks. For example, in Central Park, the route passes the Central Park Zoo, the Mall and Bandshell, and the Central Park Reservoir, giving us rich sound information for important park locations. To record sounds during the pandemic, we followed a series of steps. First, for each park, we digitized the path taken in the original pre-pandemic YouTube recording by tracing the source video’s route, using the landmarks as reference points.² Then, we recruited volunteers via our social networks to follow this digitized path and record the sounds using a smartphone.³ We also asked volunteers to record their GPS position as they walked. This information allows us to verify that volunteers walked at similar speeds, making the paths comparable (see the sketch below). We also encouraged volunteers to walk at the same time of day as the pre-pandemic YouTube videos. For each park, volunteers recreated the walk at least two times.⁴
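As a rough illustration of the speed check, the sketch below computes per-step walking speed from a GPS log. The column names, file layout, and comparability threshold are hypothetical, not the project's actual pipeline.

```python
# A minimal sketch of the speed check, assuming a GPS log with parsed
# 'timestamp' (datetime), 'lat', and 'lon' columns; names are illustrative.
import numpy as np
import pandas as pd

def walking_speeds(track: pd.DataFrame) -> pd.Series:
    """Per-step walking speed (m/s) from a GPS track via the haversine formula."""
    lat = np.radians(track["lat"].to_numpy())
    lon = np.radians(track["lon"].to_numpy())
    dlat, dlon = np.diff(lat), np.diff(lon)
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2)
    dist_m = 2 * 6_371_000 * np.arcsin(np.sqrt(a))  # Earth radius ~6,371 km
    dt_s = track["timestamp"].diff().dt.total_seconds().to_numpy()[1:]
    return pd.Series(dist_m / dt_s)

# Example: treat two recordings as comparable if their median speeds are close.
# walk_a, walk_b = pd.read_csv(...), pd.read_csv(...)
# comparable = abs(walking_speeds(walk_a).median()
#                  - walking_speeds(walk_b).median()) < 0.2  # m/s, illustrative
```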

Table 1 reports the characteristics of each walk for all parks: the total length of the walk, its duration (in minutes), and the dates on which the walk was taken (before and during the pandemic).

Table 1: Audio files recorded by volunteers taking the same path before and during the pandemic

Using Machine Learning to Classify Sounds

To classify the collected audio recordings into distinct sound categories, we used machine learning techniques. The goal of the categorization was to recognize changes in sounds like human voices, emergency sirens, street music, sounds of nature (e.g., birdsong, insects), dogs barking, and ambient city noise. The categorization follows a series of steps. First, we partition every sound file into a series of 4-second sound clips. Second, we construct log-power mel spectrograms for each sound clip, using the Librosa Python library, in order to visualize the spectrum of frequencies in the sounds.⁵ We use the mel scale because it approximates how humans perceive sound: on a linear Hz scale, the difference between 500 Hz and 1,000 Hz is clearly audible, while the equally sized difference between 7,500 Hz and 8,000 Hz is barely perceptible to the human ear. The result is an image representation of each 4-second audio clip, as shown in Figure 2 (a minimal sketch of this step appears after the figure). A total of 5,444 audio clips were processed for the final visualization.

Figure 2: Log-power mel spectrograms for a subset of sound clips
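To make the spectrogram step concrete, here is a minimal sketch using Librosa. The file name, number of mel bands, and output paths are illustrative assumptions, not the project's exact settings.

```python
# A minimal sketch of the clip-splitting and spectrogram step; the file name
# and n_mels are illustrative, not the project's actual parameters.
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

CLIP_SECONDS = 4  # each recording is split into 4-second clips

# Librosa resamples to 22.05 kHz and normalizes samples to [-1, 1] (footnote 5)
y, sr = librosa.load("walk_recording.wav", sr=22050)

n_samples = CLIP_SECONDS * sr
for i, start in enumerate(range(0, len(y) - n_samples + 1, n_samples)):
    clip = y[start:start + n_samples]
    # Mel-scaled power spectrogram, converted to decibels (log power)
    mel = librosa.feature.melspectrogram(y=clip, sr=sr, n_mels=128)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Save an image representation of the clip for the classifier
    fig, ax = plt.subplots()
    librosa.display.specshow(mel_db, sr=sr, x_axis="time", y_axis="mel", ax=ax)
    fig.savefig(f"clip_{i:04d}.png")
    plt.close(fig)
```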

To classify each spectrogram into a sound category, we build on previous efforts documenting how machine learning can be used to classify urban sounds. In particular, we use a Convolutional Neural Network (CNN) model trained on the UrbanSound8K dataset.⁶ UrbanSound8K labels urban sounds with ten categories, including children playing, car horns, and street music.⁷ Not all of these categories are relevant to our context, so we first classified sounds using the original UrbanSound8K categories and then grouped them into four broad categories: human activity, city sounds, birds, and sirens. The city sounds category encompasses background city noise, including sounds from bicycles, cars, and construction work. The birds category covers nature sounds such as birdsong, wind, and the rustling of leaves. The human activity category includes sports, talking, singing, and any other human voice-related activity. Finally, sirens include ambulances, fire trucks, and any other siren-emitting source. We assigned each spectrogram to the sound category with the highest predicted probability, which represents the most prevalent sound in that 4-second clip (a hedged sketch of this step follows below). For visualization purposes, we also recorded the average loudness in 50 frequency bins for each 4-second sound clip (see Figure 3).
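As footnote 6 notes, the classifier runs on a TensorFlow backend via the Keras API. The sketch below shows what such a pipeline could look like; the architecture, the input shape, and the ten-to-four label grouping are illustrative assumptions (UrbanSound8K itself has no bird class, so how birdsong was derived is not captured by this toy mapping).

```python
# A hedged sketch of the classification step (TensorFlow backend, Keras API,
# per footnote 6). Architecture and label grouping are illustrative
# assumptions, not the lab's exact model.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

URBANSOUND8K_CLASSES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark",
    "drilling", "engine_idling", "gun_shot", "jackhammer",
    "siren", "street_music",
]

# Illustrative grouping into the article's broad bins (UrbanSound8K has no
# bird class, so the birds category required handling not shown here).
BROAD = {
    "children_playing": "human activity", "street_music": "human activity",
    "car_horn": "city sounds", "engine_idling": "city sounds",
    "drilling": "city sounds", "jackhammer": "city sounds",
    "air_conditioner": "city sounds", "dog_bark": "city sounds",
    "gun_shot": "city sounds", "siren": "sirens",
}

def build_model(input_shape=(128, 173, 1), n_classes=10):
    """A small CNN over mel-spectrogram 'images' (illustrative architecture)."""
    return keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_model()  # would be trained on UrbanSound8K spectrograms

def classify_clip(spectrogram: np.ndarray) -> str:
    """Assign a 4-second clip to the broad category with highest probability."""
    probs = model.predict(spectrogram[np.newaxis, ..., np.newaxis])[0]
    fine_label = URBANSOUND8K_CLASSES[int(np.argmax(probs))]
    return BROAD[fine_label]
```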

Visualization

Figure 3: Sound Mesh with color gradient indicating noise loudness in Central Park, NYC

The experience of COVID-19 for many, ourselves included, has meant a smaller world. Efforts to minimize exposure shifted transit commutes to trips between rooms, and walks through bustling streets to tactical navigation of public space and shops. The source videos and our volunteers’ field recordings serve as gateways to places that might now be risky or difficult to visit. The Sonic Cities visualization aims to express the sense of place, the simultaneity of city and nature, and the activity that urban parks possess. We wanted to convey the characteristics of space and time connected to these sounds, combining techniques from mapping, media, and digital design. The key toolkits included geospatial data and analysis (QGIS, PyQGIS), audio analysis and data management (Python, Librosa, Pandas, GeoPandas), and interactive visualization (Deck.GL, D3, Recharts, React). Figure 4 shows the resulting visualization for each city. Sonic Cities comprises a series of 3D spatial audio maps and an accompanying narrative website (see online here).

Figure 4: Sound Mesh with color gradient indicating noise loudness for each city

The geospatial processing behind the visualization had a few main steps. First, we digitized the walked path from the source videos, which we used both for the visualization and for the volunteers’ navigation. To do this, we traced each source video’s path for the five parks, using the landmarks as reference points. We then simplified and smoothed each path’s geometry to avoid harsh angles that could hurt the legibility of the audio overlay, while keeping fidelity to the original route. Panels 1 and 2 in Figure 5 (from left to right) show the resulting digitized path and the simplified path, respectively. Once smoothed, we divided the path into equal-length segments, one per audio clip in the sample. From each segment, 50 points were drawn along a line perpendicular to it, totaling 150 meters across; this roughly approximates what you might hear immediately around you when walking in the city (see panel 3 in Figure 5, and the code sketch following the figure). Finally, we parsed the audio data into sound levels at 50 pitch ranges and joined those to the point grid. We followed a similar procedure to merge the classified sound categories from our machine learning predictions onto the point grid. The final output was line data (origin XYZ to destination XYZ) tagged with its location in the path (a unique id), its loudness level, and its classification from our model (Human Activity, City Sounds, Birds, Sirens).

Figure 5: Maps illustrating the construction of the spatial audio map geometry using the walked path in Central Park, NYC
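The tracing and smoothing above was done in QGIS/PyQGIS; below is a library-swapped sketch of the cross-section step using Shapely and NumPy. The simplification tolerance and function names are hypothetical, and the math assumes a metric (projected) coordinate system.

```python
# A hedged sketch of the perpendicular cross-section step, using Shapely
# instead of PyQGIS; tolerance and spacing values are illustrative.
import numpy as np
from shapely.geometry import LineString, Point

def cross_section_points(path: LineString, n_segments: int,
                         n_points: int = 50, width_m: float = 150.0):
    """For each segment midpoint, place n_points along a line perpendicular
    to the path, spanning width_m in total (assumes a metric CRS)."""
    smoothed = path.simplify(tolerance=5.0)  # drop harsh angles (illustrative)
    step = smoothed.length / n_segments
    grid = []
    for i in range(n_segments):
        d = (i + 0.5) * step  # midpoint of segment i along the path
        p = smoothed.interpolate(d)
        # Local direction estimated from two nearby points on the line
        a = smoothed.interpolate(max(d - 1.0, 0.0))
        b = smoothed.interpolate(min(d + 1.0, smoothed.length))
        dx, dy = b.x - a.x, b.y - a.y
        norm = float(np.hypot(dx, dy)) or 1.0
        nx, ny = -dy / norm, dx / norm  # unit normal to the path
        offsets = np.linspace(-width_m / 2, width_m / 2, n_points)
        grid.append([Point(p.x + nx * o, p.y + ny * o) for o in offsets])
    return grid  # n_segments rows of n_points cross-section points each
```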

Once the data was processed, we began work on the front end. Given the visual complexity and performance required, we opted for a WebGL-enabled library, Deck.GL. Deck.GL’s built-in layers allowed us to keep much of the computational cost in pre-processing while keeping the data files manageable in size. This approach lets the front end build a 3D wireframe of the data, play the sounds attached to each part of the trip, and filter for different sounds (a hedged sketch of the rendering idea follows below). The final website uses a scrolly-telling introduction built with Scrollama (a JavaScript library for handling scroll steps), Recharts (a React charting library built on D3), and Material-UI (a front-end component library), to streamline development.
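The production site is written in JavaScript; as a Python-flavored illustration of the same idea, the sketch below renders origin-to-destination line data with pydeck, Deck.GL's Python bindings. The field names, coordinates, and styling are illustrative.

```python
# A hedged pydeck sketch of the sound-mesh rendering; the real site uses
# Deck.GL in JavaScript, and all field names here are illustrative.
import pandas as pd
import pydeck as pdk

# One row per mesh line: origin/destination XYZ plus loudness and category
lines = pd.DataFrame({
    "source": [[-73.9712, 40.7831, 0]],
    "target": [[-73.9705, 40.7835, 40]],  # z could encode loudness
    "loudness_db": [62.0],
    "category": ["Birds"],
})

layer = pdk.Layer(
    "LineLayer",
    data=lines,
    get_source_position="source",
    get_target_position="target",
    get_width=2,
    get_color=[255, 140, 0],  # in practice, mapped from loudness/category
    pickable=True,
)
view = pdk.ViewState(latitude=40.7831, longitude=-73.9712, zoom=14, pitch=50)
pdk.Deck(layers=[layer], initial_view_state=view).to_html("sound_mesh.html")
```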

What Sounds Changed?

Overall, our analysis points to an increase in detections of birdsong and a decrease in human-generated sounds, such as passing cars and construction work. This is particularly salient in cities like London and New York, where nature sounds increased by 29% on average and human voices decreased by 26% on average (see Figure 6). In contrast, Singapore’s Marina Esplanade shows an increase in talking and singing, while city sounds such as passing vehicles and street work fell by more than half. This complements work documenting a decline in noise pollution as a result of the sharp reduction in human activity (Corbera et al., 2020; Zambrano-Monserrate et al., 2020; Paital, 2020). Milan likewise experienced an increase in human activity-related sounds, alongside a drop in sounds such as passing trucks and dog barks. In San Francisco, levels of natural sounds remained fairly constant, but city noise increased by around 40%, replacing human activity. Sirens decreased most in London (7.7%), although the decline was subtle across all parks.

Figure 6: Graphs showing the change in classified sounds across parks

Future Work

City residents around the world have experienced unprecedented changes in how urban environments operate. Sonic Cities aims to complement existing work documenting the importance of studying sounds in other urban spaces such as building facades (Calleri et al., 2018), sidewalks (Bello et al., 2019), and streets (Aiello et al., 2016). Future work on urban parks could increase the number of parks represented and further refine the audio processing and deep learning model. In particular, understanding how nature has re-emerged in metropolitan areas outside Europe and North America would yield a richer picture: São Paulo, Mumbai, Mexico City, Istanbul, and Moscow are at the top of our list.

By focusing on urban sounds, Sonic Cities can help us think more broadly about how urban activity impacts our daily life, and the ubiquitous sounds we hear in our cities. COVID-19 provides an opportunity to explore the presence and dynamics of the sounds we hear and how these new soundscapes affect people’s experiences of parks in the city. It’s on us to decide what sounds we value when cities go back to normal.

All authors are researchers at MIT Senseable City Lab

[1] https://www.google.com/covid19/mobility/

[2] See the visualization section for a more detailed description of the tracing process.

[3] Our approach builds on a long tradition of using walks to survey the soundscape. Such approaches have traditionally relied on human perception to audit the environment; here we leverage sound recordings and combine them with machine learning techniques to interpret the soundscape. See Adams et al. (2008) for a detailed description of the soundwalk methodology.

[4] For some walks it was not possible to repeat the walk on the same day because of weather conditions. In these cases, volunteers recorded the second walk within one or two days.

[5] The Librosa library converts the sampling rate to 22.05 kHz and normalizes the signal so that the sample values range between -1 and 1. Our decibel output was determined by the audio file’s data range, in our case ±32767 for 16-bit PCM WAV files.

[6] The model runs on a TensorFlow backend via the Keras API.

[7] See here for complete documentation of the machine learning process, and here for more information on urban sound classification.
