Rethinking ‘distance’ in New York City

by Will Shapiro & Mahir Yavuz

Understanding cities — their neighborhoods, zones, dynamics, and relationships — has a long history in architecture, urban studies and urban planning. Trailblazing thinkers from Dinocrates to Jane Jacobs developed different paradigms for how to understand and shape the patterns of urban life.

More recently, we have seen the development of data-driven approaches to urbanism, from projects such as Livehoods, Inequaligram, and IQuantNY. This inspiring work reveals that marrying data-driven thinking with urban theory can radically reshape the way businesses, governments, and citizens understand, improve and transform cities in the future.

We started Topos earlier this year to further advance the understanding of cities and neighborhoods in light of the technological sea-change underway in the realms of Big Data and Artificial Intelligence. Pulling from a wide variety of technologies and disciplines — computer vision, natural language processing, network science, machine learning, statistics, topology, urbanism, data visualization and information design — we are creating a transformative, globally scalable platform for understanding urban life and the culture of neighborhoods.


Many Kinds of Distance

At the heart of this endeavor is a simple but profound question: what does distance mean in the 21st century? At some point in history, geographical proximity may have implied cultural proximity, but this is increasingly being called into question, as can be seen in everything from voting patterns to the distribution of third-wave coffee shops across the United States. We intuitively felt that Bushwick, Brooklyn has more to do with Silver Lake, Los Angeles than with any neighborhood in Albany, New York, despite the 3,000 miles in between. From this starting point we constructed a suite of metrics about neighborhoods encompassing everything from topological analysis of the built environment to, yes, the ratio of third-to-first wave coffee shops. In a sense, we’ve developed a ‘psychographics of neighborhoods’, going beyond more familiar demographic viewpoints to capture the personality of a place, and what it feels like to be there. We have used these metrics to arrive at a new understanding (several, really) of how neighborhoods relate to one another.

From a machine learning (ML) point of view, having a strong definition of distance between entities (or, better yet, multiple definitions of distance) can be a powerful tool. Among other things, a mathematically defined distance makes it possible to apply clustering algorithms to a set of entities, grouping them together in different ways.
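To make the idea of 'multiple definitions of distance' concrete, here is a minimal sketch that ranks the same three places under two different metrics. Everything in it is an assumption made for illustration: the place labels, the approximate coordinates, and the three-number feature vectors are invented, and they are not the metrics we actually compute.

```python
import math

# Hypothetical places. Coordinates are approximate and the feature vectors
# are made up; none of these values come from real measurements.
places = {
    "A": {"coords": (40.70, -73.92), "features": [0.90, 0.80, 0.10]},
    "B": {"coords": (34.09, -118.27), "features": [0.85, 0.75, 0.15]},
    "C": {"coords": (42.65, -73.76), "features": [0.20, 0.10, 0.90]},
}

def geographic_distance(a, b):
    """Great-circle (haversine) distance in miles between two places."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a["coords"], *b["coords"]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))  # 3958.8 = Earth radius in miles

def feature_distance(a, b):
    """Euclidean distance between the two places' feature vectors."""
    return math.dist(a["features"], b["features"])

def nearest(name, metric):
    """Return the other place that is closest to `name` under `metric`."""
    others = [n for n in places if n != name]
    return min(others, key=lambda n: metric(places[name], places[n]))

nearest_geo = nearest("A", geographic_distance)
nearest_feat = nearest("A", feature_distance)
print(nearest_geo, nearest_feat)
```

With these invented numbers, the place that is nearest to A on the map is not the place that is nearest to A in feature space, which is the kind of disagreement between metrics that makes having more than one definition of distance useful.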

Three different neighborhoods in New York City. We use computer vision to differentiate various urban typologies.

We decided to test out some of our distance metrics by performing hierarchical clustering on the residential neighborhoods[1] of New York City. We computed three different clusterings: the first looked solely at features related to the built environment, the second looked solely at features related to culture (such as coffee shop vibes) and the third combined the two approaches. We decided to exclude any explicit demographic or economic indicators from the clustering input, but found some very interesting relationships between the resulting clusters and these indicators (see the first two insights below).

[1]: In this instance, we made the problematic assumption that zipcode = neighborhood. Our next blog post directly challenges this assumption :)
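The three clusterings described above can be sketched in a few lines. This is a toy version under stated assumptions, not our pipeline: the zipcode labels (z1 to z4), the feature values, the choice of Euclidean distance, and the choice of average linkage are all hypothetical, and a real run would at least rescale the features before combining them.

```python
import math
from itertools import combinations

def average_linkage(c1, c2, vectors):
    """Mean pairwise Euclidean distance between members of two clusters."""
    total = sum(math.dist(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(vectors):
    """Bottom-up hierarchical clustering; returns the merges in order."""
    clusters = [frozenset([name]) for name in vectors]
    merges = []
    while len(clusters) > 1:
        # Find the two closest clusters and fuse them into one.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], vectors))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b)))
    return merges

def select(data, groups):
    """Concatenate the chosen feature groups into one vector per item."""
    return {name: [x for g in groups for x in feats[g]]
            for name, feats in data.items()}

# Made-up feature values for four hypothetical zipcodes.
toy = {
    "z1": {"built": [0.0, 0.0], "culture": [0.0]},
    "z2": {"built": [0.1, 0.0], "culture": [3.0]},
    "z3": {"built": [5.0, 5.0], "culture": [0.2]},
    "z4": {"built": [5.0, 5.2], "culture": [6.0]},
}

built_merges = agglomerate(select(toy, ["built"]))
culture_merges = agglomerate(select(toy, ["culture"]))
combined_merges = agglomerate(select(toy, ["built", "culture"]))
print(built_merges[0], culture_merges[0], combined_merges[0])
```

In this toy data the first merge differs depending on which feature group is used, which is why it is worth looking at the built environment and culture separately as well as together.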

Left: All levels of NYC clustering dendrogram. Right: The first 100 levels of NYC clustering on the map.

The Biggest Split

The most fundamental division produced by our clustering algorithm groups together most of Manhattan — excluding Harlem — with the areas of Brooklyn closest to downtown Manhattan.

The biggest split: Manhattan (without Harlem) and the areas of Brooklyn closest to downtown Manhattan.

Despite the fact that we didn’t include any explicit economic indicators in our input data, comparing these two clusters reveals some radical economic differences: the average house value of cluster 1 ($818,318) is roughly 1.7 times that of cluster 2 ($488,513), and the median household income of cluster 1 ($91,679) is roughly 1.6 times that of cluster 2 ($57,192).

In contrast, the average age of cluster 1 (36.8) is nearly identical to that of cluster 2 (36.1).
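The size of these gaps can be checked directly from the figures quoted above. The snippet below only divides the published numbers; the variable names are ours.

```python
# Figures quoted in the text for the two clusters of the biggest split.
house_value = {"cluster 1": 818_318, "cluster 2": 488_513}
household_income = {"cluster 1": 91_679, "cluster 2": 57_192}
average_age = {"cluster 1": 36.8, "cluster 2": 36.1}

def ratio(stat):
    """How many times larger the cluster 1 figure is than the cluster 2 figure."""
    return stat["cluster 1"] / stat["cluster 2"]

value_ratio = ratio(house_value)
income_ratio = ratio(household_income)
age_ratio = ratio(average_age)

print(f"house value ratio:      {value_ratio:.2f}")   # 1.68
print(f"household income ratio: {income_ratio:.2f}")  # 1.60
print(f"average age ratio:      {age_ratio:.2f}")     # 1.02
```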


The Closest Connection

Based on our overall similarity metric, the two most closely related neighborhoods in NYC are Howard Beach, Queens and Seaside[2], Staten Island.

Top: Howard Beach and Seaside are separated by 25 miles. Left: Howard Beach, Queens. Right: Seaside, Staten Island (Satellite Photos © Google)

These two neighborhoods are about as far apart as two New York City neighborhoods can be geographically (it takes 7 hours to walk the 25 miles that separate them), yet they have much in common, from their waterfront parks to the range of cuisines available (To Do: Linguine and Clam showdown between Vetro on the Bay in Howard Beach and Giovanni’s in Seaside?).

Again, there is an interesting relationship between the economies of these waterfront neighborhoods, despite the absence of any explicit economic input data: the average house values in the two neighborhoods are remarkably similar ($534,700 for Howard Beach vs $491,800 for Seaside).

[2]: Seaside is the largest neighborhood grouping within the 10312 zipcode, and we use it to refer to the 10312. However, the fact that Seaside is contained in 10312 but not equal to 10312 demonstrates one of the challenges inherent in looking at zipcodes. Tune in next month for more on this contentious topic.


Riverside South (Trump Place) Stands Apart

Riverside South, which runs along Manhattan’s westside from 59th to 72nd streets, is separated both culturally and formally from its surroundings.

Left: Riverside South is an urban island. Right: Aerial photo of Riverside South (© Big City Aerial Photography)

This largely residential area — complex might be a more appropriate word — has spurred controversy ever since Donald Trump proposed converting what was once a rail yard into a series of high-rise luxury apartment buildings in the 70s (most recently, tenants demanded that ‘Trump’ be dropped from all building signage).

As mentioned above, we constructed clusterings using, on the one hand, only formal analysis of the built environment, and on the other hand, only metrics related to the culture of the neighborhood; in both cases, Riverside South was an isolated island within the sea of surrounding westside neighborhoods, all of which were grouped into a different cluster. Whether it’s the complete lack of nightlife or the spatial experience of walking through a canyon of nearly identical residential towers, Riverside South is radically different from the vibrant westside neighborhoods it is surrounded by.


Williamsburg, Manhattan

Culturally, Williamsburg has more in common with lower Manhattan than it does with the rest of Brooklyn.

Even when looking at nearby (and oft-compared) Greenpoint and Bushwick, the cultural experience (whether that means running into a yoga studio or a Michelin-starred restaurant) of Williamsburg resonates more strongly with the chunk of Manhattan just across the East River.


Left and Center: Cast Iron Buildings in Soho. Right: Soho separates culturally from surrounding Lower Manhattan.

Soho is Special

In addition to the unique cast iron architecture we all know and love (The SoHo Cast Iron Historic District was declared a National Historic Landmark in 1978), Soho is set apart culturally from the rest of Lower Manhattan, and NYC more broadly. Shopping, eating, going out, being vegan, looking at art…it’s a different story in Soho. Perhaps it’s unsurprising, then, that living next to Prada, Louis Vuitton and Balenciaga also comes with a different price tag: apartment prices in Soho are 30% higher than those in surrounding Lower Manhattan.


Dense, Gridded Manhattan

Examining the dendrogram of clusters produced from a purely formal analysis of the built environment of NYC reveals a large, persistent cluster of gridded, dense Manhattan neighborhoods.

Animations of the first 23 levels of the built environment clustering, with dense, gridded Manhattan shown in red. Left: Dendrogram. Right: Map view

Comprising Midtown, the Upper East Side (up to Yorkville), Chelsea, Murray Hill, Greenwich Village, the East Village and Soho, this cluster is defined by highly regular, uniformly gridded space. Equally interesting are the neighborhoods that are excluded from this grouping, like the West Village, with its labyrinthine streets at skew angles to the rest of the city, or Hell’s Kitchen, which falls to pieces at the entrance to the Lincoln Tunnel.

Left: Dense, uniformly gridded Manhattan near Union Square. Center: The labyrinthine streets of Manhattan’s West Village. Right: The entrance to the Lincoln Tunnel

Manhattan is a Very Diverse Island

Examining the dendrogram starting from the bottom, where all neighborhoods are in their own cluster, we find that neighborhoods in Manhattan remain isolated for much longer than neighborhoods in other boroughs. This implies that while other neighborhoods (like Howard Beach and Seaside mentioned above) join groupings, neighborhoods in Manhattan are more distinct, remaining separated both from one another and from the rest of New York City.

While the majority of neighborhoods in other boroughs are joining clusters, neighborhoods in Manhattan remain distinct.

More specifically, it takes 40 levels of clusters forming for any neighborhood in Manhattan to be connected to another neighborhood; in this case, Southern Harlem joins Manhattanville. Meanwhile, throughout the rest of NYC, 73 neighborhoods have been grouped together, covering a majority of the zipcodes in the Bronx (52%), Brooklyn (51%), Queens (54%) and Staten Island (67%). It takes another 17 levels of clustering for the next Manhattan neighborhood to be connected, when Hamilton Heights joins Prospect Lefferts Gardens.
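The 'levels' counted here are simply the order in which merges happen as the dendrogram is built from the bottom up. A minimal sketch of that bookkeeping follows; the merge list, the item names, and the two group labels are hypothetical and far smaller than the real dendrogram.

```python
def first_join_levels(merges):
    """Map each item to the 1-based level at which it first joins a cluster."""
    levels = {}
    for level, (left, right) in enumerate(merges, start=1):
        for item in (*left, *right):
            levels.setdefault(item, level)  # keep only the first appearance
    return levels

def earliest_level(levels, group, group_of):
    """Level at which any member of `group` is first connected."""
    return min(lv for item, lv in levels.items() if group_of[item] == group)

def joined_before(levels, level, group, group_of):
    """Count members of `group` already connected before `level`."""
    return sum(1 for item, lv in levels.items()
               if group_of[item] == group and lv < level)

# Hypothetical merge order: each entry fuses the two listed clusters.
toy_merges = [
    (["q1"], ["q2"]),
    (["q3"], ["q1", "q2"]),
    (["m1"], ["m2"]),
    (["m3"], ["q1", "q2", "q3"]),
]
group_of = {"q1": "outer", "q2": "outer", "q3": "outer",
            "m1": "core", "m2": "core", "m3": "core"}

levels = first_join_levels(toy_merges)
core_first = earliest_level(levels, "core", group_of)
outer_done = joined_before(levels, core_first, "outer", group_of)
print(core_first, outer_done)
```

In this toy example the 'core' items stay isolated until level 3, by which point all three 'outer' items have already been grouped, mirroring on a small scale the pattern described for Manhattan.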


This post is the first of an ongoing series capturing different insights we generate while developing our platform. We would love to hear your feedback.

Topos: Transforming the way we understand cities with Artificial Intelligence.