ML APPLICATIONS

Designing the 15 Minute City with AirBnB Reviews and Semantic AI

How Semantic AI helps city designers move beyond objective data to draw insights from experiential data sources.

Rutvik Deshpande

Published in Digital Blue Foam · 7 min read · Mar 7, 2022

By Rutvik Deshpande + Sayjel Vijay Patel

The creation of all-knowing artificial general intelligence (AGI) [1], like J.A.R.V.I.S. from Iron Man, remains years away. However, great progress is being made in one aspect of AGI called Semantic AI [2], which focuses on consuming, interpreting, and classifying information shared by humans.

Semantic AI

Semantic AI unites two branches of computer science. The first is Natural Language Processing (NLP), an area of computer science that models how humans understand spoken words and text. The second is knowledge graphs (or semantic networks), a particular way of representing and storing information as interlinked entities.

At Digital Blue Foam (DBF), we are pioneering a new application of Semantic AI to help our users inform the design of the 15-minute city, a rising trend in urban planning that aims to place all necessary amenities and services within a 15-minute walk of every residence.

Towards Semantic Neighborhood Design

Recently, we introduced a web-based tool to help designers, planners, and developers to index how well a specific location performs as a 15-minute city with a Neighborhood Quality Score (NQS). For this project, the biggest challenge was to hunt down, collect, and verify location-specific data to feed the analysis.
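To make the 15-minute idea concrete, here is a minimal sketch of how one might check which amenity categories fall within a 15-minute walk of a location. It is not the NQS formula, which this post does not describe. The walking speed of 5 km/h, the straight-line (haversine) distance, the `coverage` function, and the coordinates are all assumptions chosen for illustration.

```python
import math

WALK_SPEED_KMH = 5.0      # assumed average walking speed
TIME_BUDGET_MIN = 15.0    # the "15-minute" threshold


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def walk_minutes(origin, destination):
    """Straight-line walking time in minutes (ignores the street network)."""
    return haversine_km(*origin, *destination) / WALK_SPEED_KMH * 60


def coverage(origin, amenities):
    """Fraction of amenity categories with at least one place within the budget."""
    reachable = {
        category
        for category, location in amenities
        if walk_minutes(origin, location) <= TIME_BUDGET_MIN
    }
    categories = {category for category, _ in amenities}
    return len(reachable) / len(categories) if categories else 0.0


# Hypothetical coordinates, for illustration only.
home = (1.3200, 103.8440)
amenities = [
    ("grocery", (1.3230, 103.8460)),
    ("park", (1.3260, 103.8400)),
    ("clinic", (1.3900, 103.9000)),
]
print(coverage(home, amenities))
```

In this toy example, the grocery and the park are within reach while the clinic is not, so two of the three categories are covered. A real analysis would use walking routes on the street network rather than straight-line distance.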

In the first version of the application, we made use of existing map APIs such as OpenStreetMap and Google. However, for a truly comprehensive solution, relying on these pre-cooked data sources presents some profound issues:

Datasets are often incomplete or not up to date
Datasets focus on objective criteria such as address and building shape
The information does not tell us what is important about a location (i.e., is it touristy, quiet, historic?)
Global information APIs can be cost-prohibitive

Case Studies: AirBnB and the 15 Minute City

Our team quickly recognized that we needed contextual data sources to improve the accuracy and potency of our Neighborhood Quality Index. While researching options, we came across a radical solution: you guessed it, Semantic AI! We determined that Semantic AI would allow us to extract neighborhood-specific keywords and summaries from alternative datasets such as social media reviews. This could help us evaluate neighborhood quality by letting us quickly collect qualitative data for a location.

Step 1: Hunting and Gathering Data

After defining the problem statement, the next step was to hunt for and gather relevant sources. Typically, this is a tedious, costly, and time-consuming process for urban designers. We explored several different APIs, such as Foursquare and Flickr, and ultimately chose Inside Airbnb, an openly available dataset of Airbnb listings and reviews. We chose it because the information Airbnb collects focuses on practical, location-specific insights that help customers rate, rank, and compare properties based on criteria such as proximity to public transport stations, noise levels, parks, and other important characteristics. In turn, user reviews can create a better perception of neighborhoods, districts, and even cities. An additional benefit is the global availability of these datasets, which are crowd-sourced from individual reviews and opinions.

**User Interface (UI) of Airbnb’s platform, showing the details of listings, which could help users summarize a neighborhood. Image source: Airbnb**

The dataset includes detailed listings and reviews in a tabular format and the neighborhoods of Singapore in GeoJSON format. Since our project focuses on mining insights from qualitative inputs, we extracted features like “Neighborhood Overview” and “Guest Review” for further analysis. These text-based features could even be correlated with numerical features, such as the number of reviews, price, and different review scores, to get an in-depth analysis of a particular location.
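The extraction step can be sketched roughly as follows: read the listings and reviews tables, then merge the overview and review text per neighborhood. This is an illustrative sketch using only Python's standard library and two tiny in-memory tables. The column names (`id`, `neighbourhood`, `neighborhood_overview`, `listing_id`, `comments`) and the sample rows are assumptions and should be checked against the actual download.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for the listings and reviews tables (made-up rows).
LISTINGS_CSV = """id,neighbourhood,neighborhood_overview
1,Novena,Quiet streets close to the MRT station.
2,Kallang,Lively area near the river and stadium.
"""

REVIEWS_CSV = """listing_id,comments
1,Very peaceful and a short walk to the train.
1,Great food court nearby.
2,Busy but fun with lots of places to eat.
"""


def merge_text_by_neighborhood(listings_file, reviews_file):
    """Return {neighborhood: merged overview and review text}."""
    listing_to_hood = {}
    texts = defaultdict(list)

    # Collect host-written overviews and remember each listing's neighborhood.
    for row in csv.DictReader(listings_file):
        listing_to_hood[row["id"]] = row["neighbourhood"]
        if row["neighborhood_overview"]:
            texts[row["neighbourhood"]].append(row["neighborhood_overview"])

    # Attach each guest review to the neighborhood of its listing.
    for row in csv.DictReader(reviews_file):
        hood = listing_to_hood.get(row["listing_id"])
        if hood and row["comments"]:
            texts[hood].append(row["comments"])

    return {hood: " ".join(parts) for hood, parts in texts.items()}


merged = merge_text_by_neighborhood(io.StringIO(LISTINGS_CSV), io.StringIO(REVIEWS_CSV))
for hood, text in merged.items():
    print(hood, "->", text)
```

With real files, the two `io.StringIO` objects would be replaced by opened CSV files. The merged text per neighborhood is the input to the summarization step.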

Step 2: Summarizing User Reviews

Text summarization is one of the use cases of Natural Language Processing. It refers to highlighting and framing the important information in a large chunk of text or a document. In this step, Airbnb’s user-generated reviews, combined with neighborhood overviews written by guests, travelers, and hosts for different neighborhoods of Singapore, were summarized into a short paragraph of 5–7 lines. This task was done using the Natural Language Toolkit (NLTK), an open-source collection of Python libraries, programs, and resources for building NLP models.

**Model pipeline showing the process of converting merged text data into a meaningful summary paragraph. Image by Author**

Step 3: Summarizing and Visualizing Reviews

In the last step, we concatenated the reviews for each neighborhood, then removed special characters, extra spaces, and stop-words. Next, we tokenized the text, which means splitting it into individual sentences and words that can be counted. We then measured the occurrence and frequency of keywords and passages within the reviews. Finally, we extracted the sentences that contain the most frequent keywords, along with the keywords themselves, to create a summary of the most common elements of a location. The resulting summary gives a more qualitative understanding of the neighborhood, in contrast to more objective map data.
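The steps above can be sketched as a simple frequency-based extractive summarizer. The original work used NLTK. To stay self-contained, this sketch swaps in regular expressions for NLTK's tokenizers and a small hand-written stop-word list, so it is an approximation of the idea rather than the actual pipeline. The function names, parameters, and sample reviews are hypothetical.

```python
import re
from collections import Counter

# Small hand-written stop-word list; NLTK ships a much fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "at", "but", "for", "from", "in", "is", "it",
    "of", "on", "the", "this", "to", "very", "was", "with",
}


def split_sentences(text):
    """Collapse extra spaces, then split naively after ., ! or ?."""
    text = re.sub(r"\s+", " ", text).strip()
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence):
    """Lowercase, drop special characters, and remove stop-words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize(text, max_sentences=2, max_keywords=3):
    """Return (summary, keywords) using simple word-frequency scoring."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    # Score each sentence by the summed frequency of its words.
    scores = {i: sum(freq[w] for w in tokenize(s)) for i, s in enumerate(sentences)}
    top = sorted(scores, key=lambda i: (-scores[i], i))[:max_sentences]
    summary = " ".join(sentences[i] for i in sorted(top))  # keep original order
    keywords = [w for w, _ in freq.most_common(max_keywords)]
    return summary, keywords


# Made-up review text, for illustration only.
reviews = (
    "The neighborhood is quiet and close to the train station.  "
    "Food stalls near the station are great! "
    "Our host was friendly. "
    "Quiet streets and great food make this neighborhood special."
)
summary, keywords = summarize(reviews)
print(summary)
print(keywords)
```

Summing word frequencies favors longer sentences. Dividing each score by the sentence length is a common refinement when reviews vary widely in length.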

**Kallang, the centrally located popular neighborhood, has the most Airbnb listings and reviews in Singapore. Image by Author**

**The summarized information about the Novena neighborhood, Singapore, displayed on the Map page of the web-based DBF tool. Image by Author**

How this Helps City Designers

Datasets like OpenStreetMap rely on crowd-sourced information; for some cities, this information can be sparse or out of date. Paid Geographic Information System (GIS) datasets, meanwhile, focus on numerical or geometric attributes, such as buildings and streets, but fail to recognize hotspots and popular places, something easily retrieved from social media, real estate listings, city guides, and review data. Semantic AI also unlocks new possibilities to study qualitative factors for a specific location, such as walkability, human perception, urban diversity, safety, serenity, cleanliness, and sense of place.

In this new data layer, user experience is translated into feedback and reviews for many other users, adding another dimension to urban analytics. We believe these insights will help city designers choose the right building type and function to enhance a particular location.

What’s Next?

This approach is not limited to Airbnb reviews; it can be expanded by combining more relevant datasets from different domains, as shown in Figure 5. These sources can add more information:

Information services (e.g., Wikipedia): could offer broader information such as nearby transit stations, the history of the neighborhood, and demographic information.
Property and real-estate platforms (e.g., Zillow, PropertyGuru, Mogul): would help track property value trends and assess the quality of listings in a particular neighborhood.
Places guides (e.g., Yelp, Tripadvisor, Foursquare, Google Places, and maps): can be used to gauge the popularity of amenities through ratings and reviews of restaurants and theaters.
Transit and movement (e.g., Grab, Uber): Uber released a public dataset covering billions of trips, with calculated travel times and prices. This could help users understand the actual time taken to reach different neighborhoods from a given location in hundreds of cities.
Social media (e.g., Reddit, Twitter): tweets and posts could eventually help us identify the hotspots in each locality and understand the correlation between emotions and public spaces.

**Geolocated text data (user reviews) can also be retrieved from various other open-source resources and platforms. Image by Author**

Conclusion

In this post, we focused on one example: using Airbnb reviews to inform 15-minute city design. However, the approach can be expanded to new applications within urban design and planning. For example, Google Street View Images (SVI) can be used to visually inspect and compare the composition of different architectural and natural elements, like streets, trees, and cars, within a particular neighborhood. Growing access to Semantic AI helps new users, like urban designers, get useful insights from previously impenetrable datasets, like social media reviews. These insights can help inform better decisions, especially at the neighborhood scale.

N O T E S

1 An Executive Primer on Artificial General Intelligence | McKinsey. https://www.mckinsey.com/business-functions/operations/our-insights/an-executive-primer-on-artificial-general-intelligence.

2 Why Machine Learning Needs Semantics, Not Just Statistics. Forbes, https://www.forbes.com/sites/kalevleetaru/2019/01/15/why-machine-learning-needs-semantics-not-just-statistics/.

3 Six Core Aspects of Semantic AI — DataScienceCentral.Com. Data Science Central, 14 May 2018, https://www.datasciencecentral.com/six-core-aspects-of-semantic-ai/.

S O U R C E S

Getting started with Google SVI API; https://developers.google.com/maps/documentation/javascript/streetview

Inside Airbnb. Adding Data to the Debate; http://insideairbnb.com/

Foursquare Check-ins Dataset; https://sites.google.com/site/yangdingqi/home/foursquare-dataset

Wikipedia Geosearch; https://www.mediawiki.org/wiki/API:Geosearch

Uber Movement Data; https://movement.uber.com/?lang=hi-IN

Zillow Housing Data; https://www.zillow.com/research/data/

About the Authors

Rutvik Deshpande is an ML Design Engineer at Digital Blue Foam. He aims to blend Artificial Intelligence with Architecture, and his work focuses on data-driven design workflows at both urban and architectural scales. Rutvik has co-led technical workshops at prestigious global conferences like CAADRIA and DigitalFutures.

Sayjel Vijay Patel is the co-founder and CTO of Digital Blue Foam and founding Assistant Professor at the Dubai Institute of Design and Innovation. He is a graduate of the MIT School of Architecture and Planning. Sayjel won the acclaimed Red-Dot Design Award for his research developing conceptual design software for the 3D printing industry.

About Digital Blue Foam

Digital Blue Foam (DBF) comprises an elite mix of designers and technologists from around the world who share a strong commitment to empowering a revolution in the architecture, engineering, and construction (AEC) industries toward carbon-negative projects by leveraging data-driven, AI-powered, collaborative, and sustainable approaches. We embrace collaboration and sponsorship, and we thrive on offering customized solutions that make designing a hassle-free and intuitive process. To learn more about Digital Blue Foam, visit our website.