Gaining Insights about Real Estate Properties From Online Reviews Using NLP

Or Hiltch
Oct 31, 2019

When assessing a real estate investment opportunity, the best story is often told by the property’s tenants.

When Skyline AI’s team underwrites an investment, one important factor is insight into the company managing the property.

Do tenants appreciate their service? Do tenants believe this property is a better product, and are they therefore willing to pay more for it? Is the property manager doing a good job of maximizing the property’s returns?

We believe that the most value is derived from a property by delivering a good product and a good tenant experience. A good deal for the investment manager needs to be a good deal for the tenants. If they like the property and appreciate the service, they will feel they are getting their money’s worth, and it only makes sense that a better product would cost more.

If many of a property’s tenants believe that a few key things need to be improved (for instance, certain in-unit amenities are missing), then it may be a good idea to invest capital in adding them.

For this reason, even though Skyline AI digests hundreds of different sources about the property’s financials, geospatial environment and more, an important part of our analysis revolves around the tenants’ opinions, and especially their sentiment.

That’s where analyzing online tenant reviews using NLP comes into play.

Extracting Insights with Natural Language Processing

Using NLP algorithms, we can parse the structure of text and extract useful insights from it. When analyzing user reviews, we are typically interested in:

  • Sentiment: we want to determine the emotional sentiment of a review. Sentiment can be positive, neutral, negative, or mixed.
  • Entities: named objects mentioned in the text, such as management companies, people, and places.
  • Key phrases: the phrases that carry the main points of a review. For example, analyzing a review about a roach problem might surface the roaches, the names of people on the property management team, the name of the property, and the fact that there is a problem.

Getting Property Reviews

There used to be many websites where tenants could rate properties. However, much as Google Maps restaurant reviews have displaced Yelp for many users, Google Maps has made most of those sites redundant: tenants appear to rate properties on Google Maps more than they do anywhere else.

Google Maps offers easy-to-use server-side and client-side APIs for retrieving reviews, in particular the Place Details API, which looks up a place by its Place ID.

https://maps.googleapis.com/maps/api/place/details/output?parameters
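
As a rough illustration of how such a request might be assembled in Python, the sketch below fills in the URL template above with `json` as the output format and asks for the `name`, `rating` and `reviews` fields. The helper names (`build_place_details_url`, `extract_reviews`, `fetch_reviews`) are hypothetical, and the code assumes you already have a Place ID and an API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PLACE_DETAILS_URL = "https://maps.googleapis.com/maps/api/place/details/json"


def build_place_details_url(place_id, api_key, fields=("name", "rating", "reviews")):
    """Build a Place Details request URL that asks for JSON output."""
    query = urlencode({
        "place_id": place_id,
        "fields": ",".join(fields),  # restrict the response to what we need
        "key": api_key,
    })
    return f"{PLACE_DETAILS_URL}?{query}"


def extract_reviews(payload):
    """Return (rating, text) pairs from a decoded Place Details response."""
    reviews = payload.get("result", {}).get("reviews", [])
    return [(review.get("rating"), review.get("text", "")) for review in reviews]


def fetch_reviews(place_id, api_key):
    """Call the API and return its reviews (performs a network request)."""
    with urlopen(build_place_details_url(place_id, api_key)) as response:
        return extract_reviews(json.load(response))
```

Keeping the URL construction and the parsing separate from the network call makes both easy to check without an API key.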

A response from the API looks like so (JSON format):

Determining the Emotional Sentiment of a Review

So, if the API returns a rating (“1” in the case of the JSON above), why do we need to analyze the sentiment ourselves?

Well, consider the following review (author name censored):

This is obviously a mistake — it’s a 0-star review that for some reason was assigned a 5-star rating.

If we trained a model on such wrong labels, it would end up thinking that calling an older woman “baby girl” on the phone is desirable. Enter NLP.

Amazon Comprehend

Image from https://aws.amazon.com/blogs/machine-learning/analyze-content-with-amazon-comprehend-and-amazon-sagemaker-notebooks/

Amazon Comprehend is a machine-learning-powered AWS service that makes it easy to find insights and relationships in text. It uses a pre-trained model to analyze text and extract insights from it. This model is continuously trained on a large body of text, so there is no need for you to provide training data unless you would like to use the service to train a custom model.

Using the AWS SDK for Python (Boto3), we can make a simple request to Comprehend to analyze the same review from before. We’ll use the detect_sentiment function:
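
A minimal sketch of such a request, assuming boto3 is installed and AWS credentials are configured. The helper names and the default region are illustrative choices, not taken from the original code.

```python
def make_comprehend_client(region_name="us-east-1"):
    """Create a Comprehend client; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper below can be exercised offline
    return boto3.client("comprehend", region_name=region_name)


def analyze_sentiment(client, text, language_code="en"):
    """Return (label, confidence) for a single review."""
    response = client.detect_sentiment(Text=text, LanguageCode=language_code)
    label = response["Sentiment"]        # POSITIVE, NEGATIVE, NEUTRAL or MIXED
    scores = response["SentimentScore"]  # keys: Positive, Negative, Neutral, Mixed
    return label, scores[label.capitalize()]


# Usage (requires AWS access):
#   client = make_comprehend_client()
#   label, confidence = analyze_sentiment(client, review_text)
```

Passing the client in as an argument keeps the function easy to test with a stub object.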

This is the response:

As we can see, Comprehend classified the text as having a Negative sentiment with more than 99% confidence. Pretty cool.
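
One way to put this to work is to compare the detected sentiment with the star rating and flag reviews where the two disagree, like the mislabeled 5-star review above. The function below is a sketch only: the name `is_suspicious_label` and the thresholds (4 stars and up, 2 stars and down, 0.9 confidence) are assumptions that would need tuning on real data.

```python
def is_suspicious_label(rating, sentiment, confidence, min_confidence=0.9):
    """Flag reviews whose star rating contradicts a confident sentiment label."""
    if confidence < min_confidence:
        return False  # the model is not sure enough to overrule the rating
    if sentiment == "NEGATIVE" and rating >= 4:
        return True   # glowing stars, angry text
    if sentiment == "POSITIVE" and rating <= 2:
        return True   # poor stars, happy text
    return False
```

Flagged reviews can then be relabeled or excluded before the ratings are used as training labels.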

Now, what entities are related to this negative sentiment by the tenant?

Insights About Key Entities And Phrases

Running the same code with the detect_key_phrases function instead yields the following:
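
A sketch of what that call might look like with boto3, alongside the related detect_entities call. The helper names and the 0.9 score cutoff are assumptions made for illustration, and `client` is a Comprehend client created with `boto3.client("comprehend")`.

```python
def extract_key_phrases(client, text, language_code="en", min_score=0.9):
    """Return the key phrases Comprehend is reasonably confident about."""
    response = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    return [
        phrase["Text"]
        for phrase in response["KeyPhrases"]
        if phrase["Score"] >= min_score
    ]


def extract_entities(client, text, language_code="en", min_score=0.9):
    """Return (type, text) pairs, e.g. ("ORGANIZATION", "Acme Management")."""
    response = client.detect_entities(Text=text, LanguageCode=language_code)
    return [
        (entity["Type"], entity["Text"])
        for entity in response["Entities"]
        if entity["Score"] >= min_score
    ]
```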

We can detect complaints about:

  • The office
  • The management
  • Power outages when it rains

All pretty useful for our analysis.

Summary

We have seen that, using a few APIs, we can acquire and analyze reviews to gain insights about tenant sentiment and to detect potential problems with the property.

Running the same type of analysis over hundreds of reviews could help us get a good understanding of property management companies, investment managers and the kind of issues tenants are sensitive to.
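
As a sketch of how this might scale, the snippet below tallies sentiment labels across many reviews using Comprehend's batch_detect_sentiment call. The function name `sentiment_counts` is hypothetical, and the batch size of 25 reflects the per-call document limit as documented at the time of writing, so treat it as an assumption to verify.

```python
from collections import Counter

BATCH_LIMIT = 25  # assumed per-call document limit for batch_detect_sentiment


def sentiment_counts(client, texts, language_code="en"):
    """Tally sentiment labels over many reviews, one batch at a time."""
    counts = Counter()
    for start in range(0, len(texts), BATCH_LIMIT):
        batch = texts[start:start + BATCH_LIMIT]
        response = client.batch_detect_sentiment(
            TextList=batch, LanguageCode=language_code
        )
        counts.update(item["Sentiment"] for item in response["ResultList"])
        errors = response.get("ErrorList", [])
        if errors:
            counts["ERROR"] += len(errors)  # documents Comprehend could not process
    return counts
```

The same loop could collect key phrases into a `Counter` to surface the issues tenants mention most often.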

These are very powerful tools for improving the quality of the property and for better understanding its value-add potential.

Plus, we can do all of this using state-of-the-art ML-powered NLP services like Amazon Comprehend, which can improve on a naive scheme that relies only on the rating system of a third party like Google Maps.

Or Hiltch

Founder, Skyline AI (acquired by JLL). Founder, StreamRail (acquired by ironSource, part of Unity)