Pricing Homes like Agents Do: AI for Real Estate CMA Adjustments

Published in Compass True North, Dec 16, 2021

George Valkanas, Eda Kaplan, Panos Ipeirotis & Foster Provost

[Thanks to Compass’s NYC_AI team!]

Homeowners selling their homes need to choose a listing price. The listing price affects the final selling price, how long the home spends on the market, the volume of interest in the house, and anchors price negotiations with buyers. However, most homeowners are not experts in selling residential real estate, so they engage real estate agents to assist them with pricing strategy (among many other things).

Even the availability of AI-pricing strategies has not reduced the need for human assessors. On the contrary, there is considerable evidence that humans add essential information to sale-price estimates. For example, popular algorithmic estimates perform differently depending on whether they take the list price into account (compare the reported performance of “active listing” estimation vs. “off-market” estimation). Not surprisingly, when they take the list price into account, algorithmic estimates are substantially more accurate. (While we might attribute this at least in part to price anchoring, setting a list price is part of the standard selling process for almost every home, and therefore, it must be taken into account when predicting sale price.)

Home Valuation, Pricing, and Appraisal

This post is not about automatic home-price estimation, though. Instead, we focus on combining human intelligence and artificial intelligence for home pricing. At the risk of repeating: the goal is not to predict the selling price but to help the agents and their clients to set an effective list price.

What is Comparative Market Analysis?

By far, the most common approach for residential home valuation is the Comparative Market Analysis or CMA. A CMA comprises a collection of recently sold “comparable” homes (“comps”) that, taken in aggregate, can provide a view on the value distribution of the appraised home (the “subject” home).

The CMA provides a solid basis for pricing discussions between homeowners and agents: The discussion revolves around how much buyers were willing to pay for comparable homes and the salient differences between the comps and the subject home.

This post describes how we use AI to support the quick and accurate creation of high-quality CMAs. We focus here on adjusting the comps to give the most information on the subject home’s value. The ultimate goal of this work is for every agent who uses the Compass platform to create top-notch CMAs quickly, either improving over the CMAs that they would otherwise have made or reducing the amount of time they spend on creating CMAs.

Home Valuation vs. Home Pricing

Pricing homes is complicated for several reasons.

Each home is unique — there is no other exactly like it. For example, consider simply this: each house has a unique location — and what are the three most important factors in real estate? Location, location, location.
Homes have different sizes, room compositions, garage space, amenities like a pool, etc.
Even for agents with experience and expertise, valuing a home is time-consuming and requires research and keen attention to detail.
Despite what you might infer from popular home value estimates, a home does not have a specific dollar value. Instead, a home has a value distribution as different potential buyers place different values on the various home features. The eventual selling price is a function of this value distribution and the specific individuals who consider the home.

Because of this, as part of a home pricing strategy, we should try to generate a distribution of values, which can then be taken into account when setting a single list price. Thus, the CMA comprises data points showing the value some buyers placed on each comparable home. (Comps can also include active listings, for which we do not have a sold price yet; they are typically less helpful than actual sales.)

Adjusting for Differences between the Subject Property and Comps

The major challenge in creating a maximally informative CMA is that even carefully selected comparable homes are not identical to the subject home. They will have different sizes, room compositions, amenities, locations, etc. Furthermore, each comp sold at a different time, and even if the sale is recent, the real estate market can change fast.

Therefore, an integral part of creating an informative CMA is to “adjust” the comps to reflect the information they provide on the value of the subject property.

Our goal is to compare the houses in the CMA, identify their differences with the subject property, and provide a price adjustment for each such difference.

For example, consider the following scenario: a comparable property is an identical apartment in the same building, sold recently for $550K, located two floors higher. Typically people pay $25K for each extra floor in this area. So the adjustment, in this case, is -$50K, and after this adjustment, the original $550K price of the comparable becomes $500K when used to price the subject property.
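The arithmetic in this scenario can be written out as a minimal sketch. The function name and the specific floor numbers are made up for illustration; only the two-floor gap, the $25K per floor, and the $550K sale price come from the example.

```python
def floor_adjustment(subject_floor: int, comp_floor: int, price_per_floor: float) -> float:
    """Dollar adjustment to apply to the comp's sold price.

    Negative when the comp sits higher than the subject: the comp is worth
    more on this feature, so its price is marked down.
    """
    return (subject_floor - comp_floor) * price_per_floor


# Hypothetical floors 8 and 10; only the two-floor gap matters.
comp_sold_price = 550_000
adjustment = floor_adjustment(subject_floor=8, comp_floor=10, price_per_floor=25_000)
adjusted_price = comp_sold_price + adjustment

print(adjustment)      # -50000
print(adjusted_price)  # 500000
```

The sign convention is the important part: adjustments are always expressed from the subject property's point of view, so a feature that makes the comp better than the subject lowers the comp's adjusted price.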

Although we are not creating home appraisals, we can draw on the appraisal literature, which provides deep, comprehensive knowledge of value adjustments.

AI Adjustments in Action

Let’s illustrate these adjustments by walking through an example where the AI recommends adjustments to some comps so that they better reflect the subject property’s value.

From the perspective of the actual methods and models, there are different data-driven models for each type of adjustment, so we will give some details about a few of those models as we go along.

Start with the Subject Property

So, a CMA starts with a subject property. Let’s start with 225 E. 34th Street, Unit 10F, in the Murray Hill neighborhood of Manhattan (in New York City). The location is marked using a star on this map; we also see a bunch of other homes in the vicinity:

We will present a case we have discussed previously, and so if you want more details and discussion, you can watch a companion video discussion from last year.

Highlighting the Differences

The next step in creating a CMA is to choose the comps that we will analyze. Ideally, comps are very similar properties, and usually, we would want them to be in the same neighborhood. (The Compass AI team previously wrote about AI systems for finding similar homes.) The other labeled spots on the map correspond to properties that have recently sold or are active listings.

So here is a view of the subject property (leftmost column) and four selected comps:

The top row shows the address of the properties and a peek into each comp (the subject property doesn’t have a picture yet, because it is not yet listed). Then, below each is a price and a list of different “features” of the property. The price is from when the comp most recently sold, or the list price if the home is currently on the market. The feature values for the subject property represent the values from the last time it sold.

If everything about these homes were identical, then the sold prices of the comps would represent samples of the values buyers were willing to pay for the home, which would be a close approximation of price points from the value distribution of the house.

The challenge in producing a high-quality CMA is that the subject property and the comps are not identical. Therefore, the most challenging analysis in the Comparative Market Analysis is to adjust the comps’ selling prices to estimate what the prices would have been if each comp were identical to the subject property.

Recommending Price Adjustments

Our AI-supported tool gives our agent recommendations, suggesting ways the comp differs and recommending a corresponding price adjustment. Let’s see that in action.

Let’s focus on the comp in the middle column, where we have highlighted some things in red. This apartment, in the same building and of approximately the same size, sold less than a year ago. It is a very attractive comp because being in the same building controls for many things. This home sold for $1,100,000. Is that an appropriate value to use to inform the pricing of the subject property?

To see the price adjustment recommendations, we would click on +Add Adjustments below the comp’s address. The result is an Adjustments box:

We see suggested adjustments that can be chosen or rejected. Choosing them will add them to the list of adjustments on the left, and the Total value will change accordingly.
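One way to picture this interaction is as a small data structure in which the system proposes adjustments and only the ones the agent accepts count toward the total. This is a hypothetical sketch, not the Compass implementation: the class and field names are invented, the "Total" is assumed to mean the sold price plus accepted adjustments, and the second suggestion is made up. The $1,100,000 price and the roughly -$51K Market Prices suggestion are the ones from this example.

```python
from dataclasses import dataclass, field


@dataclass
class Adjustment:
    label: str
    amount: float           # signed dollars; negative lowers the comp's price
    accepted: bool = False  # the agent decides, not the model


@dataclass
class CompAdjustments:
    sold_price: float
    suggestions: list[Adjustment] = field(default_factory=list)

    def accept(self, label: str) -> None:
        """Mark the suggestion with this label as chosen by the agent."""
        for adj in self.suggestions:
            if adj.label == label:
                adj.accepted = True

    @property
    def total(self) -> float:
        """Sold price plus every adjustment the agent has accepted."""
        return self.sold_price + sum(a.amount for a in self.suggestions if a.accepted)


comp = CompAdjustments(
    sold_price=1_100_000,
    suggestions=[
        Adjustment("Market Prices", -51_000),
        Adjustment("Hypothetical feature", 10_000),  # invented for illustration
    ],
)
comp.accept("Market Prices")
print(comp.total)  # 1049000
```

Keeping rejected suggestions in the list, rather than deleting them, mirrors the idea that the agent stays in charge and can revisit a choice later.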

Market Price Adjustment

Let’s focus on the first of these suggested adjustments: Market Prices. This comp sold more than six months ago, and the market has moved since then. Adjusting for market prices is crucial for any previously sold comp: the market is continually moving up and down due to factors including inflation, seasonality, and shocks to the system, such as COVID in this case. We should adjust the price of the comp to reflect that difference. That’s a data science problem.

To do that, we build local “price indexes,” similar to the price indexes you hear about in the news, except that we create them for finer localities. How do we build these models? Since we have many homes that sell at different periods, we have lots of training data. Using a repeat-sales model, we estimate how the price per square foot evolved in each locality. Then we can use the model to estimate the difference in average price per square foot in the locality between any two time points.
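To make the idea concrete, here is a minimal sketch of a textbook repeat-sales regression for a single locality, using NumPy. It is an assumption about the general technique, not a description of the production models: each home that sold twice contributes one equation saying that the log ratio of its two prices equals the difference of the log index levels at the two sale periods. The function name and the synthetic data are invented.

```python
import numpy as np


def repeat_sales_index(pairs, n_periods):
    """Estimate a price index from homes that sold twice.

    pairs: iterable of (first_period, first_ppsf, second_period, second_ppsf)
    Returns an array `index` with index[0] == 1.0, so index[t] is the price
    level in period t relative to period 0.
    """
    pairs = list(pairs)
    X = np.zeros((len(pairs), n_periods - 1))  # period 0 is the baseline
    y = np.zeros(len(pairs))
    for row, (t1, p1, t2, p2) in enumerate(pairs):
        if t1 > 0:
            X[row, t1 - 1] = -1.0  # earlier sale enters with a minus sign
        if t2 > 0:
            X[row, t2 - 1] = 1.0   # later sale enters with a plus sign
        y[row] = np.log(p2 / p1)
    log_levels, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.exp(np.concatenate([[0.0], log_levels]))


# Synthetic repeat sales generated from a known index of [1.0, 1.10, 1.045].
pairs = [
    (0, 1000.0, 1, 1100.0),
    (1, 1200.0, 2, 1140.0),
    (0, 900.0, 2, 940.5),
]
index = repeat_sales_index(pairs, n_periods=3)
change = index[2] / index[1] - 1  # market move between period 1 and period 2
print(np.round(index, 3), round(change, 3))
```

On this synthetic data the regression recovers the index that generated it, and the ratio of two index levels gives the market change between any two periods.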

For this example, the model says that property prices have decreased by 4.6% in Murray Hill since the sale of this property. Thus, to make this comp comparable now, we would need to subtract $51K from the price it sold for back in April 2019.
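The arithmetic behind that number is a single multiplication. A minimal sketch, with a made-up function name and the figures from this example:

```python
def market_adjustment(sold_price: float, market_change: float) -> float:
    """Dollar adjustment for market movement since the comp sold.

    market_change is the fractional change in the local price index between
    the comp's sale date and today (e.g. -0.046 for a 4.6% decline).
    """
    return sold_price * market_change


adjustment = market_adjustment(1_100_000, -0.046)
print(round(adjustment))              # -50600, i.e. roughly -$51K
print(round(1_100_000 + adjustment))  # 1049400
```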

Note that temporal price adjustment is just one type of model that underlies the AI adjustments; our system has thousands of instances of this model for different fine-grained localities.

Building-Floor Adjustment

The following figure shows a different subject property and comp to illustrate some other important AI adjustments.

The subject property and comp property are again in midtown Manhattan. Check out what the system is calling out as an essential difference here: the comp is nine floors higher than the subject property. In Manhattan, apartment prices depend on the floor in the building because higher floors are less noisy and (all else being equal) tend to have better views. In this case, the model (learned from prior sales data) estimates that a nine-floor difference would result in a price that is $49K higher. Thus, to use this comp’s sale price to estimate the subject property’s value, we would need to subtract $49K.

This adjustment requires an entirely different underlying data science modeling effort from the first one we discussed. First, we need to figure out how best to model the differences in selling price by floor. It turns out that this varies by building. We have seen differences in the list prices of $50K per floor (which can likely be even more in new, ultra-luxury buildings). However, we cannot simply build one model per building because, for smaller buildings, we do not have sufficient prior sales to use as training data. Our solution is to aggregate locally for smaller buildings. Note also that we should take into account the temporal market adjustment for this modeling. To examine the differences in prices by floor, we need first to eliminate variation due to the apartments selling at different times — so we first should adjust the prices to the same time (see above).
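The fallback logic described here can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual model: the per-floor premium is taken to be the least-squares slope of price on floor, the `min_sales` threshold of 5 is arbitrary, and the record layout and function names are invented. Prices are assumed to be already adjusted to a common date. Pooling raw prices across buildings also ignores building-level price differences, which a real model would need to control for.

```python
from statistics import mean


def slope(points):
    """Least-squares slope of price on floor for (floor, price) points."""
    floors = [f for f, _ in points]
    prices = [p for _, p in points]
    f_bar, p_bar = mean(floors), mean(prices)
    var = sum((f - f_bar) ** 2 for f in floors)
    if var == 0:
        raise ValueError("need sales on at least two different floors")
    return sum((f - f_bar) * (p - p_bar) for f, p in points) / var


def per_floor_premium(sales, building, locality, min_sales=5):
    """Use the building's own sales if there are enough, else pool the locality.

    sales: list of dicts with keys building, locality, floor, price.
    """
    own = [(s["floor"], s["price"]) for s in sales if s["building"] == building]
    if len(own) >= min_sales:
        return slope(own)
    pooled = [(s["floor"], s["price"]) for s in sales if s["locality"] == locality]
    return slope(pooled)


# Synthetic data: building A has plenty of sales, buildings B and C do not.
sales = (
    [{"building": "A", "locality": "x", "floor": f, "price": 1_000_000 + 5_000 * f}
     for f in range(1, 7)]
    + [{"building": "B", "locality": "y", "floor": f, "price": 700_000 + 10_000 * f}
       for f in (2, 10)]
    + [{"building": "C", "locality": "y", "floor": f, "price": 700_000 + 10_000 * f}
       for f in (4, 6, 8)]
)
print(round(per_floor_premium(sales, "A", "x")))  # 5000, from building A alone
print(round(per_floor_premium(sales, "B", "y")))  # 10000, pooled over locality y
```

Multiplying the estimated premium by the floor difference between subject and comp then gives the dollar adjustment.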

Location, Location, Location

Ok, let’s look at another sort of adjustment. Here we revisit our first subject property on East 34th Street and as a comp choose a nearby property on East 40th Street. In this case, the system points out that even though these properties are pretty close to each other, the comp’s local area is significantly lower-priced: 13% less expensive.

The idea here is: what if the comp property were actually in the subject property’s location? If that were the case, our models estimate that its value would be $125K higher. Thus, to make it genuinely comparable, we should adjust the sale price of the comp upward by $125K (in addition to any other adjustments).
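One simple way to turn a difference in local price levels into a dollar adjustment is to rescale the comp's price by the ratio of the two areas' price per square foot. This is an assumption made for illustration; the post does not spell out the exact formula, and the inputs below are hypothetical, so the result is not meant to reproduce the $125K figure.

```python
def location_adjustment(comp_price: float, subject_area_ppsf: float, comp_area_ppsf: float) -> float:
    """Dollar adjustment that 'moves' the comp into the subject's location.

    Positive when the subject's area is more expensive than the comp's area.
    """
    return comp_price * (subject_area_ppsf / comp_area_ppsf - 1)


# Hypothetical inputs: the comp's area is 13% cheaper per square foot.
adjustment = location_adjustment(
    comp_price=1_000_000,
    subject_area_ppsf=2_000,
    comp_area_ppsf=1_740,
)
print(round(adjustment))  # 149425
```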

To illustrate more broadly, here is a map that shows the locally aggregated price per square foot (ppsf) of various locations in Manhattan:

Dark red corresponds to the highest prices in the heatmap and dark blue the lowest, relatively speaking (the colors would correspond to different prices in different regions). We see the highest prices in the neighborhoods we would expect: along Central Park, and in trendy neighborhoods like SoHo, TriBeCa, and the West Village.

We can examine our Murray Hill case a little more closely by zooming in:

The dotted line is East 34th Street, and our subject property is in the building indicated with the star. Our comp is just six blocks north (“Manhattan north”). As just mentioned, the colors correspond to a price per square foot (ppsf), with red being the highest and blue being the lowest (always relative to the area — even the blue ppsf here would seem very high outside of Manhattan). In this case, we see that although different blocks do have varying “attractiveness,” as proxied by ppsf, much of the difference in this area seems to come from the particular buildings. Such differences make sense, as buildings are of different ages, have various contractual constraints, and offer vastly different amenities. Our subject property happens to be in a very expensive building for this neighborhood. The comp property is in a building where similarly sized apartments tend to sell for lower prices.

The models underlying this suggested adjustment are spatial nearest-neighbor models for ppsf. To prepare the training data, we first adjust prior sales to a common date with the temporal models discussed already.
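As a rough illustration of what a spatial nearest-neighbor estimate looks like, the sketch below averages the ppsf of the k prior sales closest to a query point. The choice of k, the planar coordinates, and the unweighted average are all assumptions; the production models are not described at this level of detail.

```python
import math
from statistics import mean


def knn_ppsf(query_xy, sales, k=3):
    """Average ppsf of the k prior sales closest to query_xy.

    sales: list of ((x, y), ppsf), with coordinates in a planar projection
    (e.g. meters) and ppsf already adjusted to a common date.
    """
    nearest = sorted(sales, key=lambda s: math.dist(query_xy, s[0]))[:k]
    return mean(ppsf for _, ppsf in nearest)


# Synthetic sales forming two clusters with different price levels.
sales = [
    ((0, 0), 1500), ((10, 0), 1600), ((0, 10), 1400),
    ((500, 500), 1000), ((510, 500), 1100), ((500, 510), 900),
]
print(knn_ppsf((1, 1), sales))      # 1500
print(knn_ppsf((505, 505), sales))  # 1000
```

The ratio of the two local estimates is what tells us how much more or less expensive one location is than the other.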

Conclusion

We have illustrated a “human-in-charge” approach to AI-assisted creation of CMAs. The key idea is to make an agent’s life easier when they create a CMA by streamlining the following processes:

Suggest good comparable properties for a CMA
Highlight the important differences between the comparable and the subject property (the fewer, the better)
Suggest pricing for the differences

While the system suggests comparables and adjustments, the agent is in charge of which properties to use as comps and which adjustments are relevant and appropriate for their CMA report. Ultimately, the agent and the client decide the final pricing strategy, and the market provides final feedback by revealing (after the fact) the time-on-market and the final sale price. In the future, we will discuss how we use these signals to improve our CMA tool’s quality further.