Image by Nattanan Kanchanaprat from Pixabay

Real Estate Automated Valuation, our present or distant future?

10 min read · Dec 29, 2022

Different organizations in the real estate industry each have their own definition of an automated valuation model (AVM), so the safest way to define it is as an automated process that uses mathematical modeling techniques to provide real estate valuations. Of course, the definitions differ in their nuances depending on the type of real estate and on how the resulting valuation is applied, whether to a single property or to a large number of properties. And even though the aim is to make it autonomous in the sense that it doesn’t require human intervention, we are still far from it. The age of big data and cloud technologies has provided us with not only resources in terms of storage and processing power but also a plethora of tools for orchestrating the entire automated process. What is left is to produce well-structured data that yields the best results and to find a way to create synergy between real estate valuers and the AVM as a tool, since they are and should be complementary.

The idea of this and the next couple of articles is to look at the AVM from the perspective of data science and machine learning. Domain knowledge is always required in data science, so I will first go through some of the terminology, definitions, and approaches in real estate valuation. Again, not in depth, but enough for you to see what an AVM is supposed to do and the results it should produce. Another important thing to mention is how much data we can expect to have in various markets. After that, I shall go over the main types of modeling techniques and summarize their pros and cons. At the end, I would like to implement some of these on a test dataset and do some visualizations.

Real Estate Valuation

The whole valuation process is very complex in its nature, as it involves a lot of expertise in different domains. Let us, for the sake of explaining it in simple terms, start with a short story.

Say Bob wants to buy a car. He visits car dealerships and searches for the car of his preference. He tries to form an opinion on how much he has to pay for the car with the desired specifications. But it would be unwise to buy the first car he sees, so he tries to find many similar ones and then make a decision. In essence, he finds comparable evidence on which to base his opinion on what he can get with the money he is willing to invest, or how much money he will need for his desired car. But then he decides he doesn’t want to buy a car; he just wants to move closer to the beach, so he decides to sell his house. But what is the price Bob can ask for it? Well, to get a feel, he checks recent trades in his or neighboring streets, maybe recent trades in the same zip code area or the same municipality. The first scenario is that many houses have sold in the last few years, and some of them are very similar to his own, so he concludes that the price of his house should be similar. The second scenario is that there were no recent trades in his surroundings. Well, he remembers the price of the land that he bought and roughly the amount of money invested in building the house (really?). But this was a long time ago. This is not an easy job, and Bob may have to ask for assistance.

As you may have noticed, key terms in this short story are comparable and similar. These come in a package and will stubbornly follow us along our way. Also, apart from being humorous, this story touches on approaches to how real estate valuation can start. Now, let’s get serious for a bit.

From here on, I’ll touch upon terms, definitions, and descriptions from a couple of documents, specifically “Pricing to Market” by Nick French and the RICS guidance note “Comparable Evidence in Real Estate Valuation”. Some of those I’ll cite, and some I’ll paraphrase to make them easier to digest. The goal is to bring forward the essentials that will be useful in moving forward with the main goal, modeling.

So, without further ado, let’s bravely step into it with a few quotes from the above-mentioned documents.

A comparable can be defined as an item of information used during the valuation process as evidence to support the valuation of another, similar item. Comparable evidence comprises a range of relevant data used by the valuer to support a valuation.
Comparable evidence is at the heart of virtually all real estate valuations. The process of identifying, analyzing, and applying comparable evidence to the real estate to be valued is, therefore, fundamental to producing a sound valuation that can stand scrutiny from the client, the market, and, when necessary, the courts.
Valuation of any asset relies on the well-established economic principle of substitution. This states that the buyer of an item would not pay more for it than the cost of acquiring a satisfactory substitute.

from Comparable Evidence in Real Estate Valuation (RICS guidance note, 1st edition, October 2019).

The definition of “comparable evidence” is very pertinent when looking at its role in property valuation process. The European Valuation Standards (2016)(EVS) and other international valuation standards tend to use a narrow definition of “comparables” to refer to transactional evidence only. Yet, in practice, valuers use a wide range of comparable evidence (including asking prices, bid information, market indices, etc.) to help them determine the market value of the subject property.

from Pricing to Market (Nick French, June 2020).

This pretty much sums it up and is the place from where we can start. After reading this and rereading our short story, you may end up with a couple of questions:

What happens if we have few to no comparables, both in terms of similarity and recency?
What if we are not able to get all the information for comparables?
What if our comparable evidence is from before some economic and/or demographic change, or, for that matter, any other big change that potentially affects the market?

Valuers are able to address these and many more scenarios with various approaches. Three commonly used ones in real estate valuation are:

Market approach, in which comparable evidence is based on comparable property market transactions. It is used to value residential, rural, and commercial real estate, as well as land and other assets. For the sake of our goal, this is the approach we will focus on.
Income approach, where real estate is valued by how much income it generates for investors. There are more details here, but since the first approach is going to be the focus, I’ll just go on.
Cost approach, where the value is derived from the cost of the land and the depreciated cost of building the same or a similar property. Both of these can be obtained, again, through comparable evidence. This one is also not in our focus, so I will leave it for some other time.
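To make the three approaches a little more tangible, here is a minimal Python sketch of each one in its most simplified textbook form. The function names, the use of a median price per square metre, direct capitalization (annual net income divided by a capitalization rate), and a single depreciation fraction are my own illustrative assumptions, not formulas taken from the cited documents. Real valuations involve many more adjustments.

```python
from statistics import median


def market_approach(comparable_prices_per_sqm, subject_area_sqm):
    """Market approach (simplified): median price per square metre of the
    comparables, multiplied by the area of the subject property."""
    if not comparable_prices_per_sqm:
        raise ValueError("at least one comparable is required")
    return median(comparable_prices_per_sqm) * subject_area_sqm


def income_approach(annual_net_income, cap_rate):
    """Income approach (simplified direct capitalization): the annual net
    income divided by an assumed capitalization rate, e.g. 0.05 for 5%."""
    if cap_rate <= 0:
        raise ValueError("cap_rate must be positive")
    return annual_net_income / cap_rate


def cost_approach(land_value, replacement_cost, depreciation_fraction):
    """Cost approach (simplified): land value plus the cost of rebuilding,
    reduced by an assumed depreciation fraction between 0 and 1."""
    if not 0 <= depreciation_fraction <= 1:
        raise ValueError("depreciation_fraction must be between 0 and 1")
    return land_value + replacement_cost * (1 - depreciation_fraction)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(market_approach([2000, 2100, 2200], subject_area_sqm=100))
    print(income_approach(annual_net_income=12000, cap_rate=0.05))
    print(cost_approach(land_value=50000, replacement_cost=150000,
                        depreciation_fraction=0.2))
```

Note that only the first function depends directly on comparable transactions, which is why the market approach is the natural starting point for a data-driven AVM.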

Regardless of the approach used, market value is the estimated price of the real estate if the trade were to be completed on the day of valuation. If we focus on the market approach and comparable market evidence, we can certainly come to the conclusion that there are various types of information with different levels of detail.

Market evidence

The most important things about market evidence are its recency, relevance, and comprehensiveness. Let’s go through some of the sources that can be used:

Direct transactional evidence: here the valuer tries to get exact transaction information, as detailed as it can be. Getting this is really crucial, as it gives the best insight into that particular trade and the market at that point in time. Incentives that can exist in a transaction are very important too, since these can impact the total price of the real estate. These also affect rent, and in some markets, landlords are more keen to use various incentives that make face rent and effective rent different. This is very interesting, and it has to do with real estate that is valued using an income approach.
Publicly available information: information that is published by the government or any authoritative source. These sources can be useful, but they should be used with caution because specific essential parts of the transaction can be left out, for example, sales incentives. This information can possibly be published after a certain period of time, which makes it less useful. Remember, the most recent data is the most valuable.
Published databases: this is frequently data that has been aggregated and published by third parties. Therefore, it is good for analyzing market trends and gathering general information. Since it is aggregated, it can’t be used as a comparable for an individual property.
Asking prices: public listings with asking prices should be taken with caution too, since asking prices may differ from the real transaction prices. These could be used by a valuer to analyze market trends.
Historical evidence: all data used as comparables is, in some way, historic. Recent historical data is more valuable than older data. How old is too old depends on the market dynamics, but all historic transactions can be valuable in some way. Scenarios where valuers are asked to validate historic valuations are a good example of when historic comparables are useful.
Indices: As previously stated, market trends contribute to our analysis. Sometimes very basic models are created based on this, but since indices are created based on certain aggregations, they can’t be directly used as comparables. Indices can be a bit outdated since they are based on aggregations.
Automated valuation models (AVMs): believe it or not, AVMs are themselves mentioned as a source of information. They should be treated as complementary, since they can’t provide comparable information, i.e., the characteristics of individual properties, and therefore the valuer has to decide how to weight this particular source.
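The difference between face rent and effective rent mentioned under direct transactional evidence is easy to show with a small calculation. The sketch below is a simplification that I am assuming for illustration: it spreads rent-free months and any lump-sum incentive evenly over the lease term and ignores discounting, and the function and parameter names are hypothetical.

```python
def effective_monthly_rent(face_monthly_rent, lease_months,
                           rent_free_months=0, other_incentives=0.0):
    """Average rent actually paid per month once incentives are spread
    over the whole lease term (no discounting, illustration only)."""
    if lease_months <= 0:
        raise ValueError("lease_months must be positive")
    if not 0 <= rent_free_months <= lease_months:
        raise ValueError("rent_free_months must be within the lease term")
    total_paid = (face_monthly_rent * (lease_months - rent_free_months)
                  - other_incentives)
    return total_paid / lease_months


if __name__ == "__main__":
    # Hypothetical lease: face rent of 1000 per month for 12 months,
    # with 2 rent-free months offered as an incentive.
    print(effective_monthly_rent(1000, lease_months=12, rent_free_months=2))
```

A dataset that records only the face rent would overstate what the tenant really pays, which is one reason incentives matter when transaction data is collected.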

It’s worth noting that the information above should be examined thoroughly and double-checked before use. Also, different markets have different levels of data availability and transparency. And while availability and transparency may seem almost the same, they are not.

Shortfall of comparable evidence

Remember the scenario where Bob couldn’t find similar recent trades? Well, this can happen. For certain types of assets, or for real estate in certain markets, there may be no good comparable evidence. Such a market is categorized as “less active” or “inactive.” At the other extreme, a market can be so dynamic and change so rapidly that comparable evidence quickly becomes outdated. Lack of transparency, where the thorough details of transactions are not available, can also be an issue. This poses a big challenge since it makes the process of valuation more complex and time-consuming. There are some efforts to measure this level of transparency, like the JLL Global Real Estate Transparency Index, which is, as they say, “based on a combination of quantitative market data and information gathered through a survey of the global business network of JLL and LaSalle across 94 countries and 156 city markets”.

Top 10 ranked countries based on JLL Global Real Estate Transparency Index, Dec. 2022.

This gives us an insight into how transparent some markets are with regard to making real estate transaction data available. In less transparent markets, a valuer’s expertise and good judgement become more crucial in the process of valuation, and the valuation itself becomes more uncertain. Although stating this uncertainty can sound like something that lessens the valuer’s contribution and value, it is in fact necessary, since a value without some measure of certainty is just a number. Two different real estate properties could be valued the same, but we could be almost certain about one and not so sure about the other. So, some kind of confidence statement is required. This is even stated in the RICS guidance note (1st edition, October 2019, page 18):

‘valuers should not treat … a statement expressing less confidence in a valuation than usual as an admission of weakness … it is … a matter entirely proper for disclosure’. If client understands that unusual market conditions result in an uncertain valuation it may enable them to make a better-informed business decision.
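To illustrate how two properties can be valued the same while deserving very different levels of confidence, here is a small Python sketch. It uses the median of comparable prices as the estimate and the coefficient of variation (sample standard deviation divided by the mean) as a crude proxy for uncertainty. Both choices, and the sample prices, are assumptions of mine for illustration; this is not how RICS or any particular AVM defines confidence.

```python
from statistics import mean, median, stdev


def estimate_with_spread(comparable_prices):
    """Return (estimate, relative_spread) for a list of comparable prices.

    estimate        -- median of the comparable prices
    relative_spread -- sample standard deviation divided by the mean,
                       used here as a rough stand-in for uncertainty
    """
    if len(comparable_prices) < 2:
        raise ValueError("at least two comparables are needed for a spread")
    estimate = median(comparable_prices)
    relative_spread = stdev(comparable_prices) / mean(comparable_prices)
    return estimate, relative_spread


if __name__ == "__main__":
    # Hypothetical comparables for two different subject properties.
    tight_market = [198000, 200000, 202000]
    thin_market = [150000, 200000, 250000]
    print(estimate_with_spread(tight_market))
    print(estimate_with_spread(thin_market))
```

Both hypothetical properties end up with the same estimate, but the second one comes with a much wider spread, which is exactly the kind of information a valuation should disclose alongside the number.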

Conclusions

As data scientists and engineers in the realm of real estate, we should keep in mind a couple of things:

Transaction data is more valuable than public listing data because the asking price is unlikely to be the final price of the sale.
Recent transaction data is more valuable than older transaction data.
Similarity between real estate properties is a key point in how comparables are chosen. However, it is not easy to define similarity, even for a valuer, since a lot of information is used in the valuation process.
Some information, like the state of the real estate market or some peculiarities, is not always part of transaction information, and therefore strange variations in transaction prices can appear, which introduces “noise” in data.
Even government data, i.e., direct transactional evidence, can be “unclean” and needs restructuring and cleaning.
There is really not much you can do if the world economy changes market trends, e.g., the recent pandemic or wars, and what was considered comparable evidence is off and not that significant.
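The points about similarity and recency can be sketched in a few lines of Python. The example below weights each past sale by how similar it is to the subject property and by how recent it is, then takes a weighted average of the price per square metre. Everything specific here is a hypothetical choice of mine: the two features (area and number of rooms), the similarity formula, and the one-year half-life for recency. A real model would need far more features and properly tuned weights.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sale:
    area_sqm: float
    rooms: int
    price: float
    sold_on: date


def similarity(subject_area, subject_rooms, sale):
    """Score in (0, 1]; 1.0 means identical on the two features used here."""
    area_gap = abs(sale.area_sqm - subject_area) / subject_area
    room_gap = abs(sale.rooms - subject_rooms)
    return 1.0 / (1.0 + area_gap + 0.5 * room_gap)


def recency_weight(sale, valuation_date, half_life_days=365.0):
    """Exponential decay: a sale one half-life old counts half as much."""
    age_days = max((valuation_date - sale.sold_on).days, 0)
    return 0.5 ** (age_days / half_life_days)


def estimate_value(subject_area, subject_rooms, sales, valuation_date):
    """Weighted average price per square metre times the subject area."""
    if not sales:
        raise ValueError("at least one sale is required")
    weights = [
        similarity(subject_area, subject_rooms, s)
        * recency_weight(s, valuation_date)
        for s in sales
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("all comparables have zero weight")
    price_per_sqm = sum(
        w * s.price / s.area_sqm for w, s in zip(weights, sales)
    ) / total
    return price_per_sqm * subject_area


if __name__ == "__main__":
    # Hypothetical sales, for illustration only.
    sales = [
        Sale(100, 3, 200000, date(2022, 11, 1)),  # similar and recent
        Sale(60, 2, 150000, date(2022, 6, 15)),   # smaller, fewer rooms
        Sale(100, 3, 120000, date(2012, 5, 1)),   # similar but ten years old
    ]
    print(estimate_value(100, 3, sales, valuation_date=date(2022, 12, 29)))
```

The design choice worth noticing is that similarity and recency are multiplied, so a sale has to be both similar and recent to carry much weight, which mirrors how comparables are described in the sections above.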

From the perspective of a data scientist who works on implementing real estate automated valuation models, I can say there are more conclusions to be made, but I will address some in future articles.

In the next article, we will go into more technical things, describing different types of models that can be used for implementing AVMs and their pros and cons.

Until next time, keep your data sources relevant and your comparables as similar as you can!

Best!

References

Pricing to Market, An Investigation into the use of Comparable Evidence in Property Valuation — Nick French, June 2020
Comparable evidence in real estate valuation — RICS, 1st edition, October 2019