AI Replaces Human Appraisers stardate 2019.420
AI can finally consider all the data that matters, structured and unstructured. This becomes a threat to human appraiser job security.
All the data that matters:
What data actually matters for appraising a property? The number of bedrooms? The number of bathrooms? Absolutely. When it was built? Your neighborhood home prices? Those too. There is a long list of things to consider. It is a complicated process for humans, and not much about it has changed in decades.
Something human appraisers have struggled to consider is the set of unstructured elements on the property: your landscaping, the view from your porch, the color and style of your neighbor’s fence, the style of your kitchen, and so on. Many of these have been too “subjective” to influence your price estimate. Sure, gross quality issues (damaged flooring, etc.) can go into it, but your choice of tile for the backsplash? The type of rock in the yard? Never. If anything like that did influence the appraisal, it was merely the appraiser’s personal bias, not something they could quantify or measure.
Like an alien vacuum from space, AI can suck in ALL of the data now. Literally everything on the property, structured or unstructured; it doesn’t matter anymore. On top of that, it can consider a lot more properties than just the 3–5 reference properties typically used. I’m honestly amazed this role wasn’t replaced sooner based on 2014 technology. Sometimes, with industries that are slow to innovate, it takes an outsider to come in and shake things up.
Now in 2019, building hybrid models with structured and unstructured data has become simple enough that a single engineer can take it on. A problem that was once a multi-million-dollar science project is now an afternoon curiosity for an engineer. That means the bar finally got low enough for this problem to count as low-hanging fruit. Uh oh…
Reviewing properties in Utah, we have 20,000 properties for sale. Here they are, colored by price percentile (purple=cheap, yellow=$$). You can see the Park City cluster lighting up in all yellow; it is spendy if you want to live 5 min from a ski resort.
Here is a closer look at the Park City (Utah’s Aspen equivalent) cluster. You can even see my house in this view.
What Data Is Available?
Feeding in 1,003 categorical variables (e.g. home style, air conditioner type, zip code), 280 continuous variables (# of beds, # of bathrooms, etc.), and the unstructured data (images/text), we can build a holistic model. Then, for comparable properties, we can run a shotgun experiment and feed the beast with 1–50 similar properties to see what floats to the top. This adds 294 new continuous variables, bringing the grand total of structured variables to 1,577. These groups show comparable properties to the property of interest; some are close by, others are far away:
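As a rough illustration of the comparable-property features described above, here is a minimal sketch in plain Python. Everything in it is an assumption for the sake of the example: the field names, the toy listings, the per-field scales, and the idea of measuring "similar" as a scaled distance over a few numeric attributes. It is not how the platform picks comps; it only shows how summarizing the k most similar listings for several values of k turns into extra continuous columns.

```python
from statistics import mean, median

# Hypothetical listings; field names and values are illustrative only.
LISTINGS = [
    {"id": "a", "beds": 3, "baths": 2, "sqft": 1800, "price": 400_000},
    {"id": "b", "beds": 3, "baths": 2, "sqft": 1900, "price": 420_000},
    {"id": "c", "beds": 4, "baths": 3, "sqft": 2600, "price": 610_000},
    {"id": "d", "beds": 2, "baths": 1, "sqft": 1100, "price": 290_000},
    {"id": "e", "beds": 5, "baths": 4, "sqft": 4200, "price": 1_500_000},
]

# Assumed per-field scales so that no single attribute dominates the distance.
SCALES = {"beds": 1.0, "baths": 1.0, "sqft": 500.0}


def distance(p, q):
    """Scaled Euclidean distance between two listings (smaller = more similar)."""
    return sum(((p[f] - q[f]) / s) ** 2 for f, s in SCALES.items()) ** 0.5


def comp_features(target, listings, k_values=(1, 3)):
    """Summarize the prices of the k most similar other listings, for each k."""
    others = [p for p in listings if p["id"] != target["id"]]
    ranked = sorted(others, key=lambda p: distance(target, p))
    features = {}
    for k in k_values:
        prices = [p["price"] for p in ranked[:k]]
        if prices:
            features[f"comp{k}_mean_price"] = mean(prices)
            features[f"comp{k}_median_price"] = median(prices)
    return features


# Comp features for listing "a"; each key becomes one new continuous column.
features = comp_features(LISTINGS[0], LISTINGS)
print(features)
```

Running this over a range of k values (say 1 through 50) and a handful of summary statistics is one way a comp step could fan out into a few hundred new variables. Note that the target's own price is never included in its comps, which avoids leaking the answer into the features.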
Building A Model
Considering all of the data that matters, we can build a single model using Ziff’s AutoML platform, which automatically builds features from structured and unstructured data. This is more of a holistic problem, with a lot of different data types being considered together.
Without a domain expert in the loop, we are seeing an r-value of 0.98. With actual domain expertise and the appetite to cycle through this problem, we would expect to go above 0.99. Here is a plot of our validation between actual and predicted.
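For anyone who wants to run the same kind of check on their own validation set, here is a minimal sketch. It assumes the r-value is the Pearson correlation between actual and predicted prices, which isn't spelled out above; the function names and the toy prices are made up for illustration. Because a high r only says the two line up linearly and can hide a systematic over- or under-estimate, the sketch also reports a median absolute percentage error alongside it.

```python
import math
from statistics import median


def pearson_r(actual, predicted):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_a * var_p)


def median_ape(actual, predicted):
    """Median absolute percentage error, as a fraction of the actual price."""
    return median(abs(p - a) / a for a, p in zip(actual, predicted))


# Toy validation set (made-up prices, not the Utah data).
actual = [250_000, 400_000, 650_000, 1_200_000]
predicted = [262_000, 385_000, 640_000, 1_260_000]

r = pearson_r(actual, predicted)
err = median_ape(actual, predicted)
print(f"r = {r:.3f}, median abs % error = {err:.1%}")
```

A model that predicts exactly double every price would still score r = 1.0, which is why the second number is worth keeping next to the first.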
Models without insight aren’t very useful. With every model built there is an insight report that can be reviewed to understand what is going on. This report shows what macro features are driving the prediction.
We can see that the image of the realtor on the property offered no lift (0%), while the structured data offered the most (52.9%). The text description of the property offered a significant lift (21%), but not as much as all of the image contributions combined.
Ziff has the only platform right now that can respect multiple image types for the same problem (e.g. images in the home, satellite images, etc…). We had 3 different satellite data resolutions, images in the home, the main image used for the property profile, and the image of all of the realtors involved on the property. It is encouraging to find out that the image of the realtor had no lift, because if it did… what would that mean?
Looking inside the encoder behavior, we can look for quality and possible topics. We can see that the larger-resolution satellite images were able to capture more consistent groups of high-priced features. The hypothesis here is that this is probably capturing foothill and water/mountain proximity. Reviewing the topics by opening up the property IDs in these clusters can answer that question.
The encoder portion on the far right shows some large clusters of expensive homes, whereas the cluster on the far left struggles to find features with big trends. Similarly, looking into the description and the main image (below) of the home, we also find clusters of high- and low-priced neural features.
Similar to the satellite image clusters, we can see that information has been learned from these unstructured elements as well for text and main image. Looking into these clusters gives the data owner strategic insight into what is driving predictions from the unstructured datasets. We haven’t expanded that out in this article, but specific clusters can offer the user very specific topics from the original image/text (or video/audio if that had been included) that a subject matter expert (SME) could react to. Need new risk items for your underwriters to consider? Reviewing embedding clusters is a great approach for finding those.
The top feature importance breakdown seems fairly intuitive for anyone who knows anything about selling a house. The exciting thing about this insight is that the unstructured elements (e.g. text description, main image, and satellite image) are showing up as three of our top five drivers.
You can also see that the local house comps, as structured variables, are really strong features at #2 and #3. Another one I like is the structured features from local schools. We also have more granular insight with the school data (rating, distance, grade level), but for this report, we have grouped them.
The real value of the model insight (the properties with the highest prediction errors, the topics from embeddings, and the feature importance ranks) is that it sparks discussion among the SMEs. The most successful groups are the ones that can iterate on a problem 5–10 times by including or removing features. With an SME invited into this problem, we would expect the model performance to improve significantly.
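To make the "highest prediction errors" part concrete, here is a small sketch of how you might pull the worst misses out of a validation set to put in front of an SME. The row layout, the property IDs, and the numbers are all hypothetical, and ranking by absolute percentage error is just one reasonable choice, not necessarily what the insight report does.

```python
def abs_pct_error(row):
    """Absolute percentage error of one prediction, as a fraction of actual price."""
    return abs(row["predicted"] - row["actual"]) / row["actual"]


def worst_predictions(rows, top_n=2):
    """Return the top_n rows with the largest absolute percentage error."""
    return sorted(rows, key=abs_pct_error, reverse=True)[:top_n]


# Hypothetical validation rows (made-up IDs and prices).
rows = [
    {"id": "p1", "actual": 300_000, "predicted": 310_000},
    {"id": "p2", "actual": 2_000_000, "predicted": 1_400_000},
    {"id": "p3", "actual": 450_000, "predicted": 440_000},
    {"id": "p4", "actual": 150_000, "predicted": 210_000},
]

worst = worst_predictions(rows)
for row in worst:
    print(row["id"], f"{abs_pct_error(row):.0%} off")
```

The short list this produces is the kind of thing an SME can react to quickly ("that one backs onto a ski run", "that one is a teardown"), which then suggests which features to add or remove on the next iteration.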
There is a huge opportunity right now for a tech company to begin automating home and commercial appraisals. The technology support from AutoML and deep learning is there now, so this is a <24hr curiosity for an engineer instead of a seven-figure, high-risk, 12–16-month science project. As these types of problems become low-hanging fruit, we will see more job disruption. At first, the jobs will be augmented/validated, and then eventually they will be automated (except appraisals that are predicted to have issues, e.g. on a ski resort). In the end, there is no reason why this wouldn’t be completely automated, and I’m thinking sooner rather than later. If you have worked your entire career as a boots-on-the-ground appraiser, the winds of change have started to blow.
But how would you know if you were better than a human?
This is pretty straightforward. You could create a human baseline by having multiple appraisers appraise the same property, which gives you an understanding of their inter-rater reliability. Use that number as their error, and demonstrate that AI can go below it. Ideally, the bank/creditor has a better understanding of value than just a human appraiser’s estimate, but like real estate, they may be slow to innovate. So even reviewing the concept of home value might be worthwhile, but getting to the bottom of that is another discussion/blog.
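Here is a minimal sketch of what that human baseline could look like in code. The appraisals, the model predictions, and the property IDs are invented, and the sketch makes one simplifying assumption worth flagging: both the appraisers and the model are scored against the panel median for each property. A real study might score against the eventual sale price instead, or leave each appraiser out of their own reference median.

```python
from statistics import mean, median

# Hypothetical data: property id -> independent appraisals of that property.
APPRAISALS = {
    "p1": [400_000, 420_000, 380_000],
    "p2": [610_000, 590_000, 600_000],
}

# Hypothetical model predictions for the same properties.
MODEL = {"p1": 404_000, "p2": 597_000}


def human_spread(appraisals):
    """Mean absolute percentage deviation of appraisers from their panel median."""
    deviations = []
    for values in appraisals.values():
        reference = median(values)
        deviations.extend(abs(v - reference) / reference for v in values)
    return mean(deviations)


def model_error(appraisals, predictions):
    """Mean absolute percentage deviation of the model from the panel median."""
    deviations = []
    for prop_id, values in appraisals.items():
        reference = median(values)
        deviations.append(abs(predictions[prop_id] - reference) / reference)
    return mean(deviations)


human = human_spread(APPRAISALS)
model = model_error(APPRAISALS, MODEL)
print(f"human spread: {human:.2%}, model error: {model:.2%}")
print("model beats human baseline" if model < human else "humans still ahead")
```

The point of the design is that the humans set the bar with their own disagreement: if three appraisers routinely land a couple of percent apart on the same house, a model that sits inside that spread is hard to call worse.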
Determining supply/demand pricing in a volatile housing market is something a whole room full of credit SMEs would have to debate with an AI expert in the room.
So before Tesla goes private for $420/share, human appraisers might consider a vacation to Denver before it’s too late.
Ideas? Criticisms? Curiosities? If you comment I will respond.