Zillow AI Forum 2019

Published in Zillow Tech Hub · 9 min read · Oct 17, 2019

by Aaron Wroblewski, AI Engineering Manager, Personalization AI, Zillow.

For any company today, successfully leveraging data to improve business operations and customer experience is existential, and at Zillow we are using AI to reimagine the way we buy, sell, and rent homes.

At the annual Zillow AI Forum we share that work with colleagues and invite industry thought leaders and academics to share their achievements. Together, through a day of learning and networking, we are contributing to a community capable of solving the world’s biggest challenges. This year we hosted speakers from Harvard, Microsoft, Samsung, Unity, and more. We explored what’s new in the field of AI, covering topics like computer vision & image recognition, human-in-the-loop, conversational AI & search, and how video games are driving advancements in AI.

In this post, I’ll share my perspective on each of the talks and highlight key points that excited, motivated, and challenged me.

Privacy in AI

As an innovator working at the cutting edge of my field, I find it incredibly motivating to stay connected to the downstream impacts of my work. For instance, I dream of creating delightful personalized experiences that engage customers and enable them to more easily navigate the process of finding their way home. One talk in particular reminded me of the need to remain connected to the potential risks inherent in this work as well, in this case the need to continually evaluate the risk to privacy.

Sing Bing Kang, Distinguished Scientist in 3D and Computer Vision (CV) at Zillow Group, shared work from Microsoft Research exposing a new risk to privacy in augmented reality scenarios. Some augmented reality solutions use technologies like time-of-flight cameras to capture data from real-world locations. The data collected this way can be used to create incredible experiences, including rendering animated, artificial objects over real-world scenes to create a new level of realism in computing, and enabling us to interact with artificial agents in a more natural way.

Eight 3D point clouds with depth, SIFT, and RGB data.
Revealing Scenes by Inverting Structure from Motion Reconstructions

Sing stressed the importance of protecting these 3D point cloud data, as his research shows they can be used to reconstruct visually accurate images, video, and 3D models of the original scenes. The resulting reconstructions, shown below, are striking.

A 3D point-cloud, generated video, and original video recorded with the point cloud. The generated video shows that specific details including images and text can be reconstructed.

Sing showed how the scenes were reconstructed using a series of convolutional neural networks: point clouds alone are enough for accurate reconstruction, and accuracy increases further with optional features like color data and SIFT descriptors. Read more about this research here.

An architecture of VisibNet, CoarseNet, and RefineNet can be used to generate a photorealistic image from a 3D point cloud.
Network Architecture used to reconstruct images, videos, and 3D scenes from point clouds.
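
To make the cascade concrete, here is a toy sketch of the three-stage idea in PyTorch. The layer choices below are my own placeholders, not the paper's actual architectures, which are far deeper; the point is only how the visibility, coarse, and refinement stages feed into one another.

```python
import torch
import torch.nn as nn

# Toy sketch of the three-stage cascade named in the figure. Layer sizes
# are illustrative placeholders, not the paper's real (much deeper) nets.
def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())

visib_net = nn.Sequential(conv_block(3, 16), nn.Conv2d(16, 1, 1), nn.Sigmoid())  # per-pixel visibility mask
coarse_net = nn.Sequential(conv_block(4, 16), nn.Conv2d(16, 3, 1))               # coarse RGB estimate
refine_net = nn.Sequential(conv_block(6, 16), nn.Conv2d(16, 3, 1))               # refined RGB output

# Point-cloud features projected onto a 2D image grid (e.g., depth + color).
feats = torch.randn(1, 3, 64, 64)

mask = visib_net(feats)                                  # which points are visible?
coarse = coarse_net(torch.cat([feats, mask], dim=1))     # coarse reconstruction
refined = refine_net(torch.cat([feats, coarse], dim=1))  # final image estimate
print(refined.shape)  # torch.Size([1, 3, 64, 64])
```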

Thankfully, Sing also shared a privacy-preserving solution here: lifting 3D point clouds to 3D line clouds. This solution retains the ability to reconstruct the basic structure of a scene without enabling exposure of confidential information.
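
As a rough illustration of the lifting step, here's a minimal sketch: each 3D point is replaced by a 3D line through it with a random direction, so anyone holding only the line cloud no longer knows where on each line the original point sat. This is my simplified reading of the idea; see the linked paper for the actual method and its guarantees.

```python
import numpy as np

# Minimal sketch of lifting a 3D point cloud to a 3D line cloud: replace
# each point with a line through it in a random direction, and offset the
# stored point along that line so its original position is hidden.
def lift_to_line_cloud(points: np.ndarray, rng=np.random.default_rng()):
    directions = rng.normal(size=points.shape)
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    offsets = rng.normal(size=(len(points), 1))
    # Each line is represented by (a point somewhere on it, a unit direction).
    return points + offsets * directions, directions

points = np.random.rand(1000, 3)          # a toy point cloud
line_points, line_dirs = lift_to_line_cloud(points)
```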

Document Understanding

Isn’t recognizing text from scanned documents (commonly referred to as optical character recognition, or OCR) a solved problem? It’s still a thriving research area, according to Cha Zhang, Partner Engineering Manager at Microsoft. Legacy solutions rely on crisp, printed documents scanned at high resolution. That doesn’t address the needs we have today: commodity cameras capturing crumpled and dynamic documents at long range, in low resolution, and with handwriting.

Cha gave impressive demos, garnering crisp applause as he showed the capabilities of Microsoft’s new-generation OCR engine, the Read API. Using his Microsoft Surface, he captured live images of his own handwriting and of crumpled resumes from the audience, and the solution quickly and reliably converted the images into text.

Cha Zhang demonstrates OCR of printed and handwritten text in an image live-captured from his Surface tablet.
Seriously impressive handwriting and printed text OCR demos received enthusiastic applause.

Cha shared details on how the Microsoft solution is implemented, including 3 key stages: 1. Region Proposal, 2. Line Grouping, and 3. a convolutional neural network (CNN) and deep bidirectional long short-term memory (DBLSTM) architecture for recognizing characters.

Anchor-free region proposal network includes 4 convolutional layers, with 3 connected to densebox detectors.
Anchor-Free Region Proposal uses context to enable more accurate grouping of characters.
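
As a sketch of the third stage, here is a minimal CNN + bidirectional-LSTM line recognizer in PyTorch. The layer sizes are placeholders of mine, not Microsoft's production architecture, and a real system would train this with a CTC loss over the per-column character logits.

```python
import torch
import torch.nn as nn

# Minimal CNN + bidirectional-LSTM recognizer in the spirit of stage 3.
class LineRecognizer(nn.Module):
    def __init__(self, num_chars: int):
        super().__init__()
        # The CNN collapses image height into per-column feature vectors.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
        )
        self.lstm = nn.LSTM(64 * 8, 128, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * 128, num_chars + 1)  # +1 for the CTC blank

    def forward(self, x):                     # x: (batch, 1, 32, width)
        f = self.cnn(x)                       # (batch, 64, 8, width)
        f = f.permute(0, 3, 1, 2).flatten(2)  # (batch, width, 64*8)
        out, _ = self.lstm(f)                 # read columns left-to-right and back
        return self.head(out)                 # per-column character logits

logits = LineRecognizer(num_chars=95)(torch.randn(2, 1, 32, 200))
print(logits.shape)  # torch.Size([2, 200, 96])
```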

Check out the Vision-related Azure Cognitive Services to learn more about how you can leverage these tools today, including the tools described above and others that recognize structured data from forms.
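
If you'd like to try the Read API yourself, a minimal REST sketch looks like the following. The resource name and key are placeholders, and you should check the Azure docs for the current API version and endpoint details.

```python
import time
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
key = "<subscription-key>"                                        # placeholder

# Submit an image for asynchronous OCR; the Read API responds with an
# Operation-Location header to poll for results.
with open("document.jpg", "rb") as f:
    resp = requests.post(
        f"{endpoint}/vision/v3.2/read/analyze",
        headers={"Ocp-Apim-Subscription-Key": key,
                 "Content-Type": "application/octet-stream"},
        data=f.read(),
    )
resp.raise_for_status()
operation_url = resp.headers["Operation-Location"]

# Poll until the run completes, then print each recognized line of text.
while True:
    result = requests.get(operation_url,
                          headers={"Ocp-Apim-Subscription-Key": key}).json()
    if result["status"] in ("succeeded", "failed"):
        break
    time.sleep(1)

for page in result.get("analyzeResult", {}).get("readResults", []):
    for line in page["lines"]:
        print(line["text"])
```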

Customer Retention and Churn

My team’s mission is to support Zillow in understanding users’ intent and personalizing every customer experience. That’s why I was super excited for Eva Ascarza’s (Associate Professor, Harvard University) talk about how we can predict customer churn and identify actions that make retaining customers more likely.

Eva shared how ML and AI have driven fast progress in this area through the creation of data, new methods in prediction and estimation, and through personalization and experimentation. However, she noted that many companies develop predictive churn models and think the work of understanding which customers to target is done. Stopping there, she argues, prevents companies from learning which of the interventions they execute actually drive impact on retention, creating the risk that interventions end up increasing churn rates.

2 graphs comparing results of interventions for top risk and top receptive customer populations from Company A and B.
Company A’s intervention increased retention among those customers most likely to churn, while Company B’s intervention unintentionally drove additional churn.

The lesson here is that companies sometimes examine experimental results only over the entire population, which can unfortunately mask very different effects within customer segments. In the example shared, the customers most likely to churn responded very differently from the customers most receptive to the intervention. Companies generate tangible learning when they examine experimental results in the context of the hypothesis, in this case, that a given intervention will prevent high-risk customers from churning.
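
A sketch of what that segment-level analysis might look like in practice: bucket customers by predicted churn risk, then compare treated and control outcomes within each bucket rather than over the whole population. The file and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical experiment log: one row per customer, with the model's
# churn-risk score, treatment assignment (0/1), and observed churn (0/1).
df = pd.read_csv("retention_experiment.csv")  # columns: risk_score, treated, churned

# Bucket customers into risk deciles, then compare churn between treated
# and control *within* each segment instead of over the whole population.
df["risk_decile"] = pd.qcut(df["risk_score"], 10, labels=False)
by_segment = (
    df.groupby(["risk_decile", "treated"])["churned"]
      .mean()
      .unstack("treated")
      .rename(columns={0: "control_churn", 1: "treated_churn"})
)
by_segment["lift"] = by_segment["control_churn"] - by_segment["treated_churn"]
print(by_segment)  # negative lift in a segment means the intervention backfired there
```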

2 line graphs showing the relationship of risk and lift for both days since transaction and data consumption.
“Predictors are not drivers,” says Eva, reminding us to examine intervention results on segments of customers, for instance, the most likely to churn and least likely to churn.

Eva ended her talk with the question “are we asking the right questions?”, reminding us that the power to produce precise answers comes with the responsibility to ensure we’re asking the right questions in the first place.

AI at Zillow Group

AI Platform

Sheena Jain, Machine Learning Manager, shared the business case and high-level architecture for Zillow’s AI Platform, currently under development. Sheena’s team, responsible for Home Valuation and the Zestimate, alone trains, evaluates, and deploys more than thirty thousand models daily. As our AI org grows, more and more teams need the tools that make this scale of model exploration and deployment feasible. By leveraging popular open-source technologies while creating custom solutions for Zillow’s unique requirements, Sheena sees a clear opportunity to share engineering investment across teams and deliver these benefits broadly.

workflow of the ideal lifecycle of developing ML models: manage data, train models, evaluate, deploy, predict and monitor.
Sheena Jain shares her vision for an AI Platform that enables more efficient development of AI applications.

Sheena says, “Most of the time building AI solutions is spent on data and feature engineering — converting raw data into useful features.” With data and feature catalogs in house, teams across Zillow AI are empowered to work together and reduce the effort of building new AI applications over time.
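
As a purely hypothetical illustration of why shared catalogs reduce effort, consider a minimal in-memory feature catalog: a transformation is registered once and any team's pipeline can reuse it by name. Nothing below reflects Zillow's actual platform API.

```python
import pandas as pd

# Hypothetical feature catalog: register a named transformation once,
# then any pipeline can build that feature by name from raw data.
class FeatureCatalog:
    def __init__(self):
        self._features = {}

    def register(self, name, fn):
        self._features[name] = fn

    def build(self, raw, names):
        return pd.DataFrame({n: self._features[n](raw) for n in names})

catalog = FeatureCatalog()
catalog.register("price_per_sqft", lambda df: df["price"] / df["sqft"])
catalog.register("home_age", lambda df: 2019 - df["year_built"])

raw = pd.DataFrame({"price": [450000.0], "sqft": [1800.0], "year_built": [1979]})
print(catalog.build(raw, ["price_per_sqft", "home_age"]))
```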

Human in the Loop

David Fagnan, Director of Applied Science, shared a clear vision for the value of Explainable AI (XAI). The importance of understanding and correcting for bias in algorithmic decisions continues to echo in the public conversation, and governments around the globe have enacted, or are considering, laws granting citizens the right to explanations of automated decisions. David presented XAI as a tool that enables implementers of AI applications to fulfill these obligations, helps train more accurate models, and creates more delightful interactions between humans and the AI applications they depend on to do their jobs.

David demonstrates XAI, showing how areas of an image influenced algorithmic decisions.

David shared how Zillow Offers, a service that makes it seamless to sell your home without prep, enabling you to save time and choose your move date, uses XAI to train home valuation models by learning from the questions human evaluators ask about automated home valuations, and how this iterative process can create models with more accurate predictions and higher coverage. I’m looking forward to an in-depth blog post on this topic from David, coming soon.

Computer Vision

Shourabh Rawat, Sr. Manager, Applied Science, reviewed how Zillow is using computer vision to generate 3D tours of homes, and shared details of how thumbnails are automatically generated for each room. Attractiveness is one of the factors used to rank candidate images, and Shourabh shared how his team trained convolutional neural networks to visualize saliency in predicting attractiveness, another application of XAI.

CNN architecture used to visualize saliency for attractiveness.
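
For readers curious what such a saliency visualization might involve, here is a minimal gradient-based saliency sketch in PyTorch, one common XAI technique. The talk did not specify which method Zillow uses, and the pretrained classifier below merely stands in for an attractiveness model.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Gradient-based saliency: score an image with a CNN and ask which input
# pixels most influenced the score. (ResNet-18 is a stand-in model; the
# talk did not reveal Zillow's actual attractiveness network.)
model = models.resnet18(pretrained=True).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = preprocess(Image.open("listing_photo.jpg").convert("RGB")).unsqueeze(0)
img.requires_grad_(True)

score = model(img).max()   # stand-in for an "attractiveness" score
score.backward()           # gradients flow back to the input pixels

# Per-pixel gradient magnitude, collapsed over color channels.
saliency = img.grad.abs().max(dim=1).values.squeeze(0)  # shape (224, 224)
print(saliency.shape)
```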

Conversational AI

Adam Cheyer, Co-Founder and VP Engineering, Viv Labs (now Samsung Bixby) and Siri, told our AI Forum audience that “Every 10 years the way we interact with computers changes.” He says that for a computing mode to reach this paradigm scale, “every connected user and every connected business [must] drive significant value from that technology.” No conversational interface has yet reached this paradigm scale, and Adam argues that to get there, the best assistant will have to deliver on the customer need for a single assistant, with a consistent experience, on any device, personalized for every user.

In his work on Samsung Bixby, his vision is to “enable the world to build this best assistant,” and he shared an impressive demo of how Bixby’s developer tools enable AI-assisted development of multi-modal (visual and voice) experiences (coined “capsules”) for Samsung devices. Using these tools, human developers enter natural language utterances that they expect users to speak when interacting with their capsule, then annotate entities with meaning, providing clues about which terms dictate the capsule, the intention, and the details. AI then generates logic that recognizes more varied utterances based on the developer’s instructions. For the first time, I saw how AI could augment and enhance the software development experience, an exciting vision.

A screenshot showing the Bixby developer studio.
Bixby Studio demo showing a Human+AI authored logic tree.
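
To illustrate the shape of this workflow (this is not Bixby's actual capsule format, which uses its own modeling language), a developer's annotated training utterances might conceptually look like this:

```python
# Hypothetical, simplified rendering of the workflow Adam described: the
# developer supplies sample utterances with annotated entities, and the
# platform generalizes from them.
training_utterances = [
    {
        "text": "book me a ride to the airport",
        "intent": "RequestRide",
        "entities": {"destination": "the airport"},
    },
    {
        "text": "get a car to 123 Main Street",
        "intent": "RequestRide",
        "entities": {"destination": "123 Main Street"},
    },
]

# From annotations like these, tooling learns which spans are slots
# (destination) and which words signal the intent, so novel phrasings
# such as "I need a ride home" can still be routed to RequestRide.
print(len(training_utterances), "annotated examples")
```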

Adam also shared impressive demonstrations of the personalization capabilities of the platform, showing how it can discover and retain user preferences for car sharing services, for example, and how it allows users to inspect and update those preferences. Learn more about developing Bixby capsules at https://bixbydevelopers.com.

AI in Video Games

Danny Lange (VP of AI and ML, Unity Technologies) shared many fun and thought-provoking examples of how video games can simulate real-world physics to quickly train AI to perform various tasks. Scientists have created simulations enabling bipedal agents to learn to walk, soccer-playing agents to learn to cooperate, and cute cartoon corgi agents to chase treats efficiently.

In this simulation, dogs learn to zig into the grass and then zag out in front of each other to reach the bones.
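
To show how simple the simulate-and-learn loop can be, here is a minimal sketch on a classic Gym control task; Unity exposes the same loop for its own simulations through the ML-Agents toolkit. The pre-0.26 Gym step API is assumed.

```python
import gym
import numpy as np

# Simulate-and-learn in miniature: roll out policies in a physics
# simulation and keep whichever earns the most reward. Random search over
# linear policies is enough to illustrate the loop on CartPole.
env = gym.make("CartPole-v1")
best_return, best_w = -np.inf, None

for _ in range(100):                  # random search over linear policies
    w = np.random.randn(4)
    obs, episode_return, done = env.reset(), 0.0, False
    while not done:
        action = int(obs @ w > 0)     # act by thresholding a linear score
        obs, reward, done, _ = env.step(action)
        episode_return += reward
    if episode_return > best_return:
        best_return, best_w = episode_return, w

print(f"best episode return: {best_return}")
```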

I had seen similar simulations before, and I found them super impressive and exciting. However, I feel they also highlight that, as much as humans have progressed in developing intelligent agents that can successfully perform tasks, we haven’t begun to close in on the dream of Artificial Intelligence. For artificial agents to be considered intelligent, we might expect them not only to complete tasks, but also to communicate, to cooperate, and to identify new tasks on their own. Frankly, it’s quite hard for me to imagine how we might get from the algorithms and hardware we use today to achieving that vision. Danny, however, is quite optimistic. He notes that the human brain uses only 20 watts of power to execute all of the intelligence humans can express. The human brain has had stable DNA for the last 100,000 years, and during this time, we’ve developed language, the ability to collaborate, and anticipation.

The human brain hasn’t evolved considerably in 100,000 years, but during that same time we developed advanced algorithms: language, collaboration and anticipation.

Danny’s message is optimistic about the opportunity to develop more efficient task-specific algorithms, about combining those algorithms to create more intelligence-like applications, and about the possibility of doing so without huge advances in hardware.

Looking Forward to 2020

The second annual Zillow AI Forum was an incredible day of learning and connecting with colleagues across the industry. My team and I were so excited to discuss how we might apply what we heard to our work in Personalization.

What should we discuss at next year’s AI Forum? If you or your company have exciting work to share at next year’s forum, please send us an email at zgtechmarketing@zillowgroup.com with ‘AI Forum 2020’ in the subject line, and stay tuned for an in-depth look at Human in the Loop at Zillow Offers from David Fagnan in the coming weeks.
