How we built our image classifier

Sam Stone
Published in Tech @ Domain
4 min read · Sep 14, 2018

The Emerging Tech team at Domain have been working on some pretty cool stuff lately: experimenting with, improving, scaling and productionising our Image Classification AI.

It all started with the problem we were trying to solve: we had a lot of property photos, but none of them were labeled as a bedroom, dining room, bathroom... you get the picture. And what about outdoor spaces, like a balcony, courtyard, car space or yard?

So, how could we label our property photos and combine them with our Interactive Floorplan on the Domain Mobile App? Oh, and would it be of value to our users?

We hit the ground running. To get a quick start and fast iterations, we experimented with, improved and then scaled a couple of ideas before reaching our nirvana with our own custom model.

Google Vision API & Word Vector

We kicked off our experimentation with a two-step approach using Google Vision API and Word Vector. We used Google Vision API to turn each image into text labels, and the Word Vector model to classify those text labels into room types.

This was a great start, as it cut our image classifier training time from days to minutes. However, we were limited by the Google Vision API: images that had no furniture (empty room images) did not produce useful text labels, and our results suffered.

The main benefit of using Google Vision API was that it was an off-the-shelf solution, and because the classifier we trained processes text labels rather than actual images, the computation was less intensive. As we wanted to see results fast, it was an easy solution. However, it wasn’t scalable.
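To make the two-step idea concrete, here is a minimal sketch of the second step: mapping text labels to a room type with word vectors. It is not our production code. The tiny three-dimensional vectors, the `WORD_VECTORS` table and the `classify_labels` function are all invented for illustration, and the input labels stand in for what an image labeling service might return. A real system would load pretrained embeddings instead.

```python
import math

# Toy 3-dimensional "word vectors", invented for illustration only.
# A real system would load pretrained embeddings instead.
WORD_VECTORS = {
    "bed": (0.9, 0.1, 0.0),
    "pillow": (0.8, 0.2, 0.0),
    "bedroom": (1.0, 0.0, 0.0),
    "sink": (0.1, 0.9, 0.0),
    "bathtub": (0.0, 0.9, 0.1),
    "bathroom": (0.0, 1.0, 0.0),
    "table": (0.1, 0.0, 0.9),
    "chair": (0.2, 0.1, 0.8),
    "dining room": (0.0, 0.0, 1.0),
}
ROOM_TYPES = ["bedroom", "bathroom", "dining room"]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_labels(labels):
    """Return the room type closest to the mean vector of the known labels.

    Returns None when no label has a vector, which is what happens
    when an image yields no useful labels.
    """
    known = [WORD_VECTORS[label] for label in labels if label in WORD_VECTORS]
    if not known:
        return None
    mean = tuple(sum(dim) / len(known) for dim in zip(*known))
    return max(ROOM_TYPES, key=lambda room: cosine(mean, WORD_VECTORS[room]))


print(classify_labels(["bed", "pillow"]))    # bedroom
print(classify_labels(["sink", "bathtub"]))  # bathroom
print(classify_labels(["floor", "wall"]))    # None: no informative labels
```

The last call shows the weakness described above: when an image only yields generic labels, as an empty room might, the text classifier has nothing to work with.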

Google AutoML

After A/B testing for a couple of weeks, the user metrics were very positive so we moved on to using Google AutoML. As a managed cloud service it allowed us to define our own labels and prepare our own dataset for training.

At this stage we had over 200,000 labeled images so it was time for a real end-to-end image classifier. Using our collected dataset from Google Vision API we started doing our own manual labeling and ended up pulling 35,000 images together in a really short period of time.

Once the labeling was completed, we sent them to Google AutoML and were able to classify our empty room images. During this time we also kept our Vision + Vector model running so we could compare the performance of both.

The benefit of Google AutoML was that it is a pre-built model. We just had to provide the dataset of photos along with the labels, and it gave us a working model. It did all the heavy lifting for us. However, we couldn’t fine-tune it for our own needs, so we were dependent on Google to improve the accuracy.

Custom Model

After another three to four weeks we had collected over three million labeled images and decided it was time to roll out our own model.

On the road to finding our own model, we decided to train several models in parallel, with the final model ensembling each retrained model’s output in a fully connected layer.

Compared with the end-to-end ensemble solution, retraining each classifier first and then training the ensemble model in a separate step reduced our training time from more than a week to a couple of days!
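The two-stage idea can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not our actual training code: `base_model_a` and `base_model_b` are hypothetical stand-ins for classifiers that have already been retrained and are now frozen, and the data is made up. Only the small fully connected softmax layer on top of their concatenated outputs is trained in the second stage.

```python
import math

NUM_CLASSES = 2  # illustrative: 0 = bedroom, 1 = bathroom


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Stage 1 stand-ins: two already-retrained, frozen base classifiers.
# Both are hypothetical and map a 2-feature input to class probabilities.
def base_model_a(x):
    return softmax([x[0], -x[0]])


def base_model_b(x):
    return softmax([-x[1], x[1]])


def stacked_features(x):
    """Concatenate the frozen base models' outputs into one input vector."""
    return base_model_a(x) + base_model_b(x)


def train_ensemble_layer(samples, labels, epochs=200, lr=0.5):
    """Stage 2: fit only a fully connected softmax layer on the stacked outputs."""
    feats = [stacked_features(x) for x in samples]  # base models run once per sample
    n_in = len(feats[0])
    weights = [[0.0] * n_in for _ in range(NUM_CLASSES)]
    biases = [0.0] * NUM_CLASSES
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            scores = [sum(w * v for w, v in zip(row, f)) + b
                      for row, b in zip(weights, biases)]
            probs = softmax(scores)
            for c in range(NUM_CLASSES):
                grad = probs[c] - (1.0 if c == y else 0.0)  # cross-entropy gradient
                biases[c] -= lr * grad
                for i in range(n_in):
                    weights[c][i] -= lr * grad * f[i]
    return weights, biases


def predict(weights, biases, x):
    f = stacked_features(x)
    scores = [sum(w * v for w, v in zip(row, f)) + b
              for row, b in zip(weights, biases)]
    return scores.index(max(scores))


samples = [(2, -2), (1.5, -1), (1, -2), (-2, 2), (-1, 1.5), (-1.5, 1)]
labels = [0, 0, 0, 1, 1, 1]
weights, biases = train_ensemble_layer(samples, labels)
print([predict(weights, biases, x) for x in samples])
```

Because the base models are frozen, their outputs can be computed once and reused on every epoch, so the second stage only updates a handful of weights. That is one plausible reason a staged setup trains faster than optimising everything end to end.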

We had also reached an agreeable accuracy compared to Google AutoML, with one key difference: we added a lot of corrected image sets (images previously predicted wrongly by AutoML) into our custom training sets.
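As a rough sketch of what folding corrections back into a training set can look like, the snippet below adds every image whose reviewed label disagrees with an earlier prediction. The function name, the dictionary layout and the image IDs are assumptions made for illustration, not a description of our pipeline.

```python
def merge_corrections(training_set, predictions, reviewed_labels):
    """Return a new training set that includes corrected images.

    training_set:    dict of image_id -> label already used for training
    predictions:     dict of image_id -> label predicted by the earlier model
    reviewed_labels: dict of image_id -> human-verified label

    An image is added (or its label overwritten) when the reviewed label
    differs from the earlier prediction, including when no prediction exists.
    """
    merged = dict(training_set)  # copy so the input is left untouched
    for image_id, true_label in reviewed_labels.items():
        if predictions.get(image_id) != true_label:
            merged[image_id] = true_label
    return merged


# Hypothetical example data.
training = {"img1": "bedroom"}
predicted = {"img2": "bathroom", "img3": "kitchen"}
reviewed = {"img2": "laundry", "img3": "kitchen"}

print(merge_corrections(training, predicted, reviewed))
# {'img1': 'bedroom', 'img2': 'laundry'}
```

Only `img2` is added here, because its earlier prediction was wrong. `img3` was predicted correctly, so this sketch leaves it out.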

Having a model we can trust is awesome, so we moved on to productionising it and running inference in real time. The setup we created also lets us incorporate and test new models much more easily. Another benefit is that we have complete control over the model and its accuracy.

And get this: across the whole of Domain Australia listings, over 4 million classifications on property listings have been made since the new custom model was deployed. This data is already showing its value and potential, and of course it is growing every day.

You can check out the Interactive Floorplan before & afters at https://www.youtube.com/watch?v=ucElWoAP6go

We are super excited about the things we are developing with Machine Learning at Domain. If you are a passionate and driven ML engineer, why not drop us a line? We are hiring, and we are sure there is more than one project that would interest you: https://www.linkedin.com/company/domain-com-au/jobs/

Words by John DiZhang
