Computer Vision at Consumer Scale at NoBroker.com

Published in

NoBroker Engineering

6 min read · Aug 29, 2021

The spectacles of experience; through them you will see clearly a second time
— Henrik Ibsen

NoBroker.com is the world’s largest brokerage-free C2C marketplace. Our core values are centred on connecting seekers and owners in the most efficient way possible. With technology and Artificial Intelligence, we have been solving some of the toughest challenges in this space.

We believe that C2C is solved best by ensuring three things: removing middlemen, ensuring the sanity of the listings presented, and removing as much information asymmetry as possible between the two parties. ML forms the crux of many of the systems that ensure these on NoBroker.com.

A very important aspect of streamlining C2C connections is making sure both parties have as much information as possible about the entity they are interacting over. In other words, making sure that properties listed on NoBroker.com convey as much information as possible to a tenant removes many hurdles from the house-hunting process.

Property photos are one such important piece of information. A tenant with a given property preference makes most of their judgement about a property from its photos. We have observed that properties with informative photos receive as much as 300% higher interest than properties with poor or no photos. Needless to say, the user's primary attention is on photos while browsing NoBroker.com.

Hence photo upload is a critical process for us. We process around 1.2 million property photos on the platform every month. These images are uploaded by owners via multiple channels, including the app, desktop, and mobile web. Additionally, we allow owners to send images of their properties via WhatsApp. Photos sent via WhatsApp account for around 40% of the images we receive on the platform.

We have guidelines to make sure that the property photos uploaded are relevant, informative and non-abusive. We have observed that around 5–8% of the uploads on the platform do not follow our listing guidelines. These include one or more of the following:

Blurred/Noisy Images
Presence of 3rd party watermarks
Presence of Personally Identifiable Information (like phone numbers/addresses)
Presence of persons/selfies
Accidental uploads from WhatsApp
Empty / non informative images
Explicit content
Screenshots
Poor resolution images

With a large number of images flowing into the platform every day, it became important for us to process them, screen them against the guidelines and upload them to the listing, all in near real time.

Hence Iris was born.

Iris is our distributed image processing engine, a family of micro-services serving computer vision models.

An orchestra of 7 distributed ML services on GKE screens and uploads images on NoBroker.com with a 99th percentile upload time of 3 minutes.

Components in IRIS

Iris consists of a family of ML models which together decide the final fate of an image on NoBroker.com. Here is an overview of all of them. We will describe them in detail in another blog post very soon.

Watermark Detection: A deep neural network that localizes and classifies third-party watermarks. Technically, it is an SSD with a MobileNetV2 backbone, trained as a binary classifier on an in-house curated dataset.
Object Detection: A deep neural network for detecting objects in the room. Our state-of-the-art model can identify around 40 objects in a typical real estate image. We use it for room classification, person detection and explicit content flagging. This is again an SSD with MobileNetV2, trained in-house.
Blur Detection: We compute a blur metric as the variance of the Laplacian over the image pixels and compare it against a cutoff value to decide whether or not to approve the image.
Screenshot Detection: Using image pixel gradients, we identify whether a given image contains patterns that look like a screenshot.
OCR: Using OCR models, we identify text in the image. This can be addresses, phone numbers, name boards, or phone branding like “shot on redmi”. We use a naive NLP model to separate text that follows our guidelines from text that does not.
NIMA (Neural Image Assessment): We use Google's NIMA model to rate the aesthetic quality of an image. This rating is then used to decide which image is shown as the preview picture on a property listing.
An Uploader Service: A micro-service for Orientation Correction, Aspect Ratio Correction, NoBroker Logo addition, Compression and Upload.
A State Machine: Whether or not an image is uploaded to the platform is decided by a joint rule over the outcomes of all the other image models. Each of them is an independent micro-service that processes images at its own pace. A state machine works with all these micro-services and decides, based on a rule engine, whether or not to upload an image.
An Image Editor: A central micro-service that can perform the necessary edits on an image at the end of assessment. For example, an image that failed due to the presence of a watermark is passed to a watermark removal model. Technically, the watermark removal model is a U-Net based model for image reconstruction. A pixel-statistics based method for screenshot removal also exists in the editor service. We are currently working on other edit services such as de-blurring, super-resolution, and text removal with GANs.
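
To make the blur check above concrete, here is a minimal sketch of a variance-of-Laplacian metric in plain Python. It is not the Iris implementation: the function names, the 4-neighbour Laplacian kernel and the cutoff of 100.0 are assumptions chosen for illustration, and a production service would more likely lean on an image library (with OpenCV, for instance, `cv2.Laplacian(gray, cv2.CV_64F).var()` computes a comparable metric).

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a 2D list of grayscale intensities (rows of equal length).
    Sharp images have strong edges, so the Laplacian responses vary a lot;
    blurred images give responses close to zero everywhere.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def is_blurred(gray, cutoff=100.0):
    """Flag the image as blurred when the metric falls below the cutoff.

    The cutoff is a hypothetical value; in practice it would be tuned on
    labelled images.
    """
    return laplacian_variance(gray) < cutoff


# A flat image has no edges at all; a checkerboard is all edges.
flat = [[128] * 6 for _ in range(6)]
checker = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]
print(is_blurred(flat))     # True
print(is_blurred(checker))  # False
```

The metric is cheap to compute, which suits a service that has to keep up with every incoming image, but the cutoff is sensitive to resolution and content, so it needs tuning against real listing photos.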

IRIS Platform Overview

All of the above components in IRIS are woven into 7 micro-services. A central interface to IRIS accepts images from the NoBroker core back-end. This service fans out each image via RabbitMQ to 6 ML micro-services, which we will call image assessors. Each assessor processes images one by one and stores its assessed state of the image (watermark present, blurred, address present, etc.) in a central Redis store. Each assessor pings a state machine micro-service when it finishes processing. The state machine holds rules that take an upload and approval decision on the image once all the assessors have returned their outputs.
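
The joint decision described above can be sketched roughly as follows. This is an illustrative model only: the assessor names, the issue labels, the three outcomes and the rule that only watermarks and screenshots are editable are assumptions, and a plain dictionary stands in for the central Redis state.

```python
# Hypothetical names for the six assessors that report on every image.
ASSESSORS = {"watermark", "objects", "blur", "screenshot", "ocr", "nima"}

# Issues the editor is assumed to be able to fix (watermark removal and
# screenshot cropping are the two edits mentioned in this post).
EDITABLE_ISSUES = {"watermark", "screenshot"}


def decide(state):
    """Return 'pending', 'upload', 'edit' or 'reject' for one image.

    `state` maps an assessor name to the list of issues it found
    (an empty list means the assessor found nothing wrong).
    """
    # Each assessor pings the state machine when it finishes, so the
    # state may still be incomplete when this is called.
    if ASSESSORS - set(state):
        return "pending"
    issues = {issue for found in state.values() for issue in found}
    if not issues:
        return "upload"
    if issues <= EDITABLE_ISSUES:
        return "edit"  # hand over to the editor, then upload
    return "reject"


clean = {name: [] for name in ASSESSORS}
print(decide({"blur": []}))                        # pending
print(decide(clean))                               # upload
print(decide({**clean, "watermark": ["watermark"]}))  # edit
print(decide({**clean, "blur": ["blurred"]}))      # reject
```

Because the function is evaluated on every ping and simply returns "pending" until all outputs are present, the assessors never need to coordinate with one another or finish in any particular order.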

Images that require an edit after processing, for example removing a watermark or cropping a screenshot, are passed to another micro-service, which we will call the editor. The editor makes the necessary changes as decided by the state machine and uploads the image back to the listing.

IRIS at Scale

All of the above components are orchestrated on Google Kubernetes Engine (GKE). The assessor modules operate as Celery workers consuming from a RabbitMQ queue. They run as Kubernetes pods with a Horizontal Pod Autoscaler (HPA) configured on RabbitMQ queue size.

To put this in perspective, the Iris family scales up to 24 replicas on GKE during peak hours. During non-peak hours, the system scales down to a bare minimum of 7 replicas. All this happens with zero manual intervention.
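
As a rough illustration of queue-based scaling, the sketch below mirrors how a Horizontal Pod Autoscaler with an average-value target would size one assessor deployment from its queue length. The target of 50 messages per replica and the per-service bounds are hypothetical, not Iris's actual configuration; the 7 and 24 replicas quoted above are totals across the whole family.

```python
import math


def desired_replicas(queue_length, target_per_replica, min_replicas, max_replicas):
    """Replica count that keeps the backlog per replica at or below target.

    Mirrors the HPA calculation for an average-value metric:
    ceil(total metric / target per pod), clamped to the configured bounds.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_length / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical single assessor: 1 to 4 replicas, 50 queued images per replica.
print(desired_replicas(0, 50, 1, 4))     # 1  (idle: stay at the minimum)
print(desired_replicas(120, 50, 1, 4))   # 3  (ceil(120 / 50))
print(desired_replicas(1000, 50, 1, 4))  # 4  (capped at the maximum)
```

Scaling on queue length rather than CPU ties the replica count directly to the backlog of images waiting to be screened, which is what determines how long a user waits for a photo to appear.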

The system screens every image coming onto the platform and uploads it automatically, anywhere from 5 seconds to 3 minutes after the user uploads it.

Any sufficiently advanced technology is indistinguishable from magic.
— Arthur C Clarke

Each of the components in Iris deserves a blog post of its own. We will be posting more about them very soon in this publication. We will also be open-sourcing some of them. Watch this space for more of the amazing things that wake us up every day at NoBroker.com.

Written by NoBroker.com